Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

Paper Title

Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

Authors

Eric L. Goodman, Chase Zimmerman, Corey Hudson

Abstract

One of deep learning's attractive benefits is the ability to automatically extract features relevant to a target problem from largely raw data, instead of relying on human-engineered, error-prone handcrafted features. While deep learning has shown success in fields such as image classification and natural language processing, its application to feature extraction from raw network packet data for intrusion detection is largely unexplored. In this paper we modify a Word2Vec approach, used for text processing, and apply it to packet data for automatic feature extraction. We call this approach Packet2Vec. For the classification task of benign versus malicious traffic on a 2009 DARPA network data set, we obtain an area under the receiver operating characteristic (ROC) curve (AUC) between 0.988 and 0.996 and an area under the Precision/Recall curve between 0.604 and 0.667.
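
The abstract does not say how packets are turned into "words" for Word2Vec. The sketch below shows one plausible preprocessing step, assuming each packet is treated as a sentence and each overlapping n-byte window as a word. The function names `packet_to_words` and `build_vocab`, the default window size of 2 bytes, and the sample byte strings are hypothetical and are not taken from the paper.

```python
from collections import Counter


def packet_to_words(payload: bytes, n: int = 2, stride: int = 1) -> list:
    """Split raw packet bytes into overlapping n-byte tokens (hex strings).

    Assumption: n-byte windows stand in for "words"; the paper's actual
    tokenization may differ.
    """
    if n <= 0 or stride <= 0:
        raise ValueError("n and stride must be positive")
    # A payload shorter than n bytes yields no tokens.
    return [payload[i:i + n].hex() for i in range(0, len(payload) - n + 1, stride)]


def build_vocab(sentences, min_count: int = 1) -> dict:
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(word for sentence in sentences for word in sentence)
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: idx for idx, w in enumerate(kept)}


# Two made-up byte strings standing in for captured packets.
packets = [bytes.fromhex("450000281c46"), bytes.fromhex("4500003c1c46")]
sentences = [packet_to_words(p) for p in packets]
vocab = build_vocab(sentences)
```

The resulting `sentences` list has the shape a Word2Vec trainer expects (a list of token lists). The learned token vectors could then be pooled per packet, for example by averaging, to form the feature vector passed to a classifier; that pooling choice is likewise an assumption here.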
