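When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification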

Paper Title

When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification

Authors

Adi Lichy, Ofek Bader, Ran Dubin, Amit Dvir, Chen Hajaj

Abstract

Internet traffic classification is widely used to facilitate network management. It plays a crucial role in Quality of Service (QoS), Quality of Experience (QoE), network visibility, intrusion detection, and traffic trend analysis. While there is no theoretical guarantee that deep learning (DL)-based solutions perform better than classical machine learning (ML)-based ones, DL-based models have become the common default. This paper compares well-known DL-based and ML-based models and shows that, in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset. Note that it is not feasible to evaluate all possible models to make a concrete statement. Thus, the above finding is not a recommendation to avoid DL-based models, but rather empirical proof that, in some cases, simpler solutions may perform even better.
