Paper title
When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification
Paper authors
Abstract
Internet traffic classification is widely used to facilitate network management. It plays a crucial role in Quality of Service (QoS), Quality of Experience (QoE), network visibility, intrusion detection, and traffic trend analysis. While there is no theoretical guarantee that deep learning (DL)-based solutions perform better than classical machine learning (ML)-based ones, DL-based models have become the common default. This paper compares well-known DL-based and ML-based models and shows that, in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset. Note that it is not feasible to evaluate all possible models to make a concrete statement; thus, the above finding is not a recommendation to avoid DL-based models, but rather empirical proof that in some cases simpler solutions may perform even better.