IDPS Signature Classification with a Reject Option and the Incorporation of Expert Knowledge

Title

IDPS Signature Classification with a Reject Option and the Incorporation of Expert Knowledge

Authors

Hidetoshi Kawaguchi, Yuichi Nakatani, Shogo Okada

Abstract

As the importance of intrusion detection and prevention systems (IDPSs) increases, great costs are incurred to manage the signatures that are generated by malicious communication pattern files. Experts in network security need to classify signatures by importance for an IDPS to work. We propose and evaluate a machine learning signature classification model with a reject option (RO) to reduce the cost of setting up an IDPS. To train the proposed model, it is essential to design features that are effective for signature classification. Experts classify signatures with predefined if-then rules. An if-then rule returns a label of low, medium, high, or unknown importance based on keyword matching of the elements in the signature. Therefore, we first design two types of features, symbolic features (SFs) and keyword features (KFs), which are used in keyword matching for the if-then rules. Next, we design web information and message features (WMFs) to capture the properties of signatures that do not match the if-then rules. The WMFs are extracted as term frequency-inverse document frequency (TF-IDF) features of the message text in the signatures. The features are obtained by web scraping from the referenced external attack identification systems described in the signature. Because failure needs to be minimized in the classification of IDPS signatures, as in the medical field, we consider introducing an RO in our proposed model. The effectiveness of the proposed classification model is evaluated in experiments with two real datasets composed of signatures labeled by experts: a dataset that can be classified with if-then rules and a dataset with elements that do not match an if-then rule. On both datasets, the combination of SFs and WMFs performed better than the combination of SFs and KFs. We also performed a feature analysis.
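The reject option mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the RO is implemented, so the confidence-threshold rule, the function name `classify_with_reject`, the `REJECT` marker, and the threshold value 0.8 below are all assumptions made for illustration, not the authors' method. The idea shown is only that a prediction is accepted when the classifier is confident enough and is otherwise deferred to a human expert.

```python
from typing import Dict

# Hypothetical marker: the signature is handed back to a human expert.
REJECT = "reject"


def classify_with_reject(probs: Dict[str, float], threshold: float = 0.8) -> str:
    """Return the most probable label, or REJECT if confidence is below threshold.

    `probs` maps each importance label to a class probability, as produced by
    any probabilistic classifier (assumed here; the paper's model may differ).
    """
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else REJECT


# One confident prediction and one ambiguous prediction.
confident = classify_with_reject({"low": 0.05, "medium": 0.05, "high": 0.90})
uncertain = classify_with_reject({"low": 0.40, "medium": 0.35, "high": 0.25})
print(confident, uncertain)
```

Raising the threshold rejects more signatures and leaves more work for experts, while lowering it accepts more automatic labels at a higher risk of misclassification.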
