Paper Title

ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels

Authors

Zhao, Yue; Zheng, Guoqing; Mukherjee, Subhabrata; McCann, Robert; Awadallah, Ahmed

Abstract

Existing work on anomaly detection (AD) relies on clean labels from human annotators, which are expensive to acquire in practice. In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware), which are cheaper to obtain, for anomaly detection. Specifically, we propose ADMoE, the first framework for anomaly detection algorithms to learn from noisy labels. In a nutshell, ADMoE leverages a mixture-of-experts (MoE) architecture to encourage specialized and scalable learning from multiple noisy sources. It captures the similarities among noisy labels by sharing most model parameters, while encouraging specialization by building "expert" sub-networks. To extract further signal from the noisy labels, ADMoE also uses them as input features to facilitate expert learning. Extensive results on eight datasets (including a proprietary enterprise security dataset) demonstrate the effectiveness of ADMoE, which brings up to a 34% performance improvement over not using it. It also outperforms a total of 13 leading baselines with equivalent network parameters and FLOPs. Notably, ADMoE is model-agnostic and enables any neural network-based detection method to handle noisy labels; we showcase its results on both a multilayer perceptron (MLP) and the leading AD method DeepSAD.
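The architecture outlined in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: the class name `TinyMoEDetector`, the layer sizes, the top-k gating rule, and the choice to feed the noisy labels to both the shared layer and the gate are all assumptions made for illustration. Training on the noisy labels is omitted; the sketch only shows how a shared layer, sparsely activated "expert" sub-networks, and noisy labels used as input features might fit together.

```python
import math
import random


def linear(x, w, b):
    """Dense layer: x is a list of floats, w is a list of rows (n_out x n_in)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases (hypothetical initialisation)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


class TinyMoEDetector:
    """Hypothetical MoE-style anomaly scorer; all sizes are illustrative."""

    def __init__(self, n_features, n_noisy, n_hidden=8, n_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        n_in = n_features + n_noisy  # assumption: noisy labels are appended as features
        self.top_k = top_k
        self.shared = init_layer(rng, n_hidden, n_in)  # parameters shared by all sources
        self.gate = init_layer(rng, n_experts, n_in)   # one gating logit per expert
        self.experts = [init_layer(rng, n_hidden, n_hidden) for _ in range(n_experts)]
        self.head = init_layer(rng, 1, n_hidden)       # final anomaly-score layer

    def gate_weights(self, z):
        """Keep the top-k experts and renormalise their logits with a softmax."""
        logits = linear(z, *self.gate)
        top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[: self.top_k]
        probs = softmax([logits[i] for i in top])
        weights = [0.0] * len(logits)
        for i, p in zip(top, probs):
            weights[i] = p
        return weights

    def score(self, x, noisy_labels):
        """Return an anomaly score in (0, 1) for one sample."""
        z = list(x) + list(noisy_labels)
        h = relu(linear(z, *self.shared))
        weights = self.gate_weights(z)
        mixed = [0.0] * len(h)
        for w, expert in zip(weights, self.experts):
            if w == 0.0:
                continue  # experts outside the top-k are not evaluated
            out = relu(linear(h, *expert))
            mixed = [m + w * o for m, o in zip(mixed, out)]
        logit = linear(mixed, *self.head)[0]
        return 1.0 / (1.0 + math.exp(-logit))


detector = TinyMoEDetector(n_features=3, n_noisy=2)
# Three input features plus two noisy labels (e.g., rule-based risk flags).
print(detector.score([0.1, 0.2, 0.3], [1.0, 0.0]))
```

Because only `top_k` experts are evaluated per sample, adding more experts increases capacity without a proportional increase in per-sample compute, which is one common motivation for sparse gating in MoE layers.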
