Predicting Defective Lines Using a Model-Agnostic Technique

Paper Title

Predicting Defective Lines Using a Model-Agnostic Technique

Authors

Supatsara Wattanakriengkrai, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Hideaki Hata, Kenichi Matsumoto

Abstract

Defect prediction models are proposed to help a team prioritize the source code files that need Software Quality Assurance (SQA) based on their likelihood of having defects. However, developers may waste effort inspecting a whole file when only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of the lines of a file are defective. Hence, in this work, we propose a novel framework (called LINE-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information about why the model makes a given prediction. Broadly speaking, our LINE-DP first builds a file-level defect model using code token features. Then, our LINE-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file will be defective. Finally, the lines that contain risky tokens are predicted as defective lines. Through a case study of 32 releases of nine Java open source systems, our evaluation results show that our LINE-DP achieves an average recall of 0.61, a false alarm rate of 0.47, a top 20% LOC recall of 0.27, and an initial false alarm of 16, which are statistically better than six baseline approaches. Our evaluation shows that our LINE-DP requires an average computation time of 10 seconds, including model construction and defective line identification time. In addition, we find that 63% of the defective lines that can be identified by our LINE-DP are related to common defects (e.g., argument change, condition change). These results suggest that our LINE-DP can effectively identify defective lines that contain common defects while requiring a smaller amount of inspection effort and a manageable computation cost.
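The three steps in the abstract (file-level model on token features, risky-token identification, line flagging) can be illustrated with a minimal sketch. It is not the authors' implementation and it does not call LIME: as a stand-in for the local explanation, it treats the positive weights of a small bag-of-tokens logistic regression as token risk scores. The function names, the `top_k` cutoff, the tokenizer, and the toy Java-like snippets are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Assumption: a crude identifier-level tokenizer, not the paper's tokenizer.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    return TOKEN_RE.findall(text)


def train_file_model(files, labels, epochs=200, lr=0.1):
    """Step 1: fit a file-level model on token counts (plain SGD logistic regression)."""
    weights, bias = Counter(), 0.0
    bags = [Counter(tokenize(f)) for f in files]
    for _ in range(epochs):
        for bag, y in zip(bags, labels):
            z = bias + sum(weights[t] * c for t, c in bag.items())
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            bias -= lr * grad
            for t, c in bag.items():
                weights[t] -= lr * grad * c
    return weights, bias


def risky_tokens(weights, file_text, top_k=3):
    """Step 2 (stand-in for LIME): tokens pushing this file toward 'defective'."""
    bag = Counter(tokenize(file_text))
    scores = {t: weights[t] * c for t, c in bag.items() if weights[t] > 0}
    return set(sorted(scores, key=scores.get, reverse=True)[:top_k])


def predict_defective_lines(weights, file_text, top_k=3):
    """Step 3: flag every line (1-based) that contains at least one risky token."""
    risky = risky_tokens(weights, file_text, top_k)
    return [
        number
        for number, line in enumerate(file_text.splitlines(), start=1)
        if risky & set(tokenize(line))
    ]


# Toy training data (hypothetical): label 1 = defective file, 0 = clean file.
train_files = [
    "int a = readInput();\nint b = a / divisor;\nlog(b);",
    "String s = load();\nreturn s.trim();",
    "int total = sum / divisor;\nlog(total);",
    "log(message);\nreturn message.trim();",
]
train_labels = [1, 0, 1, 0]

weights, bias = train_file_model(train_files, train_labels)

new_file = "String name = load();\ndouble ratio = count / divisor;\nname.trim();"
flagged = predict_defective_lines(weights, new_file)
print(flagged)
```

On this toy input, `divisor` is the only token of `new_file` that was seen exclusively in defective training files, so only the line containing it is flagged. A faithful reproduction would replace `risky_tokens` with explanations obtained from LIME for the specific file being inspected.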
