Paper Title
Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit
Paper Authors
Paper Abstract
Adapting Deep Learning (DL) techniques to automate non-trivial coding activities, such as code documentation and defect detection, has been intensively studied recently. Learning to predict code changes is one of the popular and essential lines of investigation. Prior studies have shown that DL techniques such as Neural Machine Translation (NMT) can support meaningful code changes, including bug fixing and code refactoring. However, NMT models may encounter a bottleneck when modeling long sequences and are thus limited in accurately predicting code changes. In this work, we design a Transformer-based approach, considering that the Transformer has proven effective in capturing long-term dependencies. Specifically, we propose a novel model named DTrans. To better incorporate the local structure of code, i.e., statement-level information in this paper, DTrans is designed with dynamically relative position encoding in the multi-head attention of the Transformer. Experiments on benchmark datasets demonstrate that DTrans generates patches more accurately than state-of-the-art methods, improving performance by at least 5.45\%-46.57\% in terms of the exact match metric on different datasets. Moreover, DTrans locates the lines to change with 1.75\%-24.21\% higher accuracy than existing methods.
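The abstract does not detail DTrans's dynamically relative position encoding, but the general idea of injecting relative positions into attention can be sketched in the style of Shaw et al.'s relative position encoding: attention scores receive an extra term from learned embeddings indexed by the clipped distance between query and key positions. The following NumPy sketch is an illustration under that assumption, not the paper's actual statement-aware scheme; all names (`relative_attention`, `rel_emb`, `max_dist`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relative_attention(Q, K, V, rel_emb, max_dist=4):
    """Single-head attention with Shaw-style relative position encoding.

    Q, K, V:  (seq_len, d) query/key/value matrices.
    rel_emb:  (2*max_dist + 1, d) learned relative-position embeddings;
              DTrans would derive such positions from statement-level
              structure, which is not shown here (assumption).
    """
    n, d = Q.shape
    # Clipped relative distance j - i for every (i, j) pair, shifted to [0, 2*max_dist]
    idx = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None],
                  -max_dist, max_dist) + max_dist
    # Content-content term Q K^T plus content-position term Q · rel_emb[j - i]
    scores = Q @ K.T + np.einsum('id,ijd->ij', Q, rel_emb[idx])
    weights = softmax(scores / np.sqrt(d))
    return weights @ V

rng = np.random.default_rng(0)
n, d = 6, 8
out = relative_attention(rng.standard_normal((n, d)),
                         rng.standard_normal((n, d)),
                         rng.standard_normal((n, d)),
                         rng.standard_normal((2 * 4 + 1, d)))
print(out.shape)  # (6, 8)
```

In a multi-head Transformer, each head would carry its own `rel_emb` table; clipping distances to `max_dist` keeps the table size independent of sequence length, which is one reason relative encodings help with the long sequences the abstract mentions.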