Paper Title

Retrieve and Refine: Exemplar-based Neural Comment Generation

Paper Authors

Bolin Wei, Yongmin Li, Ge Li, Xin Xia, Zhi Jin

Paper Abstract

Code comment generation, which aims to automatically generate natural language descriptions for source code, is a crucial task in the field of automatic software development. Traditional comment generation methods use manually crafted templates or information retrieval (IR) techniques to generate summaries for source code. In recent years, neural network-based methods, which leverage the well-known encoder-decoder deep learning framework to learn comment generation patterns from a large-scale parallel code corpus, have achieved impressive results. However, these emerging methods take only code-related information as input. Software reuse is common in the process of software development, which means that the comments of similar code snippets can be helpful for comment generation. Inspired by the IR-based and template-based approaches, in this paper we propose a neural comment generation approach in which we use the existing comments of similar code snippets as exemplars to guide comment generation. Specifically, given a piece of code, we first use an IR technique to retrieve a similar code snippet and treat its comment as an exemplar. Then we design a novel seq2seq neural network that takes the given code, its AST, the similar code, and its exemplar as input, and leverages the information from the exemplar to assist in generating the target comment, based on the semantic similarity between the source code and the similar code. We evaluate our approach on a large-scale Java corpus containing about 2M samples, and the experimental results demonstrate that our model outperforms the state-of-the-art methods by a substantial margin.
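The abstract's first stage can be illustrated with a minimal sketch: given a query code snippet, retrieve the most similar snippet from a parallel code-comment corpus and take its comment as the exemplar. The paper only states that an IR technique is used; the token-level Jaccard similarity and the helper names (`tokenize`, `retrieve_exemplar`) below are illustrative assumptions, not the authors' actual retrieval method.

```python
import re

def tokenize(code):
    # Naive lexical tokenization: split code into identifier/word tokens.
    return re.findall(r"\w+", code)

def retrieve_exemplar(query_code, corpus):
    # corpus: list of (code, comment) pairs.
    # Returns the comment of the most lexically similar snippet and its score.
    q = set(tokenize(query_code))
    best_comment, best_sim = None, -1.0
    for code, comment in corpus:
        c = set(tokenize(code))
        union = q | c
        sim = len(q & c) / len(union) if union else 0.0  # Jaccard similarity
        if sim > best_sim:
            best_comment, best_sim = comment, sim
    return best_comment, best_sim
```

In the full approach, the retrieved comment is not copied verbatim: the seq2seq model conditions on it and weights its contribution by the measured code similarity, so a poor retrieval degrades gracefully to ordinary code-to-comment generation.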
