Paper Title

NS3: Neuro-Symbolic Semantic Code Search

Authors

Shushan Arakelyan, Anna Hakhverdyan, Miltiadis Allamanis, Luis Garcia, Christophe Hauser, Xiang Ren

Abstract

Semantic code search is the task of retrieving a code snippet given a textual description of its functionality. Recent work has focused on using similarity metrics between neural embeddings of text and code. However, current language models are known to struggle with longer, compositional text and with multi-step reasoning. To overcome this limitation, we propose supplementing the query sentence with a layout of its semantic structure. The semantic layout is used to break down the final reasoning decision into a series of lower-level decisions. We use a Neural Module Network architecture to implement this idea. We compare our model, NS3 (Neuro-Symbolic Semantic Search), to a number of baselines, including state-of-the-art semantic code retrieval methods, and evaluate on two datasets: CodeSearchNet and Code Search and Question Answering. We demonstrate that our approach results in more precise code retrieval, and we study the effectiveness of our modular design when handling compositional queries.
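To make the contrast in the abstract concrete, here is a minimal Python sketch. It is not the NS3 implementation: the vectors, the snippet names, the helper names (`rank_by_embedding`, `compositional_score`), and the choice to combine sub-decisions by multiplication are all assumptions made for illustration. The first function shows the embedding-similarity style of retrieval that the abstract attributes to recent work; the second shows one simple way a final decision could be broken down into lower-level decisions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_embedding(query_vec, snippet_vecs):
    """Embedding-similarity retrieval: order snippet names by cosine
    similarity to the query embedding, best match first."""
    scores = {name: cosine(query_vec, vec) for name, vec in snippet_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def compositional_score(sub_scores):
    """Hypothetical combination rule: multiply per-step scores in [0, 1],
    so a snippet must satisfy every part of the query to score highly."""
    result = 1.0
    for score in sub_scores:
        result *= score
    return result


# Toy, hand-written "embeddings" (assumed values, not model outputs).
query = [1.0, 0.0]
snippets = {"load_json": [0.9, 0.1], "save_csv": [0.1, 0.9]}
ranking = rank_by_embedding(query, snippets)

# Two made-up sub-decisions for one snippet, e.g. one score per query phrase.
combined = compositional_score([0.9, 0.8])
```

With a product rule like this, a single low sub-score pulls the overall score down, which is one way to express that every part of a compositional query has to match.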
