Paper title
Type-driven Neural Programming by Example
Paper authors
Abstract
In this thesis we look into programming by example (PBE), the task of finding a program that maps given inputs to given outputs. PBE has traditionally seen a split between formal and neural approaches: formal approaches typically involve deductive techniques such as SAT solvers and types, while neural approaches involve training on sample input-output pairs with their corresponding programs, typically using sequence-based machine learning techniques such as LSTMs [41]. As a result of this split, programming types had yet to be used in neural program synthesis techniques. We propose a way to incorporate programming types into a neural program synthesis approach for PBE. We introduce the Typed Neuro-Symbolic Program Synthesis (TNSPS) method based on this idea, and test it in the functional programming context to empirically verify that type information may help improve generalization in neural synthesizers on limited-size datasets. Our TNSPS model builds upon the existing Neuro-Symbolic Program Synthesis (NSPS), a tree-based neural synthesizer that combines information from the input-output examples and the current program, by further exposing information on the types of those input-output examples, of the grammar production rules, and of the hole that we wish to expand in the program. We further explain how we generated a dataset within our domain, which uses a limited subset of Haskell as the synthesis language. Finally, we discuss several topics of interest that may help take these ideas further. For reproducibility, we release our code publicly.
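To make the PBE setting concrete, the following is a minimal sketch of how type information can prune a program search. It is not the TNSPS model: there is no neural component, and the tiny DSL, the type names, and the functions `well_typed`, `run`, and `synthesize` are all hypothetical, chosen only for illustration. The sketch enumerates pipelines of unary primitives, discards those whose types do not compose from the input type to the output type, and returns the first remaining candidate that agrees with every input-output example.

```python
from itertools import product

# Hypothetical mini-DSL (an assumption for illustration, not the thesis's grammar):
# each primitive is (name, input type, output type, implementation).
PRIMITIVES = [
    ("reverse", "List Int", "List Int", lambda xs: xs[::-1]),
    ("sort",    "List Int", "List Int", sorted),
    ("length",  "List Int", "Int",      len),
    ("sum",     "List Int", "Int",      sum),
    ("negate",  "Int",      "Int",      lambda n: -n),
    ("succ",    "Int",      "Int",      lambda n: n + 1),
]

def well_typed(pipeline, in_type, out_type):
    """Check that the primitives compose left to right from in_type to out_type."""
    current = in_type
    for _, arg_type, result_type, _ in pipeline:
        if arg_type != current:
            return False
        current = result_type
    return current == out_type

def run(pipeline, value):
    """Apply each primitive in order to the input value."""
    for _, _, _, fn in pipeline:
        value = fn(value)
    return value

def synthesize(examples, in_type, out_type, max_len=3):
    """Return (primitive names, number of type-correct candidates executed)."""
    tried = 0
    for n in range(1, max_len + 1):
        for pipeline in product(PRIMITIVES, repeat=n):
            # Type-based pruning: ill-typed candidates are never executed.
            if not well_typed(pipeline, in_type, out_type):
                continue
            tried += 1
            if all(run(pipeline, i) == o for i, o in examples):
                return [name for name, *_ in pipeline], tried
    return None, tried

# Input-output examples for a target of type List Int -> Int.
examples = [([3, 1, 2], -6), ([5], -5), ([], 0)]
program, tried = synthesize(examples, "List Int", "Int")
print(program, tried)
```

In this toy setting the type check plays the role that type information about a hole plays in the abstract: it narrows which expansions are worth considering before any example is evaluated. A neural synthesizer would instead use such information as an input feature when scoring expansions, which this sketch does not attempt to model.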