论文标题

Peano:学习形式的数学推理

Peano: Learning Formal Mathematical Reasoning

论文作者

Poesia, Gabriel, Goodman, Noah D.

论文摘要

一般数学推理在计算上是不可决定的,但人类通常会解决新问题。此外,几个世纪以来的发现很快就被教导到后代。什么结构可以实现这一目标,如何为自动化的数学推理提供信息?我们认为,这两个难题的核心是基础数学基础抽象的结构。我们在汗学院平台上的5个开始代数的案例研究中探讨了这个想法。为了定义计算基础,我们介绍了Peano,Peano是一个定理的环境,在任何时候,有效的动作是有限的。我们使用Peano正式化介绍性代数问题和公理,从而获得明确定义的搜索问题。我们观察到现有的强化学习方法,即符号推理不足以解决更严重的问题。增加了从其解决方案中诱导可重复使用的抽象(“战术”)的能力,使代理可以稳定进步,解决所有问题。此外,这些抽象引起了问题的顺序,这些问题在训练过程中随机看到。收回的命令与专家设计的可汗学院课程有重大协议,而接受回收课程的第二代代理商的学习速度要快得多。这些结果说明了抽象和课程在数学文化传播中的协同作用。

General mathematical reasoning is computationally undecidable, but humans routinely solve new problems. Moreover, discoveries developed over centuries are taught to subsequent generations quickly. What structure enables this, and how might that inform automated mathematical reasoning? We posit that central to both puzzles is the structure of procedural abstractions underlying mathematics. We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform. To define a computational foundation, we introduce Peano, a theorem-proving environment where the set of valid actions at any point is finite. We use Peano to formalize introductory algebra problems and axioms, obtaining well-defined search problems. We observe existing reinforcement learning methods for symbolic reasoning to be insufficient to solve harder problems. Adding the ability to induce reusable abstractions ("tactics") from its own solutions allows an agent to make steady progress, solving all problems. Furthermore, these abstractions induce an order to the problems, seen at random during training. The recovered order has significant agreement with the expert-designed Khan Academy curriculum, and second-generation agents trained on the recovered curriculum learn significantly faster. These results illustrate the synergistic role of abstractions and curricula in the cultural transmission of mathematics.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源