Paper title
Automated Test Generation for Scratch Programs
Paper authors
Abstract
The importance of programming education has led to dedicated educational programming environments, where users visually arrange block-based programming constructs that typically control graphical, interactive, game-like programs. The Scratch programming environment is particularly popular, with more than 70 million registered users at the time of writing. While the block-based nature of Scratch helps learners by preventing syntactic mistakes, there nevertheless remains a need to provide feedback and support in implementing the desired functionality. To support both individual learning and classroom settings, this feedback and support should ideally be provided in an automated fashion, which requires tests to enable dynamic program analysis. The Whisker framework enables automated testing of Scratch programs, but creating these automated tests is challenging. In this paper, we therefore investigate how to automatically generate Whisker tests. This raises important challenges: First, game-like programs are typically randomised, leading to flaky tests. Second, Scratch programs usually consist of animations and interactions with long delays, inhibiting the application of classical test generation approaches. Evaluation on common programming exercises, a random sample of 1000 Scratch user programs, and the 1000 most popular Scratch programs demonstrates that our approach enables Whisker to reliably accelerate test executions. It also shows that, even though many Scratch programs are small and easy to cover, there are many unique challenges for which advanced search-based test generation using many-objective algorithms is needed in order to achieve high coverage.