Paper Title
Unsupposable Test-data Generation for Machine-learned Software
Paper Authors
Paper Abstract
In machine-learning software development, a trained model is typically evaluated using part of an existing dataset as test data. However, when input data has characteristics that differ from the existing data, the model does not always behave as expected. To confirm the model's behavior more rigorously, it is therefore necessary to create data that differs from the existing data and to test the model with it. Such test data includes not only data that developers can suppose (supposable data) but also data they cannot suppose (unsupposable data), and rigorous confirmation requires creating as much unsupposable data as possible. This study therefore proposes a method called "unsupposable test-data generation" (UTG), which gives model developers and testers suggestions for unsupposable data. UTG uses a variational autoencoder (VAE) to generate unsupposable data: latent values with low occurrence probability under the prior distribution of the VAE are acquired and input into the decoder. If the data generated by the decoder includes unsupposable data, the developer can recognize new unsupposable features by referring to it. On the basis of those features, the developer can then create other unsupposable data with the same features. The proposed UTG was applied to the MNIST dataset and the House Sales Price dataset, and the results demonstrate its feasibility.
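The latent-sampling step described in the abstract can be illustrated with a short sketch. It is not the authors' implementation, and it rests on several assumptions: the prior is taken to be a standard normal N(0, I), "low occurrence probability" is approximated by choosing latent vectors with a large norm, the radius bounds (3.0 to 4.0) are arbitrary, and `toy_decoder` is a hypothetical stand-in for a trained VAE decoder.

```python
import math
import random


def log_prior_density(z):
    """Log density of z under an assumed standard normal prior N(0, I)."""
    d = len(z)
    return -0.5 * sum(v * v for v in z) - 0.5 * d * math.log(2.0 * math.pi)


def sample_low_probability_latent(dim, min_radius, max_radius, rng):
    """Draw a latent vector whose norm lies in [min_radius, max_radius].

    Under N(0, I) the density decreases as the norm grows, so a large
    radius corresponds to a low prior density. The bounds are illustrative
    assumptions, not values taken from the paper.
    """
    while True:
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in direction))
        if norm > 0.0:  # guard against a degenerate all-zero draw
            break
    radius = rng.uniform(min_radius, max_radius)
    return [radius * v / norm for v in direction]


def toy_decoder(z):
    """Hypothetical placeholder for a trained VAE decoder."""
    return [math.tanh(v) for v in z]


rng = random.Random(0)
candidates = [sample_low_probability_latent(2, 3.0, 4.0, rng) for _ in range(5)]
generated = [toy_decoder(z) for z in candidates]

for z, x in zip(candidates, generated):
    print([round(v, 2) for v in z], round(log_prior_density(z), 2),
          [round(v, 2) for v in x])
```

In an actual application the decoder outputs would be shown to a developer, who judges whether any of them exhibit features that had not been supposed beforehand.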