Paper Title

Data-driven Approaches to Surrogate Machine Learning Model Development

Authors

Jones, H. Rhys; Mu, Tingting; Popescu, Andrei C.; Sulehman, Yusuf

Abstract

We demonstrate the adaptation of three established methods to the field of surrogate machine learning model development: data augmentation, custom loss functions and transfer learning. Each of these methods has seen widespread use in machine learning; here, we apply them specifically to surrogate machine learning model development. The machine learning model that forms the basis of this work was intended to act as a surrogate for a traditional engineering model used in the UK nuclear industry. The performance of this model has previously been hampered by limited training data. Here, we demonstrate that, through a combination of additional techniques, model performance can be significantly improved. We show that each of the aforementioned techniques has utility in its own right and in combination with the others. However, we see them best applied as part of a transfer learning operation. Five pre-trained surrogate models produced prior to this research were further trained with an augmented dataset and with our custom loss function. Through the combination of all three techniques, we see an improvement of at least $38\%$ in performance across the five models.
