Paper Title

Hybrid localized graph kernel for machine learning energy-related properties of molecules and solids

Authors

Casier, Bastien; da Silva, Mauricio Chagas; Badawi, Michael; Pascale, Fabien; Bučko, Tomáš; Lebègue, Sébastien; Rocca, Dario

Abstract

The coupling of electronic structure and machine learning techniques now serves as a powerful tool to predict chemical and physical properties of a broad range of systems. With the aim of improving the accuracy of predictions, a large number of representations of molecules and solids have been developed for machine learning applications. In this work we propose a novel descriptor based on the notion of a molecular graph. While graphs are widely employed in classification problems in cheminformatics and bioinformatics, they are not often used in regression problems, especially for energy-related properties. Our method is based on a local decomposition of atomic environments and on the hybridization of two kernel functions: a graph kernel contribution that describes the chemical pattern and a Coulomb label contribution that encodes finer details of the local geometry. The accuracy of this new kernel method in energy predictions of molecular and condensed phase systems is demonstrated by considering the popular QM7 and BA10 datasets. These examples show that the hybrid localized graph kernel outperforms traditional approaches such as the smooth overlap of atomic positions (SOAP) and Coulomb matrices.
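
For readers unfamiliar with the Coulomb matrix baseline named in the abstract, the following is a minimal sketch of that standard descriptor, not of the hybrid localized graph kernel itself, whose exact form the abstract does not specify. It assumes the usual definition (diagonal entries 0.5 Z_i^2.4, off-diagonal entries Z_i Z_j / |R_i - R_j|, in atomic units). The function name `coulomb_matrix` and the toy H2 geometry are illustrative assumptions, not taken from the paper.

```python
import math


def coulomb_matrix(charges, positions):
    """Build the Coulomb matrix of a molecule (standard definition).

    charges:   nuclear charges Z_i
    positions: Cartesian coordinates R_i (assumed to be in bohr)
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: nuclear repulsion Z_i * Z_j / |R_i - R_j|
                distance = math.dist(positions[i], positions[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Toy example (hypothetical geometry): two hydrogen atoms 1.4 bohr apart.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
```

A descriptor of this kind is typically passed to a kernel regression model. The approach in the abstract instead combines a graph kernel with a Coulomb-based label on local atomic environments.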
