Decomposing Chemical Space: Applications to the Machine Learning of Atomic Energies

Paper Title

Decomposing Chemical Space: Applications to the Machine Learning of Atomic Energies

Authors

Kjeldal, Frederik Ø., Eriksen, Janus J.

Abstract

We apply a number of atomic decomposition schemes across the standard QM7 dataset -- a small model set of organic molecules at equilibrium geometry -- to inspect the possible emergence of trends among contributions to atomization energies from distinct elements embedded within molecules. Specifically, a recent decomposition scheme of ours based on spatially localized molecular orbitals is compared to alternatives that instead partition molecular energies on account of which nuclei individual atomic orbitals are centred on. We find these partitioning schemes to expose the composition of chemical compound space in very dissimilar ways in terms of the grouping, binning, and heterogeneity of discrete atomic contributions, e.g., those associated with hydrogens bonded to different heavy atoms. Furthermore, unphysical dependencies on the one-electron basis set are found for some, but not all of these schemes. The relevance and importance of these compositional factors for training tailored neural network models based on atomic energies are next assessed. We identify both limitations and possible advantages with respect to contemporary machine learning models and discuss the design of potential counterparts based on atoms and the intrinsic energies of these as the principal decomposition units.
