Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs

Paper Title

Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs

Authors

Rui Jiao, Jiaqi Han, Wenbing Huang, Yu Rong, Yang Liu

Abstract

Pretraining molecular representation models without labels is fundamental to various applications. Conventional methods mainly process 2D molecular graphs and focus solely on 2D tasks, making their pretrained models incapable of characterizing 3D geometry and thus defective for downstream 3D tasks. In this work, we tackle 3D molecular pretraining in a complete and novel sense. In particular, we first propose to adopt an equivariant energy-based model as the backbone for pretraining, which enjoys the merits of fulfilling the symmetry of 3D space. Then we develop a node-level pretraining loss for force prediction, where we further exploit the Riemann-Gaussian distribution to ensure the loss to be E(3)-invariant, enabling more robustness. Moreover, a graph-level noise scale prediction task is also leveraged to further promote the eventual performance. We evaluate our model pretrained from a large-scale 3D dataset GEOM-QM9 on two challenging 3D benchmarks: MD17 and QM9. Experimental results demonstrate the efficacy of our method against current state-of-the-art pretraining approaches, and verify the validity of our design for each proposed component.
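The symmetry property the abstract relies on can be illustrated with a toy example: if an energy depends only on interatomic distances, it is unchanged by rotations and translations (E(3)-invariant on these transforms), and the forces obtained as its negative gradient rotate along with the molecule (equivariant). The sketch below is not the paper's model. The pairwise energy, the parameter `d0`, and all function names are assumptions made purely for illustration.

```python
import math


def energy(coords, d0=1.0):
    """Toy pairwise energy (an assumption, not the paper's model):
    sum over atom pairs of (distance - d0)^2."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(coords[i], coords[j]) - d0) ** 2
    return total


def forces(coords, d0=1.0):
    """Analytic force on each atom, F_i = -dE/dx_i, for the toy energy."""
    n = len(coords)
    f = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            coef = -2.0 * (d - d0) / d
            for k in range(3):
                f[i][k] += coef * (coords[i][k] - coords[j][k])
    return f


def rotate_z(v, theta):
    """Rotate a 3D vector about the z-axis by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]


def transform(coords, theta, shift):
    """Apply a rotation about z followed by a translation to every atom."""
    return [[r + t for r, t in zip(rotate_z(p, theta), shift)] for p in coords]


# Hypothetical four-atom configuration, chosen only so that no atoms coincide.
coords = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.3, 0.9, 0.1], [-0.5, 0.4, 1.1]]
moved = transform(coords, 0.7, [1.0, -2.0, 0.5])

# Invariance: the energy is the same before and after the rigid motion.
energy_gap = abs(energy(moved) - energy(coords))

# Equivariance: forces on the moved molecule equal the rotated original forces.
force_gap = max(
    abs(x - y)
    for a, b in zip(forces(coords), forces(moved))
    for x, y in zip(rotate_z(a, 0.7), b)
)
print(energy_gap < 1e-9, force_gap < 1e-9)
```

A learned equivariant energy-based model aims for the same two properties by construction, which is why a force-prediction loss built on it can be made insensitive to how the molecule is oriented in space.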
