Paper Title

Relational Representation Learning in Visually-Rich Documents

Paper Authors

Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren

Paper Abstract

Relational understanding is critical for a number of visually-rich document (VRD) understanding tasks. Through multi-modal pre-training, recent studies provide comprehensive contextual representations and exploit them as prior knowledge for downstream tasks. In spite of their impressive results, we observe that the widespread relational hints built upon contextual knowledge (e.g., the relation between key/value fields on receipts) have not yet been excavated. To mitigate this gap, we propose DocReL, a Document Relational Representation Learning framework. The major challenge for DocReL is rooted in the variety of relations: from the simplest pairwise relation to complex global structures, supervised training is infeasible because the definition of relation varies, and even conflicts, across different tasks. To handle these unpredictable definitions, we propose a novel contrastive learning task named Relational Consistency Modeling (RCM), which harnesses the fact that existing relations should remain consistent across differently augmented positive views. RCM provides relational representations that better match the needs of downstream tasks, even without any knowledge of the exact definition of relation. DocReL achieves better performance on a wide variety of VRD relational understanding tasks, including table structure recognition, key information extraction, and reading order detection.
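To make the RCM idea concrete, below is a minimal PyTorch sketch of a relational-consistency objective: two augmented positive views of the same document are encoded into per-field embeddings, a pairwise relation head scores every field pair in each view, and a symmetric KL term pushes the two relation structures to agree without any relation labels. The encoder, the bilinear relation head, and the exact loss form here are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a relational-consistency objective in the spirit of RCM.
# NOT the paper's implementation: the pairwise bilinear relation head and the
# symmetric-KL consistency loss are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch import nn


class PairwiseRelationHead(nn.Module):
    """Scores a relation between every pair of field embeddings (bilinear form)."""

    def __init__(self, dim: int):
        super().__init__()
        self.bilinear = nn.Parameter(torch.empty(dim, dim))
        nn.init.xavier_uniform_(self.bilinear)

    def forward(self, fields: torch.Tensor) -> torch.Tensor:
        # fields: (num_fields, dim) -> (num_fields, num_fields) relation logits
        return fields @ self.bilinear @ fields.T


def relational_consistency_loss(view_a: torch.Tensor, view_b: torch.Tensor,
                                head: PairwiseRelationHead) -> torch.Tensor:
    """Encourage the pairwise relation structure of two augmented positive
    views of the same document to agree, without any relation labels."""
    rel_a = F.log_softmax(head(view_a), dim=-1)  # row-wise relation distribution
    rel_b = F.log_softmax(head(view_b), dim=-1)
    # Symmetric KL divergence between the two relation distributions.
    kl_ab = F.kl_div(rel_a, rel_b, log_target=True, reduction="batchmean")
    kl_ba = F.kl_div(rel_b, rel_a, log_target=True, reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)


# Hypothetical usage: `encode` stands in for any multi-modal document encoder
# that returns one embedding per field; `augment` is any label-preserving
# view augmentation (both are assumptions, not APIs from the paper).
#   fields_a = encode(augment(doc))
#   fields_b = encode(augment(doc))
#   loss = relational_consistency_loss(fields_a, fields_b, head)
```

Because the loss only asks the two views' relation structures to match each other, it needs no task-specific relation labels, which mirrors the abstract's claim that RCM works without knowing the downstream definition of relation.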
