Paper Title
Revisiting Document Image Dewarping by Grid Regularization
Authors
Abstract
This paper addresses the problem of document image dewarping, which aims to eliminate the geometric distortion in document images for document digitization. Instead of designing a better neural network to approximate the optical flow fields between the inputs and outputs, we pursue the best readability by taking the text lines and the document boundaries into account from a constrained optimization perspective. Specifically, our proposed method first learns the boundary points and the pixels in the text lines. It then introduces a novel grid regularization scheme based on the simplest observation: the boundaries and text lines in both the horizontal and vertical directions should be preserved after dewarping. To obtain the final forward mapping for dewarping, we solve an optimization problem with our proposed grid regularization. Experiments demonstrate that our proposed approach outperforms prior art by large margins in terms of readability (measured by Character Error Rate and Edit Distance) while maintaining the best image quality on the publicly available DocUNet benchmark.
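The readability metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions: Edit Distance is the Levenshtein distance between the reference text and the OCR output, and Character Error Rate is that distance divided by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative and are not taken from the paper, whose exact evaluation protocol (OCR engine, text normalization) is not described here.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning reference into hypothesis."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]  # deleting i reference characters matches an empty hypothesis
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under these definitions, lower values mean the text recognized from the dewarped image is closer to the ground-truth text; the Character Error Rate can exceed 1.0 when the hypothesis is much longer than the reference.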