Paper Title
Explainable Prediction of Text Complexity: The Missing Preliminaries for Text Simplification
Paper Authors
Abstract
Text simplification reduces the language complexity of professional content for accessibility purposes. End-to-end neural network models have been widely adopted to directly generate the simplified version of input text, usually functioning as a black box. We show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process. The first two steps in this pipeline are often neglected: 1) predicting whether a given piece of text needs to be simplified, and 2) if so, identifying the complex parts of the text. The two tasks can be solved separately, using either lexical or deep learning methods, or solved jointly. By simply applying explainable complexity prediction as a preliminary step, the out-of-sample text simplification performance of state-of-the-art black-box simplification models can be improved by a large margin.
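The two preliminary steps described in the abstract can be sketched as a gating stage in front of a black-box simplifier. This is a minimal illustration, not the paper's actual method: the lexical heuristic here (flagging long words absent from a small frequency list) and all function names are assumptions for demonstration purposes.

```python
# Hypothetical sketch of the two preliminary tasks:
# 1) predict whether a sentence needs simplification at all;
# 2) if so, identify its complex parts before invoking the simplifier.
# The word-length/frequency-list heuristic below is an illustrative stand-in
# for the lexical or deep learning complexity predictors the abstract mentions.

# A tiny stand-in for a word-frequency list of "simple" vocabulary.
COMMON_WORDS = {"the", "a", "of", "to", "and", "in", "is", "on", "for", "cat", "sat", "mat"}

def complex_tokens(sentence, max_len=8):
    """Task 2: flag tokens that are long and absent from the frequency list."""
    return [w for w in sentence.lower().split()
            if len(w) > max_len and w not in COMMON_WORDS]

def needs_simplification(sentence, threshold=0.2):
    """Task 1: predict whether the sentence should be simplified at all."""
    tokens = sentence.split()
    if not tokens:
        return False
    return len(complex_tokens(sentence)) / len(tokens) >= threshold

def simplify(sentence, black_box):
    """Run the (external) black-box simplifier only when the gate fires,
    passing along the identified complex spans for a targeted rewrite."""
    if needs_simplification(sentence):
        return black_box(sentence, complex_tokens(sentence))
    return sentence  # already simple: pass through unchanged
```

In this decomposition, sentences that are already simple bypass the neural model entirely, which is one way the preliminary steps can improve out-of-sample performance: the black-box model is never asked to "simplify" text that needs no change.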