Paper Title

CLUE: A Chinese Language Understanding Evaluation Benchmark

Paper Authors

Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, Zhenzhong Lan

Paper Abstract

The advent of natural language understanding (NLU) benchmarks for English, such as GLUE and SuperGLUE, allows new NLU models to be evaluated across a diverse set of tasks. These comprehensive benchmarks have facilitated a broad range of research and applications in natural language processing (NLP). The problem, however, is that most such benchmarks are limited to English, which has made it difficult to replicate many of the successes in English NLU for other languages. To help remedy this issue, we introduce the first large-scale Chinese Language Understanding Evaluation (CLUE) benchmark. CLUE is an open-ended, community-driven project that brings together 9 tasks spanning several well-established single-sentence/sentence-pair classification tasks, as well as machine reading comprehension, all on original Chinese text. To establish results on these tasks, we report scores using an exhaustive set of current state-of-the-art pre-trained Chinese models (9 in total). We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on Chinese NLU. Our benchmark is released at https://www.CLUEbenchmarks.com.
