Paper Title

Arabic Dialect Identification Using BERT-Based Domain Adaptation

Paper Authors

Beltagy, Ahmad; Wael, Abdelrahman; ElSherief, Omar

Paper Abstract

Arabic is one of the most important languages in the world, and its use continues to grow. With the rise of social media platforms such as Twitter, spoken Arabic dialects have come into wider use. In this paper, we describe our approach to NADI Shared Task 1, which requires building a system that distinguishes among 21 Arabic dialects. We introduce a semi-supervised deep learning approach together with a pre-processing step, and report results on the NADI Shared Task 1 corpus. Our system ranks 4th in the NADI shared task competition, achieving a macro-averaged F1 score of 23.09% with a simple yet efficient approach to distinguishing among 21 Arabic dialects given tweets.
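
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a macro-averaged F1 score is computed in general. It is not the authors' evaluation code: the function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[t] += 1   # t was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); define it as 0 if the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: four tweets, three dialect labels.
toy_true = ["Egypt", "Egypt", "Iraq", "Morocco"]
toy_pred = ["Egypt", "Iraq", "Iraq", "Egypt"]
toy_score = macro_f1(toy_true, toy_pred)
```

Because every class contributes equally to the mean, a rarely occurring dialect affects the score as much as a frequent one, which is one reason macro-averaged scores on a 21-way task can look low compared with accuracy.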
