Paper Title
Language Varieties of Italy: Technology Challenges and Opportunities
Authors
Abstract
Italy is characterized by a one-of-a-kind linguistic diversity landscape in Europe, which implicitly encodes the local knowledge, cultural traditions, artistic expressions and history of its speakers. However, most local languages and dialects in Italy are at risk of disappearing within a few generations. The NLP community has recently begun to engage with endangered languages, including those of Italy. Yet, most efforts assume that these varieties are under-resourced language monoliths with an established written form and homogeneous functions and needs, and thus highly interchangeable with each other and with high-resource, standardized languages. In this paper, we introduce the linguistic context of Italy and challenge the default machine-centric assumptions of NLP for Italy's language varieties. We advocate for a paradigm shift from machine-centric to speaker-centric NLP, and provide recommendations and opportunities for work that prioritizes languages and their speakers over technological advances. To facilitate the process, we finally propose building a local community towards responsible, participatory efforts aimed at supporting the vitality of the languages and dialects of Italy.