Paper Title

KronA: Parameter Efficient Tuning with Kronecker Adapter

Paper Authors

Ali Edalati, Marzieh Tahaei, Ivan Kobyzev, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh

Paper Abstract

Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream task has been a well-known paradigm in Natural Language Processing. However, with the ever-growing size of PLMs, training the entire model on several downstream tasks becomes very expensive and resource-hungry. Recently, different Parameter Efficient Tuning (PET) techniques have been proposed to improve the efficiency of fine-tuning PLMs. One popular category of PET methods is the low-rank adaptation methods, which insert learnable truncated SVD modules into the original model either sequentially or in parallel. However, low-rank decomposition suffers from limited representation power. In this work, we address this problem using the Kronecker product instead of the low-rank representation. We introduce KronA, a Kronecker product-based adapter module for efficient fine-tuning of Transformer-based PLMs. We apply the proposed methods to fine-tuning T5 on the GLUE benchmark and show that incorporating Kronecker-based modules can outperform state-of-the-art PET methods.
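
To make the core idea of the abstract concrete: where LoRA-style adapters learn a low-rank weight update ΔW = B·A of rank r, a Kronecker adapter instead parameterizes the update as ΔW = A ⊗ B, which is not rank-constrained yet still uses very few trainable parameters. The sketch below is a minimal PyTorch illustration of this general idea, not the authors' released implementation; the class name `KroneckerAdapter`, the factor shapes, the zero initialization of B, and the scaling factor are all illustrative assumptions.

```python
# Minimal sketch of a Kronecker-product adapter applied in parallel to a
# frozen linear layer (illustrative only, not the paper's official code).
import torch
import torch.nn as nn


class KroneckerAdapter(nn.Module):
    """Parallel adapter whose weight update is Delta_W = A ⊗ B.

    A has shape (a1, a2) and B has shape (b1, b2), chosen so that
    a1 * b1 == out_features and a2 * b2 == in_features; the Kronecker
    product then matches the frozen weight's shape while training far
    fewer parameters than full fine-tuning.
    """

    def __init__(self, frozen_linear: nn.Linear, a1: int, a2: int, scale: float = 1.0):
        super().__init__()
        out_f, in_f = frozen_linear.out_features, frozen_linear.in_features
        assert out_f % a1 == 0 and in_f % a2 == 0, "factor shapes must divide the weight shape"
        b1, b2 = out_f // a1, in_f // a2

        self.frozen = frozen_linear
        for p in self.frozen.parameters():  # keep the pre-trained weights frozen
            p.requires_grad = False

        self.A = nn.Parameter(torch.randn(a1, a2) * 0.01)
        self.B = nn.Parameter(torch.zeros(b1, b2))  # zero init: adapter starts as a no-op
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_w = torch.kron(self.A, self.B)  # shape (out_features, in_features)
        return self.frozen(x) + self.scale * nn.functional.linear(x, delta_w)


if __name__ == "__main__":
    layer = nn.Linear(768, 768)
    adapter = KroneckerAdapter(layer, a1=32, a2=32)  # A: 32x32, B: 24x24
    y = adapter(torch.randn(4, 768))
    print(y.shape)  # torch.Size([4, 768])
```

In this hypothetical configuration, a 768×768 layer factored as (32×32) ⊗ (24×24) trains only 1,600 adapter parameters instead of 589,824, while the update ΔW is not restricted to a small rank as in truncated-SVD adapters.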
