Paper Title

Efficient Transformers: A Survey

Paper Authors

Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler

Paper Abstract

Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing for example, Transformers have become an indispensable staple in the modern deep learning stack. Recently, a dizzying number of "X-former" models have been proposed - Reformer, Linformer, Performer, Longformer, to name a few - which improve upon the original Transformer architecture, many of which make improvements around computational and memory efficiency. With the aim of helping the avid researcher navigate this flurry, this paper characterizes a large and thoughtful selection of recent efficiency-flavored "X-former" models, providing an organized and comprehensive overview of existing work and models across multiple domains.
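
The efficiency concern the abstract alludes to is the quadratic cost of standard self-attention: computing softmax(QK^T / sqrt(d)) V materializes an n x n score matrix for a sequence of length n, which the surveyed "X-former" variants avoid or approximate. The NumPy sketch below is a rough illustration of this bottleneck, not code from the survey; the function and variable names are our own.

```python
# Minimal sketch (illustrative, not from the paper) of vanilla scaled
# dot-product attention, showing the (n, n) score matrix that makes
# time and memory quadratic in sequence length n.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V for Q, K, V of shape (n, d).

    The intermediate `scores` array has shape (n, n) -- the quadratic
    bottleneck that variants like Reformer, Linformer, Performer, and
    Longformer aim to reduce (via hashing, low-rank projection, kernel
    approximation, or sparse/local attention patterns, respectively).
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                  # (n, n) matrix
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V                             # (n, d) output

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 512, 64
    Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
    out = scaled_dot_product_attention(Q, K, V)
    print(out.shape)  # (512, 64); the hidden (512, 512) scores drive the cost
```

Doubling n quadruples the size of `scores`, which is why long-context workloads motivated the efficiency-focused architectures the survey catalogs.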
