Paper Title

Breaking the Memory Wall for AI Chip with a New Dimension

Authors

Eugene Tam, Shenfei Jiang, Paul Duan, Shawn Meng, Yue Pang, Cayden Huang, Yi Han, Jacke Xie, Yuanjun Cui, Jinsong Yu, Minggui Lu

Abstract

Recent advancements in deep learning have led to the widespread adoption of artificial intelligence (AI) in applications such as computer vision and natural language processing. As neural networks become deeper and larger, AI modeling demands outstrip the capabilities of conventional chip architectures. Memory bandwidth falls behind processing power. Energy consumption comes to dominate the total cost of ownership. Currently, memory capacity is insufficient to support the most advanced NLP models. In this work, we present a 3D AI chip, called Sunrise, with a near-memory computing architecture to address these three challenges. This distributed, near-memory computing architecture allows us to tear down the performance-limiting memory wall with an abundance of data bandwidth. We achieve the same level of energy efficiency on 40nm technology as competing chips on 7nm technology. By moving to technologies similar to those of other AI chips, we project more than ten times the energy efficiency, seven times the performance of the current state-of-the-art chips, and twenty times the memory capacity as compared with the best chip in each benchmark.
