Paper title
Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure
Paper authors
Paper abstract
Significant investments to upgrade and construct large-scale scientific facilities demand commensurate investments in R&D to design algorithms and computing approaches that enable scientific and engineering breakthroughs in the big data era. Innovative Artificial Intelligence (AI) applications have powered transformational solutions for big data challenges in industry and technology; these solutions now drive a multi-billion dollar industry and play an ever-increasing role in shaping human social patterns. As AI continues to evolve into a computing paradigm endowed with statistical and mathematical rigor, it has become apparent that single-GPU solutions for training, validation, and testing are no longer sufficient for the computational grand challenges posed by scientific facilities that produce data at a rate and volume that outstrip the computing capabilities of available cyberinfrastructure platforms. This realization has been driving the confluence of AI and high performance computing (HPC) to reduce time-to-insight and to support a systematic study of domain-inspired AI architectures and optimization schemes for data-driven discovery. In this article we present a summary of recent developments in this field, and describe specific advances that the authors of this article are spearheading to accelerate and streamline the use of HPC platforms to design and apply accelerated AI algorithms in academia and industry.