Paper title
Enabling distributed analysis for ALICE Run 3
Paper authors
Abstract
The ALICE Collaboration has just finished a major detector upgrade that increases the data-taking rate capability by two orders of magnitude and will allow the collection of unprecedented data samples. For example, the analysis input for one month of Pb-Pb collisions amounts to about 5 PB. To enable analysis of such large data samples, the ALICE distributed infrastructure was revised and dedicated tools for Run 3 analysis were created. The first is the $\mathrm{O^2}$ analysis framework, implemented in C++, which builds on a multi-process architecture that exchanges a flat data format through shared memory. The second is the Hyperloop train system for distributed analysis on the Grid and on dedicated analysis facilities, implemented in Java/JavaScript/React. These systems have been commissioned with converted Run 2 data and with the recent LHC pilot beam, and are ready for data analysis at the start of Run 3. This contribution discusses the requirements and the concepts used, and provides details on the actual implementation. The status of operations, in particular with respect to the LHC pilot beam, will also be discussed.