论文标题

Kotlin Coroutines中的快速,可扩展的通道

Fast and Scalable Channels in Kotlin Coroutines

论文作者

Koval, Nikita, Alistarh, Dan, Elizarov, Roman

论文摘要

在过去的十年中,异步编程已广受欢迎:对这种编程模式的支持可以通过库和本地语言实现在许多流行语言中获得,通常以Coroutines或异步/等待结构的形式获得。这个概念不是通过共享内存编程,而是通过消息传递假设隐式同步。启用这种通信的关键数据结构是集合通道。粗略地,集合通道是尺寸零的阻塞队列,因此发送(e)和接收()操作彼此等待,在相遇时进行会合。为了优化消息传递模式,通道通常配备固定尺寸的缓冲区,因此发送(e)-s不会暂停并将元素放入缓冲区,直到超过其容量。该原始性被称为缓冲通道。 本文介绍了Rendezvous和缓冲通道的快速可扩展算法。与现代队列类似,我们的解决方案基于一个无限阵列,其中有两个位置计数器用于发送(e)和接收()操作,并利用无条件的获取和ADD指令来更新它们。然而,该算法需要对这种经典模式进行非平凡的修改,以支持完整的频道语义,例如缓冲和等待请求的取消。我们将解决方案的性能与Kotlin实施的性能以及其他学术建议相比,最多显示9.8倍。为了展示其表现力和性能,我们还将提出的算法集成到标准的Kotlin Coroutines库中,以取代先前的频道实现。

Asynchronous programming has gained significant popularity over the last decade: support for this programming pattern is available in many popular languages via libraries and native language implementations, typically in the form of coroutines or the async/await construct. Instead of programming via shared memory, this concept assumes implicit synchronization through message passing. The key data structure enabling such communication is the rendezvous channel. Roughly, a rendezvous channel is a blocking queue of size zero, so both send(e) and receive() operations wait for each other, performing a rendezvous when they meet. To optimize the message passing pattern, channels are usually equipped with a fixed-size buffer, so send(e)-s do not suspend and put elements into the buffer until its capacity is exceeded. This primitive is known as a buffered channel. This paper presents a fast and scalable algorithm for both rendezvous and buffered channels. Similarly to modern queues, our solution is based on an infinite array with two positional counters for send(e) and receive() operations, leveraging the unconditional Fetch-And-Add instruction to update them. Yet, the algorithm requires non-trivial modifications of this classic pattern, in order to support the full channel semantics, such as buffering and cancellation of waiting requests. We compare the performance of our solution to that of the Kotlin implementation, as well as against other academic proposals, showing up to 9.8x speedup. To showcase its expressiveness and performance, we also integrated the proposed algorithm into the standard Kotlin Coroutines library, replacing the previous channel implementations.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源