Paper Title
A Study of Real-World Data Races in Golang
Paper Authors
Paper Abstract
The concurrent programming literature is rich with tools and techniques for data race detection. Less, however, has been known about real-world, industry-scale deployment, experience, and insights about data races. Golang (Go for short) is a modern programming language that makes concurrency a first-class citizen. Go offers both message passing and shared memory for communicating among concurrent threads. Go is gaining popularity in modern microservice-based systems. Data races in Go stand in the face of its emerging popularity. In this paper, using our industrial codebase as an example, we demonstrate that Go developers embrace concurrency and show how the abundance of concurrency, alongside language idioms and nuances, makes Go programs highly susceptible to data races. Google's Go distribution ships with a built-in dynamic data race detector based on ThreadSanitizer. However, dynamic race detectors pose scalability and flakiness challenges; we discuss various software engineering trade-offs to make this detector work effectively at scale. We have deployed this detector in Uber's 46-million-line Go codebase hosting 2100 distinct microservices, found over 2000 data races, and fixed over 1000 data races, spanning 790 distinct code patches submitted by 210 unique developers over a six-month period. Based on a detailed investigation of these data race patterns in Go, we make seven high-level observations relating to the complex interplay between the Go language paradigm and data races.