Paper Title
Dealing with Data Challenges when Delivering Data-Intensive Software Solutions
Paper Authors
Paper Abstract
The predicted increase in demand for data-intensive solution development is driving the need for software, data, and domain experts to collaborate effectively in multi-disciplinary data-intensive software teams (MDSTs). We conducted a socio-technical grounded theory study through interviews with 24 practitioners in MDSTs to better understand the challenges these teams face when delivering data-intensive software solutions. The interviews provided perspectives across different types of roles, including domain, data, and software experts, and covered different organisational levels, from team members and team managers to executive leaders. We found that the key concern for these teams is dealing with data-related challenges. In this paper, we present the theory of dealing with data challenges, which explains the challenges faced by MDSTs, including gaining access to data, aligning data, understanding data, and resolving data quality issues; the contexts and conditions under which these challenges occur; the causes that lead to the challenges; and the related consequences, such as having to conduct remediation activities, inability to achieve expected outcomes, and lack of trust in the delivered solutions. We also identified contingencies, or strategies, applied to address the challenges, including high-level strategic approaches such as implementing data governance, implementing new tools and techniques such as data quality visualisation and monitoring tools, and building stronger teams by focusing on people dynamics, communication skill development, and cross-skilling. Our findings have direct implications for practitioners and researchers seeking to better understand the landscape of data challenges and how to deal with them.