Paper Title

Learning from Sparse Datasets: Predicting Concrete's Strength by Machine Learning

Authors

Ouyang, Boya; Li, Yuhai; Song, Yu; Wu, Feishu; Yu, Huizi; Wang, Yongzhe; Bauchy, Mathieu; Sant, Gaurav

Abstract

Despite enormous efforts over the last decades to establish the relationship between concrete proportioning and strength, a robust knowledge-based model for accurate concrete strength predictions is still lacking. As an alternative to physical or chemical-based models, data-driven machine learning (ML) methods offer a new solution to this problem. Although this approach is promising for handling the complex, non-linear, non-additive relationship between concrete mixture proportions and strength, a major limitation of ML lies in the fact that large datasets are needed for model training. This is a concern as reliable, consistent strength data is rather limited, especially for realistic industrial concretes. Here, based on the analysis of a large dataset (>10,000 observations) of measured compressive strengths from industrially-produced concretes, we compare the ability of select ML algorithms to "learn" how to reliably predict concrete strength as a function of the size of the dataset. Based on these results, we discuss the competition between how accurate a given model can eventually be (when trained on a large dataset) and how much data is actually required to train this model.
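
The comparison the abstract describes, prediction accuracy as a function of dataset size, is commonly drawn as a learning curve. The sketch below is only an illustration under stated assumptions: the data are synthetic (a single made-up feature, not the industrial concrete dataset), and the two models, a least-squares line and a k-nearest-neighbors regressor, are stand-ins rather than the algorithms evaluated in the paper. All function names (`make_data`, `fit_linear`, `fit_knn`, `learning_curve`) are hypothetical.

```python
import heapq
import math
import random


def make_data(n, rng):
    """Synthetic stand-in data: one feature x, nonlinear target plus noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        y = math.sin(3.0 * x) + rng.gauss(0.0, 0.05)
        data.append((x, y))
    return data


def fit_linear(train):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(train, k=5):
    """k-nearest-neighbors regressor: average the targets of the k closest points."""
    def predict(x):
        nearest = heapq.nsmallest(k, train, key=lambda p: abs(p[0] - x))
        return sum(y for _, y in nearest) / len(nearest)
    return predict


def rmse(predict, test):
    """Root-mean-square error of a predictor on held-out (x, y) pairs."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in test) / len(test))


def learning_curve(sizes, seed=0):
    """Test RMSE of each model when trained on the first n points of a fixed pool."""
    rng = random.Random(seed)
    test = make_data(200, rng)          # fixed held-out set
    pool = make_data(max(sizes), rng)   # training pool, subsampled by prefix
    curve = {}
    for n in sizes:
        train = pool[:n]
        curve[n] = {
            "linear": rmse(fit_linear(train), test),
            "knn": rmse(fit_knn(train), test),
        }
    return curve


if __name__ == "__main__":
    for n, scores in learning_curve([20, 200, 2000]).items():
        print(n, {name: round(value, 3) for name, value in scores.items()})
```

On this toy problem the simple linear model stops improving once it has a modest amount of data, while the more flexible model keeps gaining accuracy as the training set grows. That is the kind of trade-off between final accuracy and data requirements the abstract refers to, though the actual numbers here say nothing about concrete strength.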
