Soft-labeling Strategies for Rapid Sub-Typing

Paper title

Soft-labeling Strategies for Rapid Sub-Typing

Authors

Grant Rosario, David Noever, Matt Ciolino

Abstract

The challenge of labeling large example datasets for computer vision continues to limit the availability and scope of image repositories. This research provides a new method for automated data collection, curation, labeling, and iterative training with minimal human intervention, applied to overhead satellite imagery and object detection. At its new operational scale, the method scanned an entire city (68 square miles) in a grid search and predicted car color from space observations. A partially trained YOLOv5 model served as an initial inference seed to produce progressively more refined model predictions over iterative cycles. Soft labeling here refers to accepting label noise as a potentially valuable augmentation that reduces overfitting and improves generalization to previously unseen test data. The approach takes advantage of a real-world case in which a cropped image of a car can automatically receive sub-type information, white or colorful, from pixel values alone, thus completing an end-to-end pipeline without overdependence on human labor.
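The abstract does not specify how the white-versus-colorful sub-type is derived from pixel values, so the following is only a minimal sketch of one plausible rule, not the authors' method. It assumes a crop is given as rows of (r, g, b) tuples in the 0..255 range, and the function name `label_car_crop` and the thresholds `sat_max` and `val_min` are hypothetical choices made for illustration.

```python
import colorsys


def label_car_crop(pixels, sat_max=0.2, val_min=0.6):
    """Label a crop 'white' or 'colorful' from its pixel values alone.

    pixels: rows of (r, g, b) tuples, each channel in 0..255.
    sat_max, val_min: assumed thresholds, not taken from the paper.
    """
    sat_sum = 0.0
    val_sum = 0.0
    count = 0
    for row in pixels:
        for r, g, b in row:
            # Convert to HSV so saturation and brightness can be read directly.
            _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            sat_sum += s
            val_sum += v
            count += 1
    if count == 0:
        raise ValueError("empty crop")
    mean_sat = sat_sum / count
    mean_val = val_sum / count
    # Low mean saturation and high mean brightness is treated as white;
    # everything else falls into the 'colorful' class in this sketch.
    if mean_sat <= sat_max and mean_val >= val_min:
        return "white"
    return "colorful"


# Synthetic 4x4 crops standing in for detector output.
white_crop = [[(240, 240, 235)] * 4 for _ in range(4)]
red_crop = [[(200, 30, 30)] * 4 for _ in range(4)]
print(label_car_crop(white_crop), label_car_crop(red_crop))
```

A rule of this kind will mislabel some crops (shadows, glare, background pixels inside the box), which is the sort of label noise the soft-labeling approach accepts rather than corrects by hand.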
