Segmentation method of U-net sheet metal engineering drawing based on CBAM attention mechanism

Paper Title

Segmentation method of U-net sheet metal engineering drawing based on CBAM attention mechanism

Authors

Song, Zhiwei; Yao, Hui

Abstract

In the manufacture of heavy industrial equipment, specific units in a welding drawing are first redrawn by hand and the corresponding sheet metal parts are then cut, which is inefficient. To address this, the paper proposes a U-Net-based method for segmenting and extracting specific units from welding engineering drawings. The method lets the cutting equipment segment specific graphic units automatically from visual information and then cut sheet metal parts of the corresponding shape from the segmentation results, which is more efficient than traditional human-assisted cutting.

Two weaknesses of the U-Net network degrade segmentation performance: first, it pays only weak attention to global semantic feature information; second, there is a large dimensional difference between the shallow encoder features and the deep decoder features. Building on the CBAM (Convolutional Block Attention Module) attention mechanism, the paper proposes a U-Net skip-connection structure with attention to improve the network's ability to extract global semantic features. In addition, it designs a U-Net attention model with dual pooling convolution fusion, in which the deep encoder's max pooling + convolution features and the shallow encoder's average pooling + convolution features are fused vertically to reduce the dimensional difference between the shallow encoder and the deep decoder. This dual-pooling convolutional attention skip structure replaces the traditional U-Net skip structure and effectively improves segmentation of specific units in welding engineering drawings.

With VGG16 as the backbone network, experiments show that the model achieves an IoU of 84.72%, an mAP of 86.84%, and an accuracy (Accu) of 99.42% on the welding engineering drawing dataset segmentation task.
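The abstract reports IoU and accuracy without defining them. As a minimal sketch, assuming the usual binary-mask definitions (IoU as intersection over union of the foreground, accuracy as the fraction of correctly labelled pixels), the two metrics can be computed as follows. The function names and the toy masks are hypothetical illustrations, not taken from the paper, and mAP is left out because the abstract does not say how it is computed.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two equally sized binary masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # foreground predicted, foreground in ground truth
            elif p:
                fp += 1      # foreground predicted, background in ground truth
            elif t:
                fn += 1      # background predicted, foreground in ground truth
            else:
                tn += 1      # background in both
    return tp, fp, fn, tn


def iou(pred, target):
    """Intersection over union of the foreground class (assumed definition)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    union = tp + fp + fn
    # Convention chosen here: two empty masks agree perfectly.
    return tp / union if union else 1.0


def pixel_accuracy(pred, target):
    """Fraction of pixels whose label matches the ground truth (assumed definition)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical 4x4 masks: 1 marks a pixel of the unit to be cut, 0 is background.
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(iou(pred, target))             # 3 overlapping pixels out of 5 in the union
print(pixel_accuracy(pred, target))  # 14 of 16 pixels labelled correctly
```

In this toy case accuracy is well above IoU because the true-negative background pixels count toward accuracy but not toward IoU. A gap of the same direction appears in the reported 99.42% accuracy versus 84.72% IoU, which would be consistent with drawings that are mostly background, though the abstract does not state this.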

Paper Title