ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection

Paper Title

ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection

Authors

Yunsheng Ma, Ziran Wang

Abstract

Ensuring traffic safety and mitigating accidents in modern driving is of paramount importance, and computer vision technologies have the potential to significantly contribute to this goal. This paper presents a multi-modal Vision Transformer for Driver Distraction Detection (termed ViT-DD), which incorporates inductive information from training signals related to both distraction detection and driver emotion recognition. Additionally, a self-learning algorithm is developed, allowing for the seamless integration of driver data without emotion labels into the multi-task training process of ViT-DD. Experimental results reveal that the proposed ViT-DD surpasses existing state-of-the-art methods for driver distraction detection by 6.5% and 0.9% on the SFDDD and AUCDD datasets, respectively.
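
The abstract does not spell out how the self-learning step or the multi-task objective is computed. As a rough illustration only, the sketch below assumes a common pseudo-labeling setup: a separate emotion model (hypothetical here) scores each driver image, its most confident class is kept as a pseudo emotion label when the confidence clears a threshold, and the training loss is the distraction cross-entropy plus a weighted emotion cross-entropy. The function names, the threshold, and the loss weight are all assumptions for illustration, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one sample given raw logits and a class index."""
    return -math.log(softmax(logits)[target])


def pseudo_label(teacher_logits, threshold=0.5):
    """Return the teacher's top emotion class, or None if it is not confident.

    Assumption: `teacher_logits` come from some pretrained emotion model;
    the threshold value is illustrative.
    """
    probs = softmax(teacher_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


def multitask_loss(distraction_logits, distraction_label,
                   emotion_logits, emotion_pseudo, emotion_weight=1.0):
    """Distraction loss plus a weighted emotion loss on the pseudo label.

    Assumption: samples without a confident pseudo label contribute
    only the distraction term.
    """
    loss = cross_entropy(distraction_logits, distraction_label)
    if emotion_pseudo is not None:
        loss += emotion_weight * cross_entropy(emotion_logits, emotion_pseudo)
    return loss
```

In this sketch, a driver image that has a distraction label but no emotion label still contributes to both task heads whenever the hypothetical teacher is confident, which is one way data "without emotion labels" could enter multi-task training.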

Paper Title