Paper Title
Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks
Paper Authors
Paper Abstract
Deep learning models have achieved excellent recognition results on large-scale video benchmarks. However, they perform poorly when applied to videos with rare scenes or objects, primarily due to the bias of existing video datasets. We tackle this problem from two angles: algorithm and dataset. On the algorithm side, we propose Spatial-aware Multi-Aspect Debiasing (SMAD), which combines explicit debiasing via multi-aspect adversarial training with implicit debiasing via a spatial actionness reweighting module, to learn a more generic representation that is invariant to non-action aspects. To neutralize the intrinsic dataset bias, we propose OmniDebias, which selectively exploits web data for joint training and achieves higher performance with far less web data. To verify effectiveness, we establish evaluation protocols and conduct extensive experiments on both re-distributed splits of existing datasets and a new evaluation dataset focusing on actions with rare scenes. We also show that the debiased representation generalizes better when transferred to other datasets and tasks.
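The abstract names two algorithmic components: multi-aspect adversarial training for explicit debiasing and spatial actionness reweighting for implicit debiasing. Below is a minimal PyTorch sketch of how such a head could be wired; the names (GradReverse, SMADHead) and the specific pooling and aspect-head choices are illustrative assumptions made here, not the authors' implementation.

import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    # Gradient-reversal layer, a common mechanism for adversarial debiasing.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the backbone.
        return -ctx.lambd * grad_output, None


class SMADHead(nn.Module):
    # Hypothetical head: an action classifier plus one adversarial classifier
    # per non-action aspect (e.g. scene, object), applied on features pooled
    # with a learned per-location actionness weight.
    def __init__(self, feat_dim, num_actions, aspect_sizes):
        super().__init__()
        self.actionness = nn.Conv2d(feat_dim, 1, kernel_size=1)
        self.action_fc = nn.Linear(feat_dim, num_actions)
        self.aspect_fcs = nn.ModuleList(nn.Linear(feat_dim, n) for n in aspect_sizes)

    def forward(self, feat_map, lambd=1.0):
        # feat_map: (B, C, H, W) spatial features from a video backbone.
        w = torch.sigmoid(self.actionness(feat_map))                      # (B, 1, H, W)
        pooled = (feat_map * w).sum(dim=(2, 3)) / w.sum(dim=(2, 3)).clamp(min=1e-6)
        action_logits = self.action_fc(pooled)
        # Aspect heads see gradient-reversed features, pushing the backbone
        # toward representations invariant to scene/object cues.
        rev = GradReverse.apply(pooled, lambd)
        aspect_logits = [fc(rev) for fc in self.aspect_fcs]
        return action_logits, aspect_logits

In training, the action logits would be optimized with the usual classification loss, while each aspect head contributes an adversarial loss on its non-action label (scene, object, etc.); the exact losses and label sources are not specified in the abstract.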