Paper Title

SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos

Authors

Salar Hosseini Khorasgani, Yuxuan Chen, Florian Shkurti

Abstract

Self-supervised methods have significantly closed the gap with end-to-end supervised learning for image classification. In the case of human action videos, however, where both appearance and motion are significant factors of variation, this gap remains significant. One of the key reasons for this is that sampling pairs of similar video clips, a required step for many self-supervised contrastive learning methods, is currently done conservatively to avoid false positives. A typical assumption is that similar clips only occur temporally close within a single video, leading to insufficient examples of motion similarity. To mitigate this, we propose SLIC, a clustering-based self-supervised contrastive learning method for human action videos. Our key contribution is that we improve upon the traditional intra-video positive sampling by using iterative clustering to group similar video instances. This enables our method to leverage pseudo-labels from the cluster assignments to sample harder positives and negatives. SLIC outperforms state-of-the-art video retrieval baselines by +15.4% on top-1 recall on UCF101 and by +5.7% when directly transferred to HMDB51. With end-to-end finetuning for action classification, SLIC achieves 83.2% top-1 accuracy (+0.8%) on UCF101 and 54.5% on HMDB51 (+1.6%). SLIC is also competitive with the state-of-the-art in action classification after self-supervised pretraining on Kinetics400.
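The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_triplet`, the inputs `pseudo_labels` and `video_ids`, and the fallback order are all assumptions made for illustration. The sketch only shows how cluster pseudo-labels could widen positive sampling beyond a single video, by preferring a positive from the same cluster but a different video and drawing the negative from another cluster.

```python
import random


def sample_triplet(pseudo_labels, video_ids, anchor, rng):
    """Return (anchor, positive, negative) clip indices.

    Hypothetical inputs, not taken from the paper:
      pseudo_labels[i] -- cluster id assigned to clip i by some clustering step
      video_ids[i]     -- identifier of the video that clip i was cut from
    """
    anchor_cluster = pseudo_labels[anchor]
    anchor_video = video_ids[anchor]

    # Clips that share the anchor's pseudo-label (excluding the anchor itself).
    same_cluster = [i for i, c in enumerate(pseudo_labels)
                    if c == anchor_cluster and i != anchor]
    # Cross-video positives: same cluster, different source video.
    cross_video = [i for i in same_cluster if video_ids[i] != anchor_video]
    # Conventional intra-video positives, used only as a last resort here.
    intra_video = [i for i, v in enumerate(video_ids)
                   if v == anchor_video and i != anchor]

    # Assumed preference order: cross-video, then same cluster, then same video.
    positives = cross_video or same_cluster or intra_video
    # Negatives come from any other cluster.
    negatives = [i for i, c in enumerate(pseudo_labels) if c != anchor_cluster]

    if not positives or not negatives:
        raise ValueError("no valid positive or negative for this anchor")
    return anchor, rng.choice(positives), rng.choice(negatives)


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [0, 0, 0, 1, 1]            # toy pseudo-labels
    videos = ["a", "a", "b", "c", "c"]  # toy video ids
    print(sample_triplet(labels, videos, anchor=0, rng=rng))
```

In the iterative setting the abstract describes, the pseudo-labels would be recomputed from updated embeddings between training rounds. That loop is omitted here because the abstract does not specify its details.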
