Paper Title
Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams
Paper Authors
Paper Abstract
Devising intelligent agents able to live in an environment and learn by observing their surroundings is a longstanding goal of Artificial Intelligence. From a bare Machine Learning perspective, challenges arise when the agent is prevented from leveraging large fully-annotated datasets and interactions with supervisory signals are instead sparsely distributed over space and time. This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream. The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations. Spatio-temporal stochastic coherence along the attention trajectory, paired with a contrastive term, leads to an unsupervised learning criterion that naturally copes with the considered setting. Unlike most existing works, the learned representations are used for open-set class-incremental classification of each frame pixel, relying on only a few supervisions. Our experiments leverage 3D virtual environments and show that the proposed agents can learn to distinguish objects just by observing the video stream. Inheriting features from state-of-the-art models is not as powerful as one might expect.
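To make the learning criterion more concrete, below is a minimal sketch (in PyTorch) of how a temporal-coherence term over the attention trajectory can be paired with a contrastive term at the pixel level. This is not the paper's exact loss: the function name, the hinge-style contrastive formulation, and the random negative-sampling scheme are illustrative assumptions.

```python
# Minimal sketch (assumed formulation, not the authors' exact loss):
# an unsupervised objective combining (1) coherence of features along the
# attention trajectory across consecutive frames and (2) a contrastive term
# over features sampled away from the attended location.
import torch
import torch.nn.functional as F

def coherence_contrastive_loss(feat_prev, feat_curr, attn_prev, attn_curr,
                               num_negatives=16, margin=1.0):
    """feat_*: (C, H, W) pixel-wise feature maps of two consecutive frames.
    attn_*: (row, col) integer coordinates of the attended location in each frame."""
    c, h, w = feat_curr.shape

    # Features at the attended locations (points on the attention trajectory).
    f_prev = feat_prev[:, attn_prev[0], attn_prev[1]]
    f_curr = feat_curr[:, attn_curr[0], attn_curr[1]]

    # Coherence: consecutive attended features should stay close over time.
    coherence = F.mse_loss(f_curr, f_prev)

    # Contrastive: features at randomly sampled other pixels should lie at
    # least `margin` away from the attended feature (hinge on the distance).
    rows = torch.randint(0, h, (num_negatives,))
    cols = torch.randint(0, w, (num_negatives,))
    negatives = feat_curr[:, rows, cols].t()          # (num_negatives, C)
    dists = torch.norm(negatives - f_curr.unsqueeze(0), dim=1)
    contrastive = torch.clamp(margin - dists, min=0.0).mean()

    return coherence + contrastive
```

In this sketch the two terms share the same feature maps, so minimizing the sum encourages representations that are stable along the attended trajectory while remaining discriminative with respect to the rest of the frame, which is the intuition the abstract describes.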