Paper Title
Real-Time Segmentation Networks should be Latency Aware
Authors
Abstract
As scene segmentation systems reach visually accurate results, many recent papers focus on making these network architectures faster, smaller and more efficient. In particular, studies often aim at designing 'real-time' systems. Achieving this goal is particularly relevant in the context of real-time video understanding for autonomous vehicles and robots. In this paper, we argue that the commonly used performance metric of mean Intersection over Union (mIoU) does not fully capture the information required to estimate the true performance of these networks when they operate in 'real-time'. We propose a change of objective in the segmentation task, together with an associated metric that captures this missing information: the network should predict the future output segmentation map, that is, the map matching the input frame that will be current at the time the network finishes processing. We introduce the associated latency-aware metric, from which we can determine a ranking of networks. We perform latency timing experiments on some recent networks on different hardware and assess the performance of these networks on our proposed task. We propose improvements to scene segmentation networks to better perform on our task by using multi-frame input and increasing capacity in the initial convolutional layers.
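The abstract does not spell out the latency-aware metric, so the following is only a minimal sketch of the idea under stated assumptions: a prediction computed from frame t is scored against the ground truth of frame t + k, where k is the number of frames that elapse during inference. The function names, the rounding of k upward, and the flat-list label format are all hypothetical choices made for illustration, not the paper's definitions.

```python
import math


def frame_offset(latency_s, fps):
    """Frames elapsed during inference (assumption: round up to a whole frame)."""
    return math.ceil(latency_s * fps)


def mean_iou(pred, target, num_classes):
    """mIoU over two flat label lists, skipping classes absent from both."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def latency_aware_miou(predictions, ground_truth, latency_s, fps, num_classes):
    """Score the prediction made from frame t against the labels of frame t + k."""
    k = frame_offset(latency_s, fps)
    scores = [
        mean_iou(predictions[t], ground_truth[t + k], num_classes)
        for t in range(len(predictions) - k)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy sequence: one foreground pixel (class 1) moving across a 4-pixel frame.
frames = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
zero_latency = latency_aware_miou(frames, frames, 0.0, 30, 2)
with_latency = latency_aware_miou(frames, frames, 0.05, 30, 2)
```

In this toy case a network that segments the current frame perfectly scores 1.0 when latency is ignored, but only 0.25 once a 50 ms latency at 30 fps (two frames under the rounding assumption) is taken into account, which is the kind of gap plain mIoU cannot show.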