Paper Title
Coherent Loss: A Generic Framework for Stable Video Segmentation
Paper Authors
Paper Abstract
Video segmentation approaches are of great importance for numerous vision tasks, especially video manipulation for entertainment. Due to the challenges of acquiring high-quality per-frame segmentation annotations and large video datasets covering diverse environments at scale, learned approaches show high overall accuracy on test datasets but lack the strict temporal constraints needed to self-correct jittering artifacts in most practical applications. We investigate how this jittering artifact degrades the visual quality of video segmentation results and propose a metric of temporal stability to evaluate it numerically. In particular, we propose a Coherent Loss within a generic framework that enhances a neural network against jittering artifacts, combining high accuracy with high consistency. Equipped with our method, existing video object/semantic segmentation approaches achieve significantly more satisfactory visual quality on a video human segmentation dataset, which we provide for further research in this field, as well as on DAVIS and Cityscapes.
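The abstract does not give the Coherent Loss formula, but temporal-coherence penalties of this kind are commonly implemented by comparing predictions on consecutive frames, optionally after flow-based alignment. The sketch below is a minimal PyTorch illustration under that assumption; the `warp` helper, the `coherent_loss` name, and the L1 comparison are hypothetical choices for illustration, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def warp(prev, flow):
    """Warp `prev` (B,C,H,W) into the current frame's coordinates using
    backward optical flow (B,2,H,W) given in pixels. This alignment step
    is an assumed preprocessing choice, not specified by the abstract."""
    b, _, h, w = flow.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    # Absolute sampling positions, normalized to [-1, 1] for grid_sample
    # (this normalization matches align_corners=True).
    x = (xs.unsqueeze(0) + flow[:, 0]) / max(w - 1, 1) * 2 - 1
    y = (ys.unsqueeze(0) + flow[:, 1]) / max(h - 1, 1) * 2 - 1
    grid = torch.stack((x, y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(prev, grid, align_corners=True)

def coherent_loss(pred_t, pred_tm1, flow=None):
    """Penalize frame-to-frame disagreement between per-class softmax
    outputs for frames t and t-1; one plausible reading of a temporal
    coherence term, not the paper's exact formulation."""
    p_t = pred_t.softmax(dim=1)
    p_tm1 = pred_tm1.softmax(dim=1)
    if flow is not None:
        p_tm1 = warp(p_tm1, flow)  # align t-1 onto t before comparing
    return F.l1_loss(p_t, p_tm1)
```

In training, such a term would presumably be added to the standard per-frame segmentation loss, e.g. `loss = ce + lambda_coh * coherent_loss(pred_t, pred_tm1, flow)`; the weighting `lambda_coh` is likewise an assumption for illustration.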