Paper title
Self-Supervised Road Layout Parsing with Graph Auto-Encoding
Paper authors
Paper abstract
Aiming for higher-level scene understanding, this work presents a neural network approach that takes a road-layout map in bird's-eye-view as input, and predicts a human-interpretable graph that represents the road's topological layout. Our approach elevates the understanding of road layouts from pixel level to the level of graphs. To achieve this goal, an image-graph-image auto-encoder is utilized. The network is designed to learn to regress the graph representation at its auto-encoder bottleneck. This learning is self-supervised by an image reconstruction loss, without needing any external manual annotations. We create a synthetic dataset containing common road layout patterns and use it for training of the auto-encoder in addition to the real-world Argoverse dataset. By using this additional synthetic dataset, which conceptually captures human knowledge of road layouts and makes this available to the network for training, we are able to stabilize and further improve the performance of topological road layout understanding on the real-world Argoverse dataset. The evaluation shows that our approach exhibits comparable performance to a strong fully-supervised baseline.
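The self-supervision idea in the abstract (a graph at the bottleneck is scored only by how well it reproduces the input image) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `render_graph` and `reconstruction_loss`, the Gaussian line rasterizer used as a stand-in decoder, the mean-squared-error loss, and the toy T-junction layout. The learned encoder that would regress the graph from the image is omitted; two hand-written candidate graphs take its place.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project P onto AB and clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def render_graph(nodes, edges, size=16, sigma=0.05):
    """Rasterize a graph into a size x size image with values in [0, 1].

    nodes: list of (x, y) positions in the unit square (hypothetical format).
    edges: list of (i, j, weight) with node indices and a weight in [0, 1].
    This Gaussian rasterizer is an assumed stand-in for a graph-to-image decoder.
    """
    image = []
    for row in range(size):
        py = (row + 0.5) / size
        line = []
        for col in range(size):
            px = (col + 0.5) / size
            value = 0.0
            for i, j, weight in edges:
                ax, ay = nodes[i]
                bx, by = nodes[j]
                d = point_segment_distance(px, py, ax, ay, bx, by)
                value = max(value, weight * math.exp(-d * d / (2 * sigma * sigma)))
            line.append(value)
        image.append(line)
    return image


def reconstruction_loss(predicted, target):
    """Mean squared error between two images of equal shape."""
    total = 0.0
    count = 0
    for predicted_row, target_row in zip(predicted, target):
        for p, t in zip(predicted_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy bird's-eye-view layout: a T-junction (illustrative data only).
nodes = [(0.1, 0.5), (0.9, 0.5), (0.5, 0.5), (0.5, 0.9)]
true_edges = [(0, 2, 1.0), (2, 1, 1.0), (2, 3, 1.0)]
target = render_graph(nodes, true_edges)

# A candidate graph that misses the side branch reconstructs the image worse.
missing_branch = [(0, 2, 1.0), (2, 1, 1.0)]
loss_true = reconstruction_loss(render_graph(nodes, true_edges), target)
loss_missing = reconstruction_loss(render_graph(nodes, missing_branch), target)
```

In this sketch the graph with the correct topology reproduces the target exactly, while the graph missing a branch incurs a positive loss. That gap is the kind of signal an image reconstruction loss can provide without manual graph annotations.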