Paper Title

SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments

Authors

Evin Pınar Örnek, Aravindhan K Krishnan, Shreekant Gayaka, Cheng-Hao Kuo, Arnie Sen, Nassir Navab, Federico Tombari

Abstract

Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split of the Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalizes to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. We further show competitive results on the real dataset OCID. With its lightweight design (0.4 MB memory requirement), our method is well suited to mobile and robotic applications. Additional DINO features can increase performance at the cost of a higher memory requirement. The dataset split and code are available at https://github.com/evinpinar/supergb-d.
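The merging step described in the abstract can be illustrated with a minimal sketch of greedy agglomerative clustering over a patch adjacency graph. This is not the authors' implementation: the abstract says the merge decision is learned, whereas here a plain cosine similarity between averaged patch features stands in for it. The function names (`cosine`, `agglomerate`), the `threshold` parameter, and the toy features are all hypothetical, and the patches are assumed to have been extracted from geometric cues beforehand.

```python
import math


def cosine(a, b):
    """Stand-in merge score (assumption); the paper learns this decision instead."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def agglomerate(features, sizes, edges, threshold=0.9, score=cosine):
    """Greedily merge adjacent patches while some pair scores above threshold.

    features: dict patch_id -> feature vector (list of floats)
    sizes:    dict patch_id -> pixel count, used to average features on merge
    edges:    iterable of (i, j) pairs of spatially adjacent patches
    Returns a dict mapping each original patch_id to its final segment id.
    """
    feats = {k: list(v) for k, v in features.items()}
    size = dict(sizes)
    members = {k: {k} for k in feats}
    nbrs = {k: set() for k in feats}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)

    while True:
        # Find the best-scoring adjacent pair above the threshold.
        best, pair = threshold, None
        for i in nbrs:
            for j in nbrs[i]:
                if i < j:
                    s = score(feats[i], feats[j])
                    if s > best:
                        best, pair = s, (i, j)
        if pair is None:
            break

        # Merge patch j into patch i with a size-weighted feature average.
        i, j = pair
        total = size[i] + size[j]
        feats[i] = [(a * size[i] + b * size[j]) / total
                    for a, b in zip(feats[i], feats[j])]
        size[i] = total
        members[i] |= members.pop(j)
        # Rewire j's neighbours to point at i.
        for k in nbrs.pop(j):
            nbrs[k].discard(j)
            if k != i:
                nbrs[k].add(i)
                nbrs[i].add(k)
        del feats[j], size[j]

    return {p: seg for seg, ps in members.items() for p in ps}


# Toy example: four patches in a chain, two per underlying object.
features = {0: [1.0, 0.0], 1: [0.99, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.99]}
sizes = {0: 10, 1: 10, 2: 10, 3: 10}
edges = [(0, 1), (1, 2), (2, 3)]
labels = agglomerate(features, sizes, edges)
```

In this toy run, patches 0 and 1 end up in one segment and patches 2 and 3 in another, because the only cross-object edge (1, 2) scores far below the threshold. Restricting candidate merges to adjacent patches keeps each pass proportional to the number of edges rather than to all patch pairs.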
