Stratified Transformer for 3D Point Cloud Segmentation

Paper Title

Stratified Transformer for 3D Point Cloud Segmentation

Authors

Xin Lai, Jianhui Liu, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia

Abstract

3D point cloud segmentation has made tremendous progress in recent years. Most current methods focus on aggregating local features, but fail to directly model long-range dependencies. In this paper, we propose Stratified Transformer that is able to capture long-range contexts and demonstrates strong generalization ability and high performance. Specifically, we first put forward a novel key sampling strategy. For each query point, we sample nearby points densely and distant points sparsely as its keys in a stratified way, which enables the model to enlarge the effective receptive field and enjoy long-range contexts at a low computational cost. Also, to combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information, which facilitates convergence and boosts performance. Besides, we adopt contextual relative position encoding to adaptively capture position information. Finally, a memory-efficient implementation is introduced to overcome the issue of varying point numbers in each window. Extensive experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets. Code is available at https://github.com/dvlab-research/Stratified-Transformer.
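
The stratified key sampling idea in the abstract (nearby points sampled densely, distant points sampled sparsely) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available in the repository linked above. As simplifying assumptions, it uses a small radius as the "nearby" region, a larger radius as the "distant" region, and a fixed index stride as the sparse subsampling rule. The function name `stratified_keys` and the parameters `near_radius`, `far_radius`, and `far_stride` are hypothetical and chosen only for illustration.

```python
import math

def stratified_keys(points, query_idx, near_radius, far_radius, far_stride):
    """Return (dense, sparse) key indices for one query point.

    dense:  every point within near_radius of the query (kept in full).
    sparse: every far_stride-th point whose distance lies in
            (near_radius, far_radius] (a cheap stand-in for downsampling).
    """
    q = points[query_idx]
    dense, ring = [], []
    for i, p in enumerate(points):
        d = math.dist(q, p)  # Euclidean distance, Python 3.8+
        if d <= near_radius:
            dense.append(i)
        elif d <= far_radius:
            ring.append(i)
    sparse = ring[::far_stride]  # keep only a fraction of the distant points
    return dense, sparse

# Ten points on a line, query at the origin.
points = [(float(i), 0.0, 0.0) for i in range(10)]
dense, sparse = stratified_keys(points, 0, near_radius=2.0, far_radius=8.0, far_stride=3)
print(dense, sparse)  # [0, 1, 2] [3, 6]
```

In this toy setup the query attends to all three nearby points but only two of the six distant ones, so the key set spans a large region while staying small. This is the trade-off the abstract describes as enlarging the effective receptive field at a low computational cost.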
