Learning Representations of Endoscopic Videos to Detect Tool Presence Without Supervision

Paper Title

Learning Representations of Endoscopic Videos to Detect Tool Presence Without Supervision

Authors

Li, David Z., Ishii, Masaru, Taylor, Russell H., Hager, Gregory D., Sinha, Ayushi

Abstract

In this work, we explore whether it is possible to learn representations of endoscopic video frames to perform tasks such as identifying surgical tool presence without supervision. We use a maximum mean discrepancy (MMD) variational autoencoder (VAE) to learn low-dimensional latent representations of endoscopic videos and manipulate these representations to distinguish frames containing tools from those without tools. We use three different methods to manipulate these latent representations in order to predict tool presence in each frame. Our fully unsupervised methods can identify whether endoscopic video frames contain tools with average precision of 71.56, 73.93, and 76.18, respectively, comparable to supervised methods. Our code is available at https://github.com/zdavidli/tool-presence/
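
As a rough illustration of the maximum mean discrepancy term that an MMD-VAE uses to pull encoded latent codes toward the prior, the sketch below computes a biased estimate of the squared MMD between two sample sets. This is not the authors' implementation (their code is at the repository linked above). The Gaussian kernel, the `bandwidth` value, the 2-dimensional latent space, the standard normal prior, and the function names `gaussian_kernel` and `mmd_squared` are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d)."""
    # Pairwise squared Euclidean distances, shape (n, m).
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical data: samples from an assumed standard normal prior and two
# sets of stand-in "latent codes", one matching the prior and one shifted.
rng = np.random.default_rng(0)
prior = rng.standard_normal((200, 2))
codes_close = rng.standard_normal((200, 2))
codes_far = rng.standard_normal((200, 2)) + 3.0

mmd_close = mmd_squared(codes_close, prior)
mmd_far = mmd_squared(codes_far, prior)
print(f"MMD^2 (codes match prior):   {mmd_close:.4f}")
print(f"MMD^2 (codes shifted by 3):  {mmd_far:.4f}")
```

In a training loop, a term like this would typically be added to the reconstruction loss, so that codes whose distribution drifts away from the prior (the shifted case here) are penalized more heavily than codes that already match it.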

