Paper Title
FineHand: Learning Hand Shapes for American Sign Language Recognition
Paper Authors
Paper Abstract
American Sign Language (ASL) recognition is a difficult gesture recognition problem, characterized by fast, highly articulated gestures. These are composed of arm movements with different hand shapes, facial expressions, and head movements. Among these components, hand shape is a vital and often the most discriminative part of a gesture. In this work, we present an approach for effectively learning hand shape embeddings that are discriminative for ASL gestures. For hand shape recognition, our method uses a mix of manually labelled hand shapes and high-confidence predictions to train a deep convolutional neural network (CNN). The sequential gesture component is captured by a recurrent neural network (RNN) trained on the embeddings learned in the first stage. We demonstrate that higher-quality hand shape models can significantly improve the accuracy of final video gesture classification under challenging conditions with a variety of speakers, different illumination, and significant motion blur. We compare our model to alternative approaches that exploit different modalities and representations of the data, and show improved video gesture recognition accuracy on the GMU-ASL51 benchmark dataset.
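To make the two-stage design in the abstract concrete, below is a minimal, hypothetical PyTorch sketch: a CNN classifies cropped hand images and exposes its penultimate-layer features as hand-shape embeddings, and an RNN consumes the per-frame embeddings to classify the whole video. The ResNet-18 backbone, layer sizes, the hand-shape class count, and the training details (manual labels mixed with high-confidence pseudo-labels) are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a FineHand-style two-stage pipeline.
# Stage 1: CNN learns per-frame hand-shape embeddings.
# Stage 2: RNN classifies the gesture from the embedding sequence.
import torch
import torch.nn as nn
from torchvision import models


class HandShapeCNN(nn.Module):
    """CNN trained on cropped hand images (manual labels plus high-confidence
    pseudo-labels in the paper); the penultimate features are the embedding."""

    def __init__(self, num_hand_shapes: int = 40, embed_dim: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=None)   # assumed backbone choice
        backbone.fc = nn.Identity()                # expose the 512-d features
        self.backbone = backbone
        self.classifier = nn.Linear(embed_dim, num_hand_shapes)

    def forward(self, x):                          # x: (B, 3, H, W)
        emb = self.backbone(x)                     # (B, embed_dim)
        return self.classifier(emb), emb


class GestureRNN(nn.Module):
    """Recurrent network over per-frame embeddings for video-level gesture
    classification (e.g. the 51 gesture classes of GMU-ASL51)."""

    def __init__(self, embed_dim: int = 512, hidden: int = 256, num_gestures: int = 51):
        super().__init__()
        self.rnn = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_gestures)

    def forward(self, seq):                        # seq: (B, T, embed_dim)
        _, (h_n, _) = self.rnn(seq)
        return self.head(h_n[-1])                  # logits: (B, num_gestures)


if __name__ == "__main__":
    cnn, rnn = HandShapeCNN(), GestureRNN()
    frames = torch.randn(2, 16, 3, 224, 224)       # 2 videos, 16 hand crops each
    b, t = frames.shape[:2]
    _, emb = cnn(frames.flatten(0, 1))             # per-frame hand-shape embeddings
    logits = rnn(emb.view(b, t, -1))               # video-level gesture logits
    print(logits.shape)                            # torch.Size([2, 51])
```

The point of the split is that the frame-level embedding network can be trained on a much larger, cheaply expanded pool of hand crops than the comparatively scarce labelled gesture videos, and the sequence model then only has to learn temporal structure over compact, already discriminative features.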