Paper Title

Deep learning based Hand gesture recognition system and design of a Human-Machine Interface

Paper Authors

Abir Sen, Tapas Kumar Mishra, Ratnakar Dash

Paper Abstract

In this work, a human-computer interface (HCI) based on a real-time hand gesture recognition system is presented. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) use of five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) building an interactive human-machine interface (HMI), (5) development of a gesture-controlled virtual mouse, and (6) use of a Kalman filter to estimate the hand position, based on which the smoothness of the pointer motion is improved. In our work, five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-V1) and a ViT have been employed to classify hand gesture images. Two multi-class datasets (one public and one custom) have been used to validate the models. Comparing the models' performance, Inception-V1 shows significantly better classification performance than the other four CNN models and the ViT in terms of accuracy, precision, recall, and F-score values. We have also extended this system to control several desktop applications (such as the VLC player, an audio player, file management, and playing the 2D Super-Mario-Bros game) with different customized gesture commands in real-time scenarios. The average speed of this system reaches 25 fps (frames per second), which meets the requirements of a real-time scenario. The proposed gesture control system achieves an average response time in the millisecond range for each control, which makes it suitable for real-time use. This model (prototype) will benefit physically disabled people in interacting with desktop computers.
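
The abstract states that a Kalman filter estimates the hand position so that the virtual-mouse pointer moves smoothly, but gives no implementation details. The snippet below is a minimal sketch of one common way to do this, assuming a constant-velocity Kalman filter over noisy 2D hand detections arriving once per frame; the class name, time step, and noise covariances are illustrative placeholders, not values from the paper.

```python
import numpy as np


class CursorKalmanFilter:
    """Constant-velocity Kalman filter for smoothing noisy 2D hand positions.

    Illustrative sketch, not the authors' implementation: the state is
    [x, y, vx, vy], the measurement is the raw (x, y) hand detection, and the
    noise covariances are hand-picked placeholders.
    """

    def __init__(self, dt=1.0 / 25.0, process_noise=1e-2, measurement_noise=1e-1):
        # State transition: position advances by velocity * dt each frame
        # (dt = 1/25 s matches the 25 fps figure quoted in the abstract).
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        # We observe only the position, not the velocity.
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = process_noise * np.eye(4)       # process (motion) noise
        self.R = measurement_noise * np.eye(2)   # measurement (detector) noise
        self.x = np.zeros(4)                     # state estimate [x, y, vx, vy]
        self.P = np.eye(4)                       # state covariance

    def update(self, measured_xy):
        """Fuse one noisy (x, y) detection and return the smoothed position."""
        # Predict step.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct step with the new measurement.
        z = np.asarray(measured_xy, dtype=float)
        innovation = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R        # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]


if __name__ == "__main__":
    # Feeding a few jittery detections: the filtered output drifts smoothly
    # instead of jumping with every measurement.
    kf = CursorKalmanFilter()
    for raw in [(100, 100), (108, 97), (115, 104), (121, 99)]:
        print(kf.update(raw))
```

The smoothed (x, y) returned by `update` would then be mapped to screen coordinates to drive the pointer; the actual detection model, coordinate mapping, and filter tuning used in the paper may differ.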
