Paper Title

Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning

Authors

Chen, Jieshan; Chen, Chunyang; Xing, Zhenchang; Xu, Xiwei; Zhu, Liming; Li, Guoqiang; Wang, Jinshui

Abstract

According to the World Health Organization (WHO), an estimated 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, integrating this minority into society is a challenging problem. The recent rise of smartphones offers a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can use the screen reader embedded in the mobile operating system to read the content of each screen within an app, and use gestures to interact with the phone. However, screen readers work only if developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge of this minority's needs. Even when developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no visual impairment themselves. To overcome these challenges, we develop a deep-learning-based model, called LabelDroid, that automatically predicts the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those from real Android developers.
