TransASL: A Smart Glass based Comprehensive ASL Recognizer in Daily Life

Yincheng Jin, Seokmin Choi, Yang Gao, Jiyang Li, Zhengxiong Li, Zhanpeng Jin
{"title":"TransASL: A Smart Glass based Comprehensive ASL Recognizer in Daily Life","authors":"Yincheng Jin, Seokmin Choi, Yang Gao, Jiyang Li, Zhengxiong Li, Zhanpeng Jin","doi":"10.1145/3581641.3584071","DOIUrl":null,"url":null,"abstract":"Sign language is a primary language used by deaf and hard-of-hearing (DHH) communities. However, existing sign language translation solutions primarily focus on recognizing manual markers. The non-manual markers, such as negative head shaking, question markers, and mouthing, are critical grammatical and semantic components of sign language for better usability and generalizability. Considering the significant role of non-manual markers, we propose the TransASL, a real-time, end-to-end system for sign language recognition and translation. TransASL extracts feature from both manual markers and non-manual markers via a customized eyeglasses-style wearable device with two parallel sensing modalities. Manual marker information is collected by two pairs of outward-facing microphones and speakers mounted to the legs of the eyeglasses. In contrast, non-manual marker information is acquired from a pair of inward-facing microphones and speakers connected to the eyeglasses. Both manual and non-manual marker features undergo a multi-modal, multi-channel fusion network and are eventually recognized as comprehensible ASL content. We evaluate the recognition performance of various sign language expressions at both the word and sentence levels. Given 80 frequently used ASL words and 40 meaningful sentences consisting of manual and non-manual markers, TransASL can achieve the WER of 8.3% and 7.1%, respectively. 
Our proposed work reveals a great potential for convenient ASL recognition in daily communications between ASL signers and hearing people.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 28th International Conference on Intelligent User Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3581641.3584071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Sign language is a primary language used by deaf and hard-of-hearing (DHH) communities. However, existing sign language translation solutions focus primarily on recognizing manual markers. Non-manual markers, such as negative head shaking, question markers, and mouthing, are critical grammatical and semantic components of sign language, and capturing them improves both usability and generalizability. Considering the significant role of non-manual markers, we propose TransASL, a real-time, end-to-end system for sign language recognition and translation. TransASL extracts features from both manual and non-manual markers via a customized eyeglasses-style wearable device with two parallel sensing modalities. Manual marker information is collected by two pairs of outward-facing microphones and speakers mounted on the legs of the eyeglasses. In contrast, non-manual marker information is acquired from a pair of inward-facing microphones and speakers attached to the eyeglasses. Both manual and non-manual marker features pass through a multi-modal, multi-channel fusion network and are ultimately recognized as comprehensible ASL content. We evaluate the recognition performance of various sign language expressions at both the word and sentence levels. Given 80 frequently used ASL words and 40 meaningful sentences consisting of manual and non-manual markers, TransASL achieves word error rates (WER) of 8.3% and 7.1%, respectively. Our proposed work reveals great potential for convenient ASL recognition in daily communication between ASL signers and hearing people.
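The abstract describes fusing features from the two sensing branches (outward-facing channels for manual markers, inward-facing channels for non-manual markers) before classification. The paper does not spell out the network's layers here, so the following is only an illustrative sketch of the generic late-fusion pattern: each branch produces a feature vector, the vectors are concatenated, and a classifier head maps the fused representation to the 80-word vocabulary. All dimensions and the linear head are hypothetical.

```python
import numpy as np

# Hypothetical feature dimensions for illustration only.
rng = np.random.default_rng(0)
manual_feat = rng.normal(size=(1, 64))     # from outward-facing acoustic channels
nonmanual_feat = rng.normal(size=(1, 32))  # from inward-facing acoustic channels

# Late fusion: concatenate the two modality embeddings along the feature axis.
fused = np.concatenate([manual_feat, nonmanual_feat], axis=1)  # shape (1, 96)

# Hypothetical linear classifier head over the 80-word ASL vocabulary.
W = rng.normal(size=(96, 80))
logits = fused @ W
pred = int(np.argmax(logits))
print(fused.shape, pred)
```

In practice a system like this would use learned encoders and a trained fusion network rather than random weights; the sketch only shows how the two marker streams can be combined into a single prediction.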
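The reported metric, word error rate (WER), is the word-level Levenshtein distance between the recognized output and the reference, normalized by reference length. A minimal sketch of the standard computation (not code from the paper):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / len(reference)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("i want coffee", "i want tea"))  # one substitution over three words
```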