Vision-Based Interfaces for Character-Based Text Entry: Comparison of Errors and Error Correction Properties of Eye Typing and Head Typing
Y. Gizatdinova, O. Špakov, O. Tuisku, Matthew Turk, Veikko Surakka
Adv. Hum. Comput. Interact., 2023, 8855764:1-23. DOI: 10.1155/2023/8855764
We examined two vision-based interfaces (VBIs) for performance and user experience during character-based text entry with an on-screen virtual keyboard. The head-based VBI uses head motion to steer the computer pointer and mouth-opening gestures to select keyboard keys. The gaze-based VBI uses gaze for pointing at keys and an adjustable dwell time for key selection. The results showed that after three sessions (45 min of typing in total), able-bodied novice participants (N = 34) typed significantly more slowly but significantly more accurately with the head-based VBI than with the gaze-based VBI. Analyzing errors and corrective actions relative to the spatial layout of the keyboard revealed differences in the participants' error correction behavior between the two interfaces. We estimated the error correction cost for both interfaces and suggest implications for the future use and improvement of VBIs for hands-free text entry.
{"title":"Vision-Based Interfaces for Character-Based Text Entry: Comparison of Errors and Error Correction Properties of Eye Typing and Head Typing","authors":"Y. Gizatdinova, O. Špakov, O. Tuisku, Matthew Turk, Veikko Surakka","doi":"10.1155/2023/8855764","DOIUrl":"https://doi.org/10.1155/2023/8855764","url":null,"abstract":"We examined two vision-based interfaces (VBIs) for performance and user experience during character-based text entry using an on-screen virtual keyboard. Head-based VBI uses head motion to steer the computer pointer and mouth-opening gestures to select the keyboard keys. Gaze-based VBI utilizes gaze for pointing at the keys and an adjustable dwell for key selection. The results showed that after three sessions (45 min of typing in total), able-bodied novice participants (N = 34) typed significantly slower yet yielded significantly more accurate text with head-based VBI with gaze-based VBIs. The analysis of errors and corrective actions relative to the spatial layout of the keyboard revealed a difference in the error correction behavior of the participants when typing using both interfaces. We estimated the error correction cost for both interfaces and suggested implications for the future use and improvement of VBIs for hands-free text entry.","PeriodicalId":192934,"journal":{"name":"Adv. Hum. Comput. Interact.","volume":"47 1","pages":"8855764:1-8855764:23"},"PeriodicalIF":0.0,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139248982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning Methods for Arabic Autoencoder Speech Recognition System for Electro-Larynx Device
Z. J. M. Ameen, A. Kadhim
Adv. Hum. Comput. Interact., 2023. DOI: 10.1155/2023/7398538

Recent advances in speech recognition have achieved performance comparable to that of human transcribers, but not for all spoken languages; Arabic is one of them. Arabic speech recognition is limited by the lack of suitable datasets, although artificial intelligence algorithms have shown promising capabilities for the task. Arabic is the official language of 22 countries, with an estimated 400 million speakers worldwide. Speech disabilities have been a growing problem in recent decades, even among children, and devices such as the Servox Digital Electro-Larynx (EL) can generate speech for affected people. In this research, we developed an autoencoder combining long short-term memory (LSTM) and gated recurrent unit (GRU) models to recognize signals recorded from the Servox Digital EL. The proposed framework consists of three steps: denoising, feature extraction, and Arabic speech recognition. Experimental results show 95.31% accuracy for Arabic speech recognition with the proposed model. We evaluated different combinations of LSTM and GRU layers to construct the best autoencoder; a rigorous evaluation indicates better performance when GRUs are used in both the encoder and decoder structures. The proposed model achieved a 4.69% word error rate (WER). These results confirm that the model can be used to develop a real-time application for recognizing common spoken Arabic words.
{"title":"Deep Learning Methods for Arabic Autoencoder Speech Recognition System for Electro-Larynx Device","authors":"Z. J. M. Ameen, A. Kadhim","doi":"10.1155/2023/7398538","DOIUrl":"https://doi.org/10.1155/2023/7398538","url":null,"abstract":"Recent advances in speech recognition have achieved remarkable performance comparable with human transcribers’ abilities. But this significant performance is not the same for all the spoken languages. The Arabic language is one of them. Arabic speech recognition is bounded to the lack of suitable datasets. Artificial intelligence algorithms have shown promising capabilities for Arabic speech recognition. Arabic is the official language of 22 countries, and it has been estimated that 400 million people speak the Arabic language worldwide. Speech disabilities have been one of the expanding problems in the last decades, even in kids. Some devices can be used to generate speech for those people. One of these devices is the Servox Digital Electro-Larynx (EL). In this research, we developed an autoencoder with a combination of long short-term memory (LSTM) and gated recurrent units (GRU) models to recognize recorded signals from Servox Digital EL Electro-Larynx. The proposed framework consisted of three steps: denoising, feature extraction, and Arabic speech recognition. The experimental results show 95.31% accuracy for Arabic speech recognition with the proposed model. In this research, we evaluated different combinations of LSTM and GRU for constructing the best autoencoder. A rigorous evaluation process indicates better performance with the use of GRU in both encoder and decoder structures. The proposed model achieved a 4.69% word error rate (WER). Experimental results confirm that the proposed model can be used for developing a real-time app to recognize common Arabic spoken words.","PeriodicalId":192934,"journal":{"name":"Adv. Hum. Comput. Interact.","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124778583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CHR vs. Human-Computer Interaction Design for Emerging Technologies: Two Case Studies
Sharefa Murad, A. Qusef, Muhanna A. Muhanna
Adv. Hum. Comput. Interact., 2023. DOI: 10.1155/2023/8710638

Recent years have seen a surge of interest in the multifaceted topic of human-computer interaction (HCI). Since the advent of the Fourth Industrial Revolution, the significance of HCI for safety risk management has only grown, yet little attention has been paid to designing human-computer interaction for identifying potential hazards in buildings. After conducting a comprehensive literature review, we developed a study framework for the use of HCI in the identification of construction-related hazards (CHR-HCI). Future studies will focus on the intersection of computer vision, VR, and ergonomics. In this research, we built a theoretical foundation from the findings and connections of past studies and offer concrete recommendations for improving HCI in hazard identification. We also analyzed two case studies from the CHR-HCI domain: wearable vibration-based systems and context-aware navigation.
{"title":"CHR vs. Human-Computer Interaction Design for Emerging Technologies: Two Case Studies","authors":"Sharefa Murad, A. Qusef, Muhanna A. Muhanna","doi":"10.1155/2023/8710638","DOIUrl":"https://doi.org/10.1155/2023/8710638","url":null,"abstract":"Recent years have seen a surge in interest in the multifaceted topic of human-computer interaction (HCI). Since the advent of the Fourth Industrial Revolution, the significance of human-computer interaction in the field of safety risk management has only grown. There has not been a lot of focus on developing human-computer interaction for identifying potential hazards in buildings. After conducting a comprehensive literature review, we developed a study framework for the use of human-computer interaction in the identification of construction-related hazards (CHR-HCI). Future studies will focus on the intersection of computer vision, VR, and ergonomics. In this research, we have built a theoretical foundation for past studies’ findings and connections and offered concrete recommendations for the improvement of HCI in danger identification in the future. Moreover, we analyzed two cases studies related to the domain of CHR-HCI in terms of wearable vibration-based systems and context aware navigation.","PeriodicalId":192934,"journal":{"name":"Adv. Hum. Comput. Interact.","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124634751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes)
Majd E. Tannous, Wassim Ramadan, Mohanad A. Rajab
Adv. Hum. Comput. Interact., 2023. DOI: 10.1155/2023/6044007

Many unstructured documents contain segments on specific topics. Extracting these segments and identifying their topics gives direct access to the required information and can improve the quality of many NLP applications, such as information extraction, information retrieval, summarization, and question answering. Resumes (CVs) are unstructured documents with diverse formats, containing segments such as personal information, experience, and education. Manually processing resumes to find the most suitable candidates for a particular job is a difficult task, and with growing data volumes it has become necessary to process resumes by computer to save time and effort. This research presents TSHD, a new algorithm for topic segmentation based on headings detection, which we apply to extract resume segments and identify their topics. The proposed TSHD algorithm is accurate and addresses many weaknesses of previous studies. Evaluation results show a very high F1 score (about 96%) and a very low segmentation error (about 2%). The algorithm can easily be adapted to other textual domains whose segments contain headings.
{"title":"TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes)","authors":"Majd E. Tannous, Wassim Ramadan, Mohanad A. Rajab","doi":"10.1155/2023/6044007","DOIUrl":"https://doi.org/10.1155/2023/6044007","url":null,"abstract":"Many unstructured documents contain segments with specific topics. Extracting these segments and identifying their topics helps to access the required information directly. This can improve the quality of many NLP applications such as information extraction, information retrieval, summarization, and question answering. Resumes (CVs) are unstructured documents that have diverse formats. They contain various segments such as personal information, experience, and education. Manually processing resumes to find the most suitable candidates for a particular job is a difficult task. Due to the increased amount of data, it has become very necessary to manipulate resumes by computer to save time and effort. This research presents a new algorithm named TSHD for topic segmentation based on headings detection. We apply the algorithm to extract resume segments and identify their topics. The proposed TSHD algorithm is accurate and addresses many weaknesses in previous studies. Evaluation results show a very high F1 score (about 96%) and a very low segmentation error (about 2%). The algorithm can be easily adapted to deal with other textual domains that contain headings in their segments.","PeriodicalId":192934,"journal":{"name":"Adv. Hum. Comput. Interact.","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129415618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Riera, Sebastià Verger, P. Montoya, Francisco J. Perales López
Chronic pain degrades the quality of life of those who suffer from it. There is a clear need to investigate alternative and complementary methods to pharmacological treatment for alleviating chronic pain, and in recent years virtual reality and binaural tones have become topics of interest in this field. This study analyzes what the combination of these two techniques contributes for pediatric patients with chronic pain. Using a mixed pre- and posttest experimental methodology, we collected psychophysiological responses (heart rate and galvanic skin response) and pain perception ratings during and after interaction with this technology. Physiological data and answers to the Pediatric Pain Questionnaire (PPQ) were collected from a sample of n = 13 healthy participants and n = 9 pediatric patients with chronic pain. The results show a significant difference between baseline and after applying virtual reality and binaural beats, md = 1.205 (t = 3.32; p <
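The baseline-versus-post comparison reported above (the abstract is cut off mid-statistic) is the kind of result a paired t-test yields. A sketch of that computation follows; the scores are fabricated placeholders, not the study's data.

```python
# Sketch of the reported comparison: a paired t-test between baseline pain
# ratings and ratings after the VR + binaural-beats session. All numbers
# below are fabricated placeholders, NOT the study's data.
from scipy import stats

baseline = [5.0, 6.5, 4.0, 7.0, 5.5, 6.0, 4.5, 5.0, 6.0]  # hypothetical PPQ scores
post_vr  = [4.0, 5.0, 3.5, 5.5, 4.5, 5.0, 3.0, 4.0, 5.0]

t, p = stats.ttest_rel(baseline, post_vr)
md = sum(b - a for b, a in zip(baseline, post_vr)) / len(baseline)  # mean difference
print(f"md = {md:.3f}, t = {t:.2f}, p = {p:.4f}")
```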