Human or robot? Exploring different avatar appearances to increase perceived security in shared automated vehicles
Martina Schuß, Luca Pizzoni, Andreas Riener
Pub Date: 2024-08-30 · DOI: 10.1007/s12193-024-00436-x
Shared Automated Vehicles (SAVs) promise to make automated mobility accessible to a wide range of people while reducing air pollution and improving traffic flow. In the future, these vehicles will operate with no human driver on board, which poses several challenges that may differ across cultural contexts, making one-size-fits-all solutions difficult. A promising substitute for the driver could be Digital Companions (DCs), i.e., conversational agents presented on a screen inside the vehicle. We conducted interviews with Colombian participants and workshops with German and Korean participants, and derived two design concepts for DCs as an alternative to the human driver in SAVs: a human-like and a robot-like companion. We compared these two concepts to a baseline without a companion using a scenario-based online questionnaire with participants from Colombia (N = 57), Germany (N = 50), and Korea (N = 29), measuring anxiety, security, trust, risk, control, threat, and user experience. Compared with the baseline, both DCs were perceived significantly more positively. While we found a preference for the human-like DC among all participants, this preference was strongest among Colombians, whereas Koreans showed the greatest openness towards the robot-like DC.
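The abstract does not spell out the statistical tests used; as a rough illustration of the within-subject comparison it describes (baseline vs. human-like vs. robot-like DC), the following Python sketch runs a Friedman test with Bonferroni-corrected pairwise follow-ups on fabricated rating data. Sample sizes, scales, and effects are invented for illustration only.

```python
# Hypothetical sketch: within-subject comparison of three conditions
# (baseline, human-like DC, robot-like DC) on one rating scale.
# Friedman + pairwise Wilcoxon is one standard choice for ordinal
# repeated measures; the paper's actual analysis is not given here.
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
n = 57  # e.g., the Colombian sample size
baseline = rng.integers(1, 6, n)                       # 5-point ratings, fabricated
human_dc = np.clip(baseline + rng.integers(0, 3, n), 1, 5)
robot_dc = np.clip(baseline + rng.integers(0, 2, n), 1, 5)

stat, p = friedmanchisquare(baseline, human_dc, robot_dc)
print(f"Friedman chi2={stat:.2f}, p={p:.4f}")
if p < 0.05:  # follow up with Bonferroni-corrected pairwise tests
    for name, cond in [("human-like", human_dc), ("robot-like", robot_dc)]:
        w, pw = wilcoxon(baseline, cond)
        print(f"baseline vs {name}: corrected p={min(pw * 2, 1.0):.4f}")
```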
{"title":"Human or robot? Exploring different avatar appearances to increase perceived security in shared automated vehicles","authors":"Martina Schuß, Luca Pizzoni, Andreas Riener","doi":"10.1007/s12193-024-00436-x","DOIUrl":"https://doi.org/10.1007/s12193-024-00436-x","url":null,"abstract":"<p>Shared Automated Vehicles (SAVs) promise to make automated mobility accessible to a wide range of people while reducing air pollution and improving traffic flow. In the future, these vehicles will operate with no human driver on board, which poses several challenges that might differ depending on the cultural context and make one-fits-all solutions demanding. A promising substitute for the driver could be Digital Companions (DCs), i.e. conversational agents presented on a screen inside the vehicles. We conducted interviews with Colombian participants and workshops with German and Korean participants and derived two design concepts of DCs as an alternative for the human driver on SAVs: a human-like and a robot-like. We compared these two concepts to a baseline without companion using a scenario-based online questionnaire with participants from Colombia (N = 57), Germany (N = 50), and Korea (N = 29) measuring anxiety, security, trust, risk, control, threat, and user experience. In comparison with the baseline, both DCs are statistically significantly perceived as more positively. While we found a preference for the human-like DC among all participants, this preference is higher among Colombians while Koreans show the highest openness towards the robot-like DC.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"11 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142205895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AirWhisper: enhancing virtual reality experience via visual-airflow multimodal feedback
Fangtao Zhao, Ziming Li, Yiming Luo, Yue Li, Hai-Ning Liang
Pub Date: 2024-08-20 · DOI: 10.1007/s12193-024-00438-9
Virtual reality (VR) research has increasingly focused on incorporating multimodal outputs to enhance the sense of immersion and realism. In this work, we developed AirWhisper, a modular wearable device that provides dynamic airflow feedback to enhance VR experiences. AirWhisper simulates wind from multiple directions around the user’s head via four micro fans and 3D-printed attachments. We conducted a Just Noticeable Difference (JND) study to inform the design of the control system and to explore users’ perception of airflow characteristics from different directions. Through multimodal comparison experiments, we found that combined visual-airflow output can improve the user’s VR experience from several perspectives. Finally, we designed scenarios with different airflow change patterns and different levels of interaction to test AirWhisper’s performance in various contexts and to explore differences in users’ perception of airflow under different virtual environment conditions. Our work shows the importance of developing human-centered multimodal feedback models that can adapt in real time to the user’s perceptual characteristics and environmental features.
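The abstract mentions a Just Noticeable Difference study without detailing the procedure; the sketch below shows one common adaptive method for estimating a JND, a 1-up-2-down staircase, applied to a hypothetical airflow-intensity change. The respond callback, step size, and simulated observer are assumptions, not the authors' protocol.

```python
# Hypothetical 1-up-2-down staircase: converges near the 70.7%-correct
# point of the psychometric function, a common JND estimate.
def staircase(respond, start=0.5, step=0.05, reversals_needed=8):
    """respond(delta) -> True if the airflow change was detected."""
    delta, direction = start, -1
    reversals, correct_streak = [], 0
    while len(reversals) < reversals_needed:
        if respond(delta):
            correct_streak += 1
            if correct_streak == 2:            # two correct -> make harder
                correct_streak = 0
                if direction == +1:            # descending after ascending
                    reversals.append(delta)    # -> record a reversal
                direction = -1
                delta = max(delta - step, 0.01)
        else:                                  # one wrong -> make easier
            correct_streak = 0
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step
    return sum(reversals[-6:]) / 6             # average the last reversals

# Usage with a simulated observer whose detection improves with delta:
import random
print(staircase(lambda d: random.random() < min(1.0, d / 0.4)))
```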
{"title":"AirWhisper: enhancing virtual reality experience via visual-airflow multimodal feedback","authors":"Fangtao Zhao, Ziming Li, Yiming Luo, Yue Li, Hai-Ning Liang","doi":"10.1007/s12193-024-00438-9","DOIUrl":"https://doi.org/10.1007/s12193-024-00438-9","url":null,"abstract":"<p>Virtual reality (VR) technology has been increasingly focusing on incorporating multimodal outputs to enhance the sense of immersion and realism. In this work, we developed AirWhisper, a modular wearable device that provides dynamic airflow feedback to enhance VR experiences. AirWhisper simulates wind from multiple directions around the user’s head via four micro fans and 3D-printed attachments. We applied a Just Noticeable Difference study to support the design of the control system and explore the user’s perception of the characteristics of the airflow in different directions. Through multimodal comparison experiments, we find that vision-airflow multimodality output can improve the user’s VR experience from several perspectives. Finally, we designed scenarios with different airflow change patterns and different levels of interaction to test AirWhisper’s performance in various contexts and explore the differences in users’ perception of airflow under different virtual environment conditions. Our work shows the importance of developing human-centered multimodal feedback adaptive learning models that can make real-time dynamic changes based on the user’s perceptual characteristics and environmental features.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"26 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142205945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Truck drivers’ views on the road safety benefits of advanced driver assistance systems and Intelligent Transport Systems in Tanzania
Marwa Chacha, Prosper Nyaki, Ariane Cuenen, Ansar Yasar, Geert Wets
Pub Date: 2024-08-03 · DOI: 10.1007/s12193-024-00437-w
The transportation of goods by road is crucial in Tanzania’s Central and North-South Corridors. However, challenges such as truck congestion and road accidents are affecting the efficiency of these routes. Road crashes are prevalent in low- and middle-income countries, with Africa experiencing an exceptionally high rate. This study examines the opinions of Tanzanian truck drivers on the effectiveness of Advanced Driver Assistance Systems in reducing road safety risks. A discriminant analysis was conducted to assess the awareness and use of these systems among driver groups with different levels of experience. The results highlight the importance of improving infrastructure, ensuring vehicle safety standards, providing comprehensive driver training, and integrating innovative Intelligent Transport Systems to address road safety issues. In conclusion, the study provides valuable insights for policymakers and stakeholders to strengthen road safety measures in Tanzania, facilitating smoother road freight transport operations and promoting economic growth.
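As a hedged illustration of the discriminant analysis mentioned above, the following Python sketch fits a linear discriminant model to fabricated survey scores grouped by driving experience. The feature set, group coding, and data are invented; the study's actual variables are not reproduced here.

```python
# Hypothetical sketch: linear discriminant analysis separating driver
# experience groups by ADAS awareness/usage scores (fabricated data).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))        # 5 survey items, stand-ins only
y = rng.integers(0, 3, 200)          # 3 experience groups, fabricated

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print("group separation (training accuracy):", lda.score(X, y))
print("discriminant weights per item:\n", lda.coef_)
```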
{"title":"Truck drivers’ views on the road safety benefits of advanced driver assistance systems and Intelligent Transport Systems in Tanzania","authors":"Marwa Chacha, Prosper Nyaki, Ariane Cuenen, Ansar Yasar, Geert Wets","doi":"10.1007/s12193-024-00437-w","DOIUrl":"https://doi.org/10.1007/s12193-024-00437-w","url":null,"abstract":"<p>The transportation of goods by road is crucial in Tanzania’s Central and North-South Corridors. However, challenges such as truck congestion and road accidents are affecting the efficiency of these routes. Road crashes are prevalent in low- and middle-income countries, with Africa experiencing an exceptionally high rate. This study examines the opinions of Tanzanian truck drivers on the effectiveness of Advanced Driver Assistance Systems in reducing road safety risks. A discriminant analysis was conducted to assess the awareness and use of these systems among different driver experienced groups. The results highlight the importance of improving infrastructure, ensuring vehicle safety standards, providing comprehensive driver training, and integrating innovative Intelligent Transport Systems to address road safety issues. In conclusion, the study provides valuable insights for policymakers and stakeholders to strengthen road safety measures in Tanzania, facilitating smoother road freight transport operations and promoting economic growth.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"4 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141942192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In-vehicle nudging for increased Adaptive Cruise Control use: a field study
Pär Gustavsson, Mikael Ljung Aust
Pub Date: 2024-06-26 · DOI: 10.1007/s12193-024-00434-z
Close following of lead vehicles is associated with an increased risk of rear-end crashes in road traffic. One way to reduce instances of close following is through increased use of the Advanced Driver Assistance System (ADAS) Adaptive Cruise Control (ACC), which is designed to adjust vehicle speed to maintain a safe time headway. Since the activation of ACC is driver-initiated, there is a need to influence drivers’ propensity to use the function. This research explored whether in-vehicle nudging interventions could be effective for this purpose. A field trial was conducted to consecutively assess the effects of two nudges on drivers’ utilization of ACC, compared to baseline usage. Exposing the participants (n = 49) to the first, ambient design nudge resulted in a 46% average increase in ACC usage. Following the introduction of the second nudge (a competitive leaderboard), the average increase among participants (n = 48) over the complete treatment period reached 61%. Changes in ACC utilization varied between individual drivers, highlighting the need to monitor the behavioral outcomes of nudges and to adapt them when needed. In conclusion, this research shows that in-vehicle nudging is a promising approach to increasing the use of vehicle functions that contribute to improved traffic safety.
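The usage figures above imply a simple outcome metric; the sketch below shows one plausible way to compute the relative increase in ACC usage share from logged driving time. The function name, log format, and numbers are illustrative assumptions, not the study's actual pipeline.

```python
# Hypothetical outcome metric: ACC usage as a share of driven time,
# and the relative change between baseline and treatment periods.
def acc_usage_share(acc_seconds: float, driven_seconds: float) -> float:
    return acc_seconds / driven_seconds

baseline = acc_usage_share(acc_seconds=1800, driven_seconds=7200)   # 25.0%
treatment = acc_usage_share(acc_seconds=2628, driven_seconds=7200)  # 36.5%
increase = (treatment - baseline) / baseline
print(f"relative increase in ACC usage: {increase:.0%}")            # 46%
```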
{"title":"In-vehicle nudging for increased Adaptive Cruise Control use: a field study","authors":"Pär Gustavsson, Mikael Ljung Aust","doi":"10.1007/s12193-024-00434-z","DOIUrl":"https://doi.org/10.1007/s12193-024-00434-z","url":null,"abstract":"<p>Close following to lead vehicles is associated with increased risk of rear-end crashes in road traffic. One way to reduce instances of close following is through increased use of the Advanced Driver Assistance System (ADAS) Adaptive Cruise Control (ACC), which is designed to adjust vehicle speed to maintain a safe time headway. Since the activation of ACC is driver-initiated, there is a need to influence the propensity of drivers to use the function. This research aimed to explore whether in-vehicle nudging interventions could be effective for this purpose. A field trial was conducted to consecutively assess the effects of two nudges on drivers’ utilization of ACC, compared to baseline usage. Exposing the participants (<i>n</i> = 49) to the first ambient design nudge resulted in a 46% increase in ACC usage on average. Following the introduction of the second nudge (a competitive leaderboard nudge), the average increase among participants (<i>n</i> = 48) during the complete treatment period reached 61%. The changes in ACC utilization varied between individual drivers, highlighting the need to monitor behavioral outcomes of nudges and adapt them when needed. In conclusion, this research shows that utilizing in-vehicle nudging is a promising approach to increase the use of vehicle functions contributing to improved traffic safety.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"37 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prediction of pedestrian crossing behaviour at unsignalized intersections using machine learning algorithms: analysis and comparison
Dungar Singh, Pritikana Das, Indrajit Ghosh
Pub Date: 2024-05-28 · DOI: 10.1007/s12193-024-00433-0
The primary safety hazard at unsignalized intersections, particularly in urban areas, is pedestrian-vehicle collisions. Because it is complex and often inattentive, pedestrian crossing behaviour has a significant impact on pedestrian safety. This study introduces a novel framework to enhance pedestrian safety at unsignalized intersections by developing a predictive model of pedestrian crossing behaviour using machine learning algorithms. With crossing behaviour as the dependent variable and a set of independent variables, the analysis prioritises accuracy and internal validity. Feature importance scores were assessed for the different algorithms. The model results revealed that whether the pedestrian or the vehicle arrives first, pedestrian delay, vehicle speed, pedestrian speed, age, gender, traffic hour, and vehicle category are highly influential variables for analysing pedestrian crossing behaviour at unsignalized intersections. The study found that predictions based on random forest, extreme gradient boosting, and a binary logit model achieved accuracies of 81.72%, 77.19%, and 74.95%, respectively. Other algorithms, including k-nearest neighbours, artificial neural networks, and support vector machines, showed varying classification performance. The findings of this study may be used to support infrastructure-to-vehicle interactions, enabling vehicles to successfully negotiate rolling pedestrian behaviour and improving pedestrian safety.
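As a rough sketch of the model comparison reported above, the following Python example cross-validates a random forest, a gradient boosting classifier, and a logistic regression (binary logit) on fabricated crossing data. Features, tuning, and data are stand-ins; scikit-learn's GradientBoostingClassifier is used here in place of the paper's extreme gradient boosting implementation.

```python
# Hypothetical model comparison on fabricated pedestrian-crossing data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# stand-ins for arrival order, delay, speeds, age, gender, hour, etc.
X, y = make_classification(n_samples=600, n_features=8, random_state=0)

models = {
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "binary logit": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {acc:.2%}")
```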
{"title":"Prediction of pedestrian crossing behaviour at unsignalized intersections using machine learning algorithms: analysis and comparison","authors":"Dungar Singh, Pritikana Das, Indrajit Ghosh","doi":"10.1007/s12193-024-00433-0","DOIUrl":"https://doi.org/10.1007/s12193-024-00433-0","url":null,"abstract":"<p>The primary safety hazard at unsignalized intersections, particularly in urban areas, is pedestrian-vehicle collisions. Due to its complexity and inattention, pedestrian crossing behaviour has a significant impact on their safety. This study introduces a novel framework to enhance pedestrian safety at unsignalized intersections by developing a predictive model of pedestrian crossing behaviour using machine learning algorithms. While accounting for crossing behaviour as the dependent variable and other independent variables, the analysis prioritises accuracy and internal validity. Important feature scores for the different algorithms were assessed. The model results revealed that the arrival first of a pedestrian or vehicle, pedestrian delay, vehicle speed, pedestrian speed, age, gender, traffic hour, and vehicle category are highly influencing variables for analysing pedestrian behaviour while crossing at unsignalized intersections. This study found that the prediction of pedestrian behaviour based on random forest, extreme gradient boosting and binary logit model achieved 81.72%, 77.19% and 74.95%, respectively. Algorithms, including k-nearest neighbours, artificial neural networks, and support vector machines, have varying classification performance at every step. The findings of this study may be used to support infrastructure-to-vehicle interactions, enabling vehicles to successfully negotiate rolling pedestrian behaviour and improving pedestrian safety.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"63 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141167667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards the development of an automated robotic storyteller: comparing approaches for emotional story annotation for non-verbal expression via body language
Sophia C. Steinhaeusser, Albin Zehe, Peggy Schnetter, Andreas Hotho, Birgit Lugrin
Pub Date: 2024-04-04 · DOI: 10.1007/s12193-024-00429-w
Storytelling is a long-established tradition and listening to stories remains a popular leisure activity. Driven by technological advances, storytelling media are expanding, e.g., to social robots acting as multimodal storytellers that use behaviours such as facial expressions or body postures. With the overarching goal of automating robotic storytelling, we have been annotating stories with emotion labels which the robot can use to automatically adapt its behavior. In this paper, three annotation approaches are compared across two studies: (1) manual labels by human annotators (MA), (2) software-based word-sensitive annotation using the Linguistic Inquiry and Word Count program (LIWC), and (3) a machine-learning-based approach (ML). In an online study showing videos of a storytelling robot, the annotations were validated, with LIWC and MA achieving the best, and ML the worst, results. In a laboratory user study, the three versions of the story were compared regarding transportation and cognitive absorption, revealing no significant differences but a positive trend towards MA. On this empirical basis, the Automated Robotic Storyteller was implemented using manual annotations. Future iterations should include other robots and modalities, fewer emotion labels, and their probabilities.
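The abstract compares annotation approaches against manual labels without giving the evaluation code; the sketch below illustrates one plausible way to score such a comparison, using accuracy and Cohen's kappa on a fabricated set of emotion labels. The label set and predictions are invented for illustration.

```python
# Hypothetical sketch: agreement of two automatic emotion-annotation
# approaches with manual labels, scored by accuracy and Cohen's kappa.
from sklearn.metrics import accuracy_score, cohen_kappa_score

manual   = ["joy", "fear", "joy", "sadness", "anger", "joy", "fear"]
lexicon  = ["joy", "fear", "neutral", "sadness", "anger", "joy", "joy"]
ml_model = ["joy", "sadness", "neutral", "sadness", "joy", "joy", "fear"]

for name, pred in [("LIWC-style lexicon", lexicon), ("ML model", ml_model)]:
    print(name,
          "accuracy:", round(accuracy_score(manual, pred), 2),
          "kappa:", round(cohen_kappa_score(manual, pred), 2))
```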
{"title":"Towards the development of an automated robotic storyteller: comparing approaches for emotional story annotation for non-verbal expression via body language","authors":"Sophia C. Steinhaeusser, Albin Zehe, Peggy Schnetter, Andreas Hotho, Birgit Lugrin","doi":"10.1007/s12193-024-00429-w","DOIUrl":"https://doi.org/10.1007/s12193-024-00429-w","url":null,"abstract":"<p>Storytelling is a long-established tradition and listening to stories is still a popular leisure activity. Caused by technization, storytelling media expands, e.g., to social robots acting as multi-modal storytellers, using different multimodal behaviours such as facial expressions or body postures. With the overarching goal to automate robotic storytelling, we have been annotating stories with emotion labels which the robot can use to automatically adapt its behavior. With it, three different approaches are compared in two studies in this paper: 1) manual labels by human annotators (MA), 2) software-based word-sensitive annotation using the Linguistic Inquiry and Word Count program (LIWC), and 3) a machine learning based approach (ML). In an online study showing videos of a storytelling robot, the annotations were validated, with LIWC and MA achieving the best, and ML the worst results. In a laboratory user study, the three versions of the story were compared regarding transportation and cognitive absorption, revealing no significant differences but a positive trend towards MA. On this empirical basis, the <i>Automated Robotic Storyteller</i> was implemented using manual annotations. Future iterations should include other robots and modalities, fewer emotion labels and their probabilities.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"34 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Review of substitutive assistive tools and technologies for people with visual impairments: recent advancements and prospects
Zahra J. Muhsin, Rami Qahwaji, Faruque Ghanchi, Majid Al-Taee
Pub Date: 2023-12-19 · DOI: 10.1007/s12193-023-00427-4
The development of tools and technologies for people with visual impairment has become a major priority in assistive technology research. However, many of these advancements have limitations in the human aspects of the user experience (e.g., usability, learnability, and time to user adaptation), as well as difficulties in translating research prototypes into products. Moreover, there has been no clear distinction between assistive aids for adults and for children, or between “partial impairment” and “total blindness”. As a result of these limitations, the produced aids have not gained much popularity and the intended users remain hesitant to use them. This paper presents a comprehensive review of substitutive interventions that aid adaptation to vision loss, centred on laboratory studies that assess user-system interaction and system validation. Depending on the primary cueing feedback signal offered to the user, these aids are categorized as visual, haptic, or auditory. The context of use, cueing feedback signals, and the participation of visually impaired people in the evaluations are all considered in discussing these aids. Based on the findings, a set of recommendations is proposed to help the scientific community address the persisting challenges and restrictions faced by both totally blind and partially sighted people.
{"title":"Review of substitutive assistive tools and technologies for people with visual impairments: recent advancements and prospects","authors":"Zahra J. Muhsin, Rami Qahwaji, Faruque Ghanchi, Majid Al-Taee","doi":"10.1007/s12193-023-00427-4","DOIUrl":"https://doi.org/10.1007/s12193-023-00427-4","url":null,"abstract":"<p>The development of many tools and technologies for people with visual impairment has become a major priority in the field of assistive technology research. However, many of these technology advancements have limitations in terms of the human aspects of the user experience (e.g., usability, learnability, and time to user adaptation) as well as difficulties in translating research prototypes into production. Also, there was no clear distinction between the assistive aids of adults and children, as well as between “partial impairment” and “total blindness”. As a result of these limitations, the produced aids have not gained much popularity and the intended users are still hesitant to utilise them. This paper presents a comprehensive review of substitutive interventions that aid in adapting to vision loss, centred on laboratory research studies to assess user-system interaction and system validation. Depending on the primary cueing feedback signal offered to the user, these technology aids are categorized as visual, haptics, or auditory-based aids. The context of use, cueing feedback signals, and participation of visually impaired people in the evaluation are all considered while discussing these aids. Based on the findings, a set of recommendations is suggested to assist the scientific community in addressing persisting challenges and restrictions faced by both the totally blind and partially sighted people.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"234 1 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2023-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138742323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmented reality and deep learning based system for assisting assembly process
Pub Date: 2023-12-14 · DOI: 10.1007/s12193-023-00428-3
In Industry 4.0, manufacturing entails rapidly changing customer demands, which leads to mass customization. The variation in customer requirements leads to small batch sizes and several process variations. The assembly task is one of the most important steps in any manufacturing process. Factory floor workers often need a guidance system to assist them with assembly tasks, due to variations in the product or process. Existing Augmented Reality (AR) based systems use markers on each assembly component for detection, which is time-consuming and laborious. This paper proposes a state-of-the-art deep-learning-based object detection technique and employs a regression-based mapping technique to obtain the 3D locations of assembly components. Automatic detection of machine parts is followed by a multimodal interface involving both eye gaze and hand tracking to guide the manual assembly process. We propose an eye cursor to guide the user through the task and use fingertip distances along with object sizes to detect any error committed during the task. We analyzed the proposed mapping method and found a mean mapping error of 1.842 cm. We also investigated the effectiveness of the proposed multimodal user interface in two user studies. The first study indicated that the interface design with the eye cursor enabled participants to perform the task significantly faster than the interface without it. The shop floor workers in the second user study reported that the proposed guidance system was comprehensible and easy to use for completing the assembly task. Results showed that the proposed guidance system enabled 11 end users to finish the assembly of one pneumatic cylinder within 55 s, with an average TLX score below 25 on a scale of 100 and a Cronbach’s alpha of 0.8, indicating convergence of the learning experience.
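The regression-based 2D-to-3D mapping is not specified in detail above; the following sketch shows one plausible realization, regressing detected bounding-box centres onto known 3D positions from a hypothetical calibration session using a polynomial ridge model. The calibration data and model choice are assumptions, not the paper's implementation.

```python
# Hypothetical 2D-to-3D mapping: pixel centres of detected parts are
# regressed onto measured 3D workbench positions (fabricated data).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Calibration pairs: (u, v) pixel centres -> (x, y, z) positions in cm.
uv = np.array([[120, 80], [400, 90], [640, 300], [200, 420], [520, 460]])
xyz = np.array([[5, 2, 30], [20, 2, 31], [33, 14, 33], [9, 22, 34], [27, 24, 35]])

mapper = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0))
mapper.fit(uv, xyz)

pred = mapper.predict(uv)
mean_err = np.linalg.norm(pred - xyz, axis=1).mean()
print(f"mean mapping error: {mean_err:.3f} cm")  # cf. 1.842 cm in the paper
```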
{"title":"Augmented reality and deep learning based system for assisting assembly process","authors":"","doi":"10.1007/s12193-023-00428-3","DOIUrl":"https://doi.org/10.1007/s12193-023-00428-3","url":null,"abstract":"<h3>Abstract</h3> <p>In Industry 4.0, manufacturing entails a rapid change in customer demands which leads to mass customization. The variation in customer requirements leads to small batch sizes and several process variations. Assembly task is one of most important steps in any manufacturing process. A factory floor worker often needs a guidance system due to variations in product or process, to assist them in assembly task. Existing Augmented Reality (AR) based systems use markers for each assembly component for detection which is time consuming and laborious. This paper proposed utilizing state-of-the-art deep learning based object detection technique and employed a regression based mapping technique to obtain the 3D locations of assembly components. Automatic detection of machine parts was followed by a multimodal interface involving both eye gaze and hand tracking to guide the manual assembly process. We proposed eye cursor to guide the user through the task and utilized fingertip distances along with object sizes to detect any error committed during the task. We analyzed the proposed mapping method and found that the mean mapping error was 1.842 cm. We also investigated the effectiveness of the proposed multimodal user interface by conducting two user studies. The first study indicated that the current interface design with eye cursor enabled participants to perform the task significantly faster compared to the interface without eye cursor. The shop floor workers during the second user study reported that the proposed guidance system was comprehendible and easy to use to complete the assembly task. Results showed that the proposed guidance system enabled 11 end users to finish the assembly of one pneumatic cylinder within 55 s with average TLX score less than 25 in a scale of 100 and Cronbach alpha score of 0.8 indicating convergence of learning experience.</p>","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"9 1","pages":""},"PeriodicalIF":2.9,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138684924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modelling the “transactive memory system” in multimodal multiparty interactions
Beatrice Biancardi, Maurizio Mancini, Brian Ravenet, Giovanna Varni
Pub Date: 2023-11-11 · DOI: 10.1007/s12193-023-00426-5
A transactive memory system (TMS) is a team emergent state representing each member’s knowledge about “who knows what” in a team performing a joint task. We present a study showing how the three TMS dimensions (Credibility, Specialisation, and Coordination) can be modelled as a linear combination of the nonverbal multimodal features displayed by the team performing the joint task. Results indicate that, to some extent, the three dimensions of TMS can be expressed as a linear combination of nonverbal multimodal features. Moreover, the higher the number of modalities (audio, movement, spatial), the better the modelling. These results could inform future human-centered computing applications that automatically estimate TMS from teams’ behavioural patterns in order to provide feedback and support team interactions.
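As a minimal sketch of the modelling idea described above, the following Python example fits one TMS dimension as a linear combination of three fabricated nonverbal features. The feature names, data, and weights are invented; the study's actual features and fitting procedure are not reproduced.

```python
# Hypothetical sketch: one TMS dimension (Credibility) as a linear
# combination of nonverbal features from the audio/movement/spatial
# modalities (all data fabricated for illustration).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
# columns: e.g., speech overlap, head-movement energy, interpersonal distance
features = rng.normal(size=(40, 3))
credibility = features @ np.array([0.6, 0.3, -0.2]) + rng.normal(0, 0.1, 40)

model = LinearRegression().fit(features, credibility)
print("weights:", model.coef_.round(2),
      "R^2:", round(model.score(features, credibility), 2))
```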
{"title":"Modelling the “transactive memory system” in multimodal multiparty interactions","authors":"Beatrice Biancardi, Maurizio Mancini, Brian Ravenet, Giovanna Varni","doi":"10.1007/s12193-023-00426-5","DOIUrl":"https://doi.org/10.1007/s12193-023-00426-5","url":null,"abstract":"Abstract Transactive memory system (TMS) is a team emergent state representing the knowledge of each member about “who knows what” in a team performing a joint task. We present a study to show how the three TMS dimensions Credibility, Specialisation, Coordination, can be modelled as a linear combination of the nonverbal multimodal features displayed by the team performing the joint task. Results indicate that, to some extent, the three dimensions of TMS can be expressed as a linear combination of nonverbal multimodal features. Moreover, the higher the number of modalities (audio, movement, spatial), the better the modelling. Results could be used in future work to design human-centered computing applications able to automatically estimate TMS from teams’ behavioural patterns, to provide feedback and help teams’ interactions.","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"9 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135042028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Model-based sonification based on the impulse pattern formulation
Simon Linke, Rolf Bader, Robert Mores
Pub Date: 2023-11-06 · DOI: 10.1007/s12193-023-00423-8
The most common strategy for interactive sonification is parameter mapping sonification, where sensed or defined data is pre-processed and then used to control one or more variables in a signal processing chain. A well-known but rarely used alternative is model-based sonification, where data is fed into a physical or conceptual model that generates or modifies sound. In this paper, we suggest the Impulse Pattern Formulation (IPF) as a model-based sonification strategy. The IPF can model natural systems and interactions, such as the sound production of musical instruments, the reverberation in rooms, and human synchronization to a rhythm. Hence, the IPF has the potential to be easy to interpret and intuitive to interact with. Experimental results show that the IPF is able to produce an intuitively interpretable, natural zero, i.e., a coordinate origin. Coordinate origins are necessary to sonify both polarities of a dimension as well as absolute magnitudes.
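A minimal sketch of an IPF iteration follows, assuming the one-reflection form g_{k+1} = g_k - ln(alpha * g_k) reported in the IPF literature; the parameter values and convergence comment are illustrative, not the paper's implementation.

```python
# Minimal sketch of an Impulse Pattern Formulation iteration, assuming
# the one-reflection form g_{k+1} = g_k - ln(alpha * g_k); for suitable
# alpha the series settles on a steady state, which could serve as the
# "natural zero" (coordinate origin) of the sonification.
import math

def ipf(alpha: float, g0: float = 1.0, steps: int = 20):
    g = g0
    series = [g]
    for _ in range(steps):
        g = g - math.log(alpha * g)   # system state after one reflection
        series.append(g)
    return series

# Illustrative run: converges towards g* = 1/alpha = 1.25 for alpha = 0.8.
print([round(v, 3) for v in ipf(alpha=0.8)])
```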
{"title":"Model-based sonification based on the impulse pattern formulation","authors":"Simon Linke, Rolf Bader, Robert Mores","doi":"10.1007/s12193-023-00423-8","DOIUrl":"https://doi.org/10.1007/s12193-023-00423-8","url":null,"abstract":"Abstract The most common strategy for interactive sonification is parameter mapping sonification, where sensed or defined data is pre-processed and then used to control one or more variables in a signal processing chain. A well-known but rarely used alternative is model-based sonification, where data is fed into a physical or conceptual model that generates or modifies sound. In this paper, we suggest the Impulse Pattern Formulation (IPF) as a model-based sonification strategy. The IPF can model natural systems and interactions, like the sound production of musical instruments, the reverberation in rooms, and human synchronization to a rhythm. Hence, the IPF has the potential to be easy to interpret and intuitive to interact with. Experiment results show that the IPF is able to produce an intuitively interpretable, natural zero, i.e., a coordinate origin. Coordinate origins are necessary to sonify both polarities of a dimension as well as absolute magnitudes.","PeriodicalId":17529,"journal":{"name":"Journal on Multimodal User Interfaces","volume":"310 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135679078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}