Traditional tools used to evaluate the Quality of Experience (QoE) of users after browsing an ad, using a product, or performing any kind of task typically involve surveys, user testing, and analytics. However, these methods provide limited insights and suffer from the need for users' active cooperation and sincerity, long testing times, high cost, and limited scalability. In this work we present the tools we are developing to automatically evaluate QoE in different use cases, such as dashboards that show, in real time, reactions to different events in the form of emotions and affective states predicted by models based on physiological data. To develop these tools, we require affective computing datasets. We highlight some limitations of the available ones, the difficulties involved in creating such data, and our current work on building a new dataset with automatic ground-truth annotation.
{"title":"Towards the Creation of Scalable Tools for automatic Quality of Experience Evaluation and a Multi-Purpose Dataset for Affective Computing","authors":"Juan Antonio De Rus Arance, M. Montagud, M. Cobos","doi":"10.1145/3573381.3596468","DOIUrl":"https://doi.org/10.1145/3573381.3596468","url":null,"abstract":"Traditional tools used to evaluate the Quality of Experience (QoE) of users after browsing an ad, using a product, or performing any kind of task typically involves surveys, user testing, and analytics. However, these methods provide limited insights and have limitations due to the need of users’ active cooperation and sincerity, the long testing time, the high cost, and the limited scalability. On this work we present the tools we are developing to automatically evaluate QoE in different use cases such as dashboards that show on real time reactions to different events in the form of emotions and affections predicted by different models based on physiological data. To develop these tools, we require datasets on affective computing. We highlight some limitations of the available ones, the difficulties during the creation of such data, and our current work in the confection of a new one with automatic annotation of ground truth.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133314391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper contains the research proposal of Juan Antonio De Rus presented at the IMX '23 Doctoral Symposium. Virtual Reality (VR) applications are already used to support diverse tasks such as online meetings, education, or training, and their usage grows every year. To enrich the experience, VR scenarios typically include multimodal content (video, audio, text, synthetic content) and multi-sensory stimuli. Tools to evaluate the Quality of Experience (QoE) of such scenarios are needed. Traditional tools used to evaluate the QoE of users performing any kind of task typically involve surveys, user testing, or analytics. However, these methods provide limited insights for our tasks with VR, have shortcomings, and scale poorly. In this doctoral study we have formulated a set of open research questions and objectives on which we plan to generate contributions and knowledge in the field of Affective Computing (AC) and multimodal interactive virtual environments. Hence, in this paper we present a set of tools we are developing to automatically evaluate QoE in different use cases. They include dashboards to monitor, in real time, reactions to different events in the form of emotions and affective states predicted by models based on physiological data, as well as the creation of a dataset for AC and its associated methodology.
{"title":"Towards the Creation of Tools for Automatic Quality of Experience Evaluation with Focus on Interactive Virtual Environments","authors":"Juan Antonio De Rus Arance, M. Montagud, M. Cobos","doi":"10.1145/3573381.3596508","DOIUrl":"https://doi.org/10.1145/3573381.3596508","url":null,"abstract":"This paper contains the research proposal of Juan Antonio De Rus presented at the IMX 23 Doctoral Symposium. Virtual Reality (VR) applications are already used to support diverse tasks such as online meetings, education, or training, and the usages grow every year. To enrich the experience VR scenarios, include multimodal content (video, audio, text, synthetic content) and multi-sensory stimuli are typically included. Tools to evaluate the Quality of Experience (QoE) of such scenarios are needed. Traditional tools used to evaluate the QoE of users performing any kind of task typically involves surveys, user testing or analytics. However, these methods provide limited insights for our tasks with VR and have shortcomings and a limited scalability. In this doctoral study we have formulated a set of open research questions and objectives on which we plan to generate contributions and knowledge in the field of Affective Computing (AC) and Multimodal Interactive Virtual Environments. Hence, in this paper we present a set of tools we are developing to automatically evaluate QoE in different use cases. They include dashboards to monitor in real time reactions to different events in the form of emotions and affections predicted by different models based on physiological data, as well as the creation of a dataset for AC and its associated methodology.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"168 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133519284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video streaming is growing exponentially. High-resolution videos require high bandwidth to transport over the network, so there is great demand for compression technologies that reduce video size while maintaining quality. Video codecs, developed by organizations such as MPEG, Google, Microsoft, and Apple, are used to encode and decode video streams. The goal of this research is to develop a technology for contribution transmission that connects the latest generation of single-way and multi-way video encoding methods with new protocols that provide transmission reliability while keeping latency low. A literature review will be carried out, covering different video codecs and transmission techniques and the methods used to evaluate their quality. Based on the review, a theoretical framework will be formed and a video encoding prototype will be developed: one-way and multi-way software that automatically optimizes the settings of the video codec so that it operates under optimal conditions. The new transmission software will use newer codecs, such as H.265/HEVC, VP9, AV1, and MPEG-5, which allow additional reduction of the bit stream and deliver secure, reliable, high-quality video with low latency.
{"title":"Optimization and Evaluation of Emerging Codecs","authors":"Syed Uddin, M. Leszczuk, M. Grega","doi":"10.1145/3573381.3596504","DOIUrl":"https://doi.org/10.1145/3573381.3596504","url":null,"abstract":"Video streaming is growing exponentially. High-resolution videos require high bandwidth to transport the videos over the network. There is a great demand for compression technologies to compress video and maintain quality. Video codecs are used to encode and decode video streams. These codecs have been developed by MPEG, Google, Microsoft, and Apple Inc. The goal of this research is to develop a technology that will realize contribution transmission through connecting the latest methods generation of single and multi-way video encoding with the new protocols that will provide transmission reliability and keep low latency. A literature review will be carried out. The literature covers different video codecs and transmission techniques and the methods used to evaluate the quality of those techniques and codecs. Based on the literature review, the theoretical framework will be formed, and a video encoding method prototype will be developed. The developed method will be for one-way and multi-way software that will automatically optimize the settings of the video codec to set its operating conditions at optimal. The new transmission software will use newer codecs, such as H265/HEVC, VP9, and AV1, MPEG5, which will allow additional reduction of the bit stream and deliver secure, reliable, and quality video with low latency.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"416 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114739460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study explores the potential of enhancing interaction experiences, such as virtual reality (VR) games, through the use of computational attention models. Our proposed approach uses a saliency map generated by attention models to dynamically adjust game difficulty levels and to help in game level design, resulting in a more immersive and engaging experience for users. To inform the development of this approach, we present an experimental setup that collects data in a VR environment and is intended to validate the adaptation of attention models to this domain. Through this work, we aim to create a framework for VR game design that leverages attention models to offer a new level of immersion and engagement for users. We believe our contributions have significant potential to enhance VR experiences and advance the field of game design.
{"title":"Enhancing VR Gaming Experience using Computational Attention Models and Eye-Tracking","authors":"E. Ennadifi, T. Ravet, M. Mancas, Mohammed El Amine Mokhtari, B. Gosselin","doi":"10.1145/3573381.3597218","DOIUrl":"https://doi.org/10.1145/3573381.3597218","url":null,"abstract":"This study explores the potential of enhancing interaction experiences, such as virtual reality (VR) games, through the use of computational attention models. Our proposed approach utilizes a saliency map generated by attention models to dynamically adjust game difficulty levels and to help in the game level design, resulting in a more immersive and engaging experience for users. To inform the development of this approach, we present an experimental setup that is able tp collect data in a VR environment and intends to be able to validate the adaptation of attention models to this domain. Through this work, we aim to create a framework for VR game design that leverages attention models to offer a new level of immersion and engagement for users. We believe our contributions have significant potential to enhance VR experiences and advance the field of game design.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124301034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtual Reality Theatre continues to grow as a form of digital creative output for theatre practitioners. However, the common challenges practitioners face when developing productions, and the barriers to entry created by the complexity of the platforms, remain under-researched. This paper provides an in-depth analysis of several challenges identified through semi-structured interviews with practitioners and a thematic review.
{"title":"Identifying the Developmental Challenges of Creating Virtual Reality Theatre: This paper reports on the identified challenges of creating virtual reality theatre from the practitioner's perspective.","authors":"Daniel Lock, B. Kirman","doi":"10.1145/3573381.3603359","DOIUrl":"https://doi.org/10.1145/3573381.3603359","url":null,"abstract":"Virtual Reality Theatre continues to grow as a form of digital creative output for theatre practitioners. However, understanding the common challenges faced during the development of productions by practitioners and the barriers to entry produced by the complexity of platforms is under-researched. This paper provides an in-depth analysis of several challenges identified through semi-structured interviews of practitioners and a thematic review.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121858795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
2D cameras are often used in interactive systems. Other systems, such as gaming consoles, provide more powerful 3D cameras for short-range depth sensing. Overall, these cameras are not reliable in large, complex environments. In this work, we propose a 3D stereo-vision-based pipeline for interactive systems that can handle both ordinary and sensitive applications through robust scene understanding. We explore the fusion of multiple 3D cameras to perform full scene reconstruction, which enables a wide range of tasks such as event recognition, subject tracking, and notification. Through feedback mechanisms, the system can receive data from the subjects present in the environment to learn to make better decisions or to adapt to completely new environments. Throughout the paper, we introduce the pipeline and explain our preliminary experimentation and results. Finally, we draw a roadmap for the next steps needed to bring this pipeline into production.
{"title":"Feedback Driven Multi Stereo Vision System for Real-Time Event Analysis","authors":"Mohamed Benkedadra, M. Mancas, S. Mahmoudi","doi":"10.1145/3573381.3597220","DOIUrl":"https://doi.org/10.1145/3573381.3597220","url":null,"abstract":"2D cameras are often used in interactive systems. Other systems like gaming consoles provide more powerful 3D cameras for short range depth sensing. Overall, these cameras are not reliable in large, complex environments. In this work, we propose a 3D stereo vision based pipeline for interactive systems, that is able to handle both ordinary and sensitive applications, through robust scene understanding. We explore the fusion of multiple 3D cameras to do full scene reconstruction, which allows for preforming a wide range of tasks, like event recognition, subject tracking, and notification. Using possible feedback approaches, the system can receive data from the subjects present in the environment, to learn to make better decisions, or to adapt to completely new environments. Throughout the paper, we introduce the pipeline and explain our preliminary experimentation and results. Finally, we draw the roadmap for the next steps that need to be taken, in order to get this pipeline into production.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129394615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Digital tools offer extensive solutions for exploring novel interactive-art paradigms, relying on various sensors to create installations and performances where human activity can be captured, analysed and used to generate visual and sound universes in real time. Deep learning approaches, including human detection and human pose estimation, constitute ideal human-art interaction mediums, as they allow automatic analysis of human gestures that can be used directly to produce the interactive piece of art. In this context, this paper presents an interactive work of art that explores the relationship between thought and movement by combining dance, philosophy, numerical arts, and deep learning. We present a novel system that combines a multi-camera setup to capture human movement, state-of-the-art human pose estimation models to automatically analyze this movement, and an immersive 180° projection system that projects dynamic textual content responding intuitively to the users' behaviors. The proposed demonstration consists of two parts. First, a professional dancer will use the setup to deliver a conference-show. Second, the audience will be given the opportunity to experiment with and discover the potential of the setup, which has been transformed into an interactive installation allowing multiple spectators to engage simultaneously with clusters of words and letters extracted from the conference text.
{"title":"Kinetic particles : from human pose estimation to an immersive and interactive piece of art questionning thought-movement relationships.","authors":"Mickael Lafontaine, Julie Cloarec-Michaud, Kévin Riou, Yujie Huang, Kaiwen Dong, P. Le Callet","doi":"10.1145/3573381.3597228","DOIUrl":"https://doi.org/10.1145/3573381.3597228","url":null,"abstract":"Digital tools offer extensive solutions to explore novel interactive-art paradigms, by relying on various sensors to create installations and performances where the human activity can be captured, analysed and used to generate visual and sound universes in real-time. Deep learning approaches, including human detection and human pose estimation, constitute ideal human-art interaction mediums, as they allow automatic human gesture analysis, which can be directly used to produce the interactive piece of art. In this context, this paper presents an interactive work of art that explores the relationship between thought and movement by combining dance, philosophy, numerical arts, and deep learning. We present a novel system that combines a multi-camera setup to capture human movement, state-of-the-art human pose estimation models to automatically analyze this movement, and an immersive 180° projection system that projects a dynamic textual content that intuitively responds to the users’ behaviors. The demonstration being proposed consists of two parts. Firstly, a professional dancer will utilize the proposed setup to deliver a conference-show. Secondly, the audience will be given the opportunity to experiment and discover the potential of the proposed setup, which has been transformed into an interactive installation. This allows multiple spectators to engage simultaneously with clusters of words and letters extracted from the conference text.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":" 61","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114060660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contemporary performance artists use Virtual Reality (VR) tools to create immersive narratives and extend the boundaries of traditional performance mediums. As the medium evolves, performance practice is changing with it. Our work explores ways to leverage VR to support the creative process by introducing the Virtual Rehearsal Suite (VRS), which gives users the experience of a large-scale rehearsal or performance environment while occupying limited physical space and causing minimal real-world obstructions. In this paper, we discuss findings from scene study experiments conducted within the VRS. In addition, we contribute our thresholding protocols, a framework designed to support user transitions into and out of VR experiences. Our integrated approach to digital performance practice and creative collaboration combines traditional and contemporary acting techniques with HCI research to harness the innovative capabilities of virtual reality technologies, creating accessible, immersive experiences for actors while facilitating user presence through state-change protocols.
{"title":"Virtual Rehearsal Suite: An Environment and Framework for Virtual Performance Practice","authors":"Zachary Mckendrick, Lior Somin, Patrick Finn, E. Sharlin","doi":"10.1145/3573381.3596158","DOIUrl":"https://doi.org/10.1145/3573381.3596158","url":null,"abstract":"Contemporary performance artists use Virtual Reality (VR) tools to create immersive narratives and extend the boundaries of traditional performance mediums. As the medium evolves, performance practice is changing with it. Our work explores ways to leverage VR to support the creative process by introducing the Virtual Rehearsal Suite (VRS) that provides users with the experience of a large-scale rehearsal or performance environment while occupying limited physical space and minimal real-world obstructions. In this paper, we discuss findings from scene study experiments conducted within the VRS. In addition, we contribute our thresholding protocols a framework designed to support user transitions into and out of VR experiences. Our integrated approach to digital performance practice and creative collaboration combines traditional and contemporary acting techniques with HCI research to harness the innovative capabilities of virtual reality technologies creating accessible, immersive experiences for actors while facilitating user presence through state change protocols.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124294206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents the progress of the PRIM project, which aims to give non-computer experts the power to create digital and interactive scenarios. For this purpose, we explored the strengths and limitations of visual programming languages and mulsemedia editors chosen for their ease of use. The results led to a list of criteria that our final solution should meet, named the ICAMUS criteria: Interface, Combinatorics, Affordance, Modularity, Ubiquity, and Synoptic. This paper proposes a scale based on the ICAMUS criteria for assessing Interactive and Multisensory Authoring Tools (the IMAT scale). Finally, this paper discusses how to compute a score based on three metrics (presence/absence of elements, number of clicks needed to perform an action, and time needed to perform it) and the visual representation of this score, which should give a complete profile of the tool. We hypothesize that this scale will highlight the complementarities of visual programming languages and mulsemedia editors, as well as the challenges to be faced.
{"title":"ICAMUS: Evaluation Criteria of an Interactive Multisensory Authoring Tool","authors":"C. Jost, Justin Debloos, Brigitte Le Pévédic, G. Uzan","doi":"10.1145/3573381.3596472","DOIUrl":"https://doi.org/10.1145/3573381.3596472","url":null,"abstract":"This paper presents the advancement of the PRIM project that aims at giving the power to non-computer experts to create digital and interactive scenarios. For this purpose, we explored the strengths and limitations of the visual programming languages and the mulsemedia editors chosen for their ease of use. Results led to a criteria list that our final solution should meet, named the ICAMUS criteria for Interface, Combinatorics, Affordance, Modularity, Ubiquity, and Synoptic. This paper proposes a scale based on the ICAMUS criteria that may assess the Interactive and Multisensory Authoring Tools (IMAT scale). Last, this paper discusses how to compute a score based on three metrics (presence/absence of elements, number of clicks to do an action, and time needed to do this action) and the visual representation of this score that has to give a complete profile of the tool. We hypothesize that this scale will be able to highlight the complementarities of visual programming languages and mulsemedia editors as well as the challenges to face.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131010477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Subtitles and closed captions, which are prepared for hearing-impaired users, are now widely used by users without hearing concerns. In this paper, we focus on the adaptation of subtitles and captions for non-hearing-impaired users, particularly the adaptation of kanji ruby (small kana reading aids displayed alongside kanji). In our experiments with non-hearing-impaired adults, Welch's t-test was used to determine whether listening to audio with the same content affects the necessity of kanji ruby. In addition, we proposed and evaluated an adaptive model that predicts whether ruby should be added to kanji in captions, based on the experimental results. The results suggest that not only the difficulty of the kanji and the user's kanji ability, but also the content of the audio, is important for the optimization of kanji ruby.
{"title":"Survey on the Impact of Listening to Audio for Adaptive Japanese Subtitles and Captions Ruby","authors":"S. Abe, Shoko Fujii, Hideya Mino, Jun Goto, G. Ohtake, S. Fujitsu, Kinji Matsumura, H. Fujisawa","doi":"10.1145/3573381.3596456","DOIUrl":"https://doi.org/10.1145/3573381.3596456","url":null,"abstract":"Subtitles and closed captions, which are prepared for hearing-impaired users, are now widely used by users without hearing concerns. In this paper, we focus on the adaptation of subtitles and captions for non-hearing-impaired users, particularly the adaptation of the kanji ruby. From our experiments on non-hearing-impaired adults, Welch’s t-test was used to clarify whether listening to audio with the same content affects the necessity of kanji ruby. In addition, we proposed and evaluated an adaptive model to predict whether ruby should be added to kanji captions based on the experimental results. The experimental results suggest that not only the difficulty of the kanji and the user’s kanji ability, but also the content of the audio is important for the optimization of kanji ruby.","PeriodicalId":120872,"journal":{"name":"Proceedings of the 2023 ACM International Conference on Interactive Media Experiences","volume":"136 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133616185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}