
ACM Transactions on Interactive Intelligent Systems: Latest Publications

RadarSense: Accurate Recognition of Mid-air Hand Gestures with Radar Sensing and Few Training Examples
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-31 | DOI: 10.1145/3589645
Arthur Sluÿters, Sébastien Lambot, Jean Vanderdonckt, Radu-Daniel Vatavu
Microwave radars bring many benefits to mid-air gesture sensing due to their large field of view and independence from environmental conditions, such as ambient light and occlusion. However, radar signals are highly dimensional and usually require complex deep learning approaches. To understand this landscape, we report results from a systematic literature review of N = 118 scientific papers on radar sensing, unveiling a large variety of radar technologies with different operating frequencies, bandwidths, and antenna configurations, as well as a range of gesture recognition techniques. Although highly accurate, these techniques require a large amount of training data that depend on the type of radar. Therefore, the training results cannot be easily transferred to other radars. To address this aspect, we introduce a new gesture recognition pipeline that implements advanced full-wave electromagnetic modeling and inversion to retrieve physical characteristics of gestures that are radar independent, i.e., independent of the source, antennas, and radar-hand interactions. Inversion of the radar signals further reduces the size of the dataset by several orders of magnitude, while preserving the essential information. This approach is compatible with conventional gesture recognizers, such as those based on template matching, which need only a few training examples to deliver high recognition accuracy. To evaluate our gesture recognition pipeline, we conducted user-dependent and user-independent evaluations on a dataset of 16 gesture types collected with the Walabot, a low-cost off-the-shelf array radar. We contrast these results with those obtained for the same gesture types collected with an ultra-wideband radar built from a vector network analyzer with a single horn antenna and with a computer vision sensor, respectively. Based on our findings, we suggest design implications to support future development in radar-based gesture recognition.
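The pipeline's final stage relies on few-example template matchers rather than deep networks. As a rough illustration of that class of recognizer (not the authors' inversion pipeline), the sketch below classifies a 1-D gesture feature sequence by dynamic time warping against one stored template per class; the feature sequences and class names are invented for illustration.

```python
# Minimal nearest-neighbor template matcher over 1-D feature sequences, in the
# spirit of the few-example recognizers the abstract refers to. The gesture
# features and class names are made up; this is NOT the RadarSense pipeline.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def classify(sample: np.ndarray, templates: dict) -> str:
    """Return the label of the closest stored template (1-nearest neighbor)."""
    return min(
        ((label, dtw_distance(sample, t)) for label, ts in templates.items() for t in ts),
        key=lambda pair: pair[1],
    )[0]

# One or two templates per gesture class are enough for this kind of recognizer.
templates = {
    "swipe_left": [np.linspace(0.0, 1.0, 20)],
    "swipe_right": [np.linspace(1.0, 0.0, 20)],
}
print(classify(np.linspace(0.1, 0.9, 25), templates))  # -> "swipe_left"
```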
Citations: 3
LIMEADE: From AI Explanations to Advice Taking
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-28 | DOI: https://dl.acm.org/doi/10.1145/3589345
Benjamin Charles Germain Lee, Doug Downey, Kyle Lo, Daniel S. Weld

Research in human-centered AI has shown the benefits of systems that can explain their predictions. Methods that allow an AI to take advice from humans in response to explanations are similarly useful. While both capabilities are well-developed for transparent learning models (e.g., linear models and GA²Ms), and recent techniques (e.g., LIME and SHAP) can generate explanations for opaque models, little attention has been given to advice methods for opaque models. This paper introduces LIMEADE, the first general framework that translates both positive and negative advice (expressed using high-level vocabulary such as that employed by post-hoc explanations) into an update to an arbitrary, underlying opaque model. We demonstrate the generality of our approach with case studies on seventy real-world models across two broad domains: image classification and text recommendation. We show our method improves accuracy compared to a rigorous baseline on the image classification domains. For the text modality, we apply our framework to a neural recommender system for scientific papers on a public website; our user study shows that our framework leads to significantly higher perceived user control, trust, and satisfaction.
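LIMEADE's actual update rule is described in the paper; purely as a hedged illustration of the general idea of turning high-level advice into an update of an opaque model, the sketch below pseudo-labels unlabeled documents that mention an advised term and refits a scikit-learn classifier. The data, the term, and the pseudo-labeling strategy are assumptions made for this example, not the paper's method.

```python
# Hedged sketch only: one simple way to turn high-level advice ("documents
# mentioning 'reinforcement' are relevant") into an update of an opaque text
# classifier, by pseudo-labeling matching unlabeled documents and refitting.
# This is NOT the LIMEADE algorithm; data and strategy are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_docs = ["deep reinforcement learning for robot control", "a field guide to garden birds"]
labels = [1, 0]  # 1 = recommend, 0 = do not recommend
unlabeled_docs = ["reinforcement learning in board games", "pruning roses in early spring"]

vectorizer = TfidfVectorizer().fit(labeled_docs + unlabeled_docs)
model = LogisticRegression().fit(vectorizer.transform(labeled_docs), labels)

def apply_advice(term: str, advised_label: int) -> None:
    """Pseudo-label unlabeled docs containing `term`, then retrain the opaque model."""
    extra = [d for d in unlabeled_docs if term in d]
    docs = labeled_docs + extra
    model.fit(vectorizer.transform(docs), labels + [advised_label] * len(extra))

apply_advice("reinforcement", advised_label=1)
print(model.predict(vectorizer.transform(["a reinforcement learning tutorial"])))  # likely [1]
```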

Citations: 0
Crowdsourcing Thumbnail Captions: Data Collection and Validation
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-28 | DOI: https://dl.acm.org/doi/10.1145/3589346
Carlos Aguirre, Shiye Cao, Amama Mahmood, Chien-Ming Huang

Speech interfaces, such as personal assistants and screen readers, read image captions to users. Typically, however, only one caption is available per image, which may not be adequate for all situations (e.g., browsing large quantities of images). Long captions provide a deeper understanding of an image but require more time to listen to, whereas shorter captions may not allow for such thorough comprehension yet have the advantage of being faster to consume. We explore how to effectively collect both thumbnail captions—succinct image descriptions meant to be consumed quickly—and comprehensive captions—which allow individuals to understand visual content in greater detail. We consider text-based instructions and time-constrained methods to collect descriptions at these two levels of detail and find that a time-constrained method is the most effective for collecting thumbnail captions while preserving caption accuracy. Additionally, we verify that caption authors using this time-constrained method are still able to focus on the most important regions of an image by tracking their eye gaze. We evaluate our collected captions along human-rated axes—correctness, fluency, amount of detail, and mentions of important concepts—and discuss the potential for model-based metrics to perform large-scale automatic evaluations in the future.
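As a toy illustration of the listening-time budget that separates thumbnail from comprehensive captions in a speech interface, the snippet below estimates spoken duration from word count and flags captions that exceed a short budget. The speaking rate, the 5-second threshold, and the example captions are assumptions for this sketch and do not come from the paper.

```python
# Estimate how long a caption takes to read aloud and flag captions that exceed a
# "thumbnail" time budget. The 150 wpm rate and 5 s budget are illustrative only.
WORDS_PER_MINUTE = 150
THUMBNAIL_BUDGET_S = 5.0

def spoken_duration_s(caption: str) -> float:
    """Rough spoken duration of a caption, assuming a fixed speaking rate."""
    return len(caption.split()) / WORDS_PER_MINUTE * 60.0

def is_thumbnail_length(caption: str) -> bool:
    """True if the caption fits within the assumed thumbnail listening budget."""
    return spoken_duration_s(caption) <= THUMBNAIL_BUDGET_S

comprehensive = ("A golden retriever leaps over a fallen log in a sunlit forest "
                 "while two hikers watch from a narrow dirt trail.")
thumbnail = "Dog jumping over a log in a forest."
for c in (comprehensive, thumbnail):
    print(f"{spoken_duration_s(c):4.1f}s  thumbnail-ok={is_thumbnail_length(c)}")
```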

Citations: 0
How do Users Experience Traceability of AI Systems? Examining Subjective Information Processing Awareness in Automated Insulin Delivery (AID) Systems
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-24 | DOI: https://dl.acm.org/doi/10.1145/3588594
Tim Schrills, Thomas Franke

When interacting with artificial intelligence (AI) in the medical domain, users frequently face automated information processing, which can remain opaque to them. For example, users with diabetes may interact daily with automated insulin delivery (AID). However, effective AID therapy requires traceability of automated decisions for diverse users. Grounded in research on human-automation interaction, we study Subjective Information Processing Awareness (SIPA) as a key construct to research users’ experience of explainable AI. The objective of the present research was to examine how users experience differing levels of traceability of an AI algorithm. We developed a basic AID simulation to create realistic scenarios for an experiment with N = 80, where we examined the effect of three levels of information disclosure on SIPA and performance. Attributes serving as the basis for insulin needs calculation were shown to users, who predicted the AID system’s calculation after over 60 observations. Results showed a difference in SIPA after repeated observations, associated with a general decline of SIPA ratings over time. Supporting scale validity, SIPA was strongly correlated with trust and satisfaction with explanations. The present research indicates that the effect of different levels of information disclosure may need several repetitions before it manifests. Additionally, high levels of information disclosure may lead to a miscalibration between SIPA and performance in predicting the system’s results. The results indicate that for a responsible design of XAI, system designers could utilize prediction tasks in order to calibrate experienced traceability.
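For readers unfamiliar with AID, the attributes shown to participants feed a dose calculation of roughly the following textbook form: carbohydrate coverage plus a correction toward a target glucose. The formula shape is the standard bolus-calculator one, but the parameter values below are generic illustrations, not the simulation used in the study, and nothing here is medical guidance.

```python
# Textbook-style bolus calculation of the kind an AID simulation might expose to
# users. Parameter values are illustrative only and do not come from the study.
def bolus_units(carbs_g: float, glucose_mg_dl: float, *,
                carb_ratio_g_per_u: float = 10.0,
                correction_factor_mg_dl_per_u: float = 50.0,
                target_mg_dl: float = 110.0,
                insulin_on_board_u: float = 0.0) -> float:
    """Suggested insulin dose in units; never returns a negative dose."""
    meal_dose = carbs_g / carb_ratio_g_per_u
    correction_dose = (glucose_mg_dl - target_mg_dl) / correction_factor_mg_dl_per_u
    return max(0.0, meal_dose + correction_dose - insulin_on_board_u)

# e.g. 60 g of carbs at a glucose reading of 180 mg/dL -> 6.0 + 1.4 = 7.4 units
print(round(bolus_units(60, 180), 1))
```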

Citations: 0
Conversational Context-sensitive Ad Generation with a Few Core-Queries
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-23 | DOI: 10.1145/3588578
Ryoichi Shibata, Shoya Matsumori, Yosuke Fukuchi, Tomoyuki Maekawa, Mitsuhiko Kimoto, Michita Imai
When people are talking together in front of digital signage, advertisements that are aware of the context of the dialogue will work the most effectively. However, it has been challenging for computer systems to retrieve the appropriate advertisement from among the many options presented in large databases. Our proposed system, the Conversational Context-sensitive Advertisement generator (CoCoA), is the first attempt to apply masked word prediction to web information retrieval that takes into account the dialogue context. The novelty of CoCoA is that advertisers simply need to prepare a few abstract phrases, called Core-Queries, and then CoCoA automatically generates a context-sensitive expression as a complete search query by utilizing a masked word prediction technique that adds a word related to the dialogue context to one of the prepared Core-Queries. This automatic generation frees the advertisers from having to come up with context-sensitive phrases to attract users' attention. Another unique point is that the modified Core-Query offers users speaking in front of the CoCoA system a list of context-sensitive advertisements. CoCoA was evaluated by crowd workers regarding the context-sensitivity of the search queries it generated for dialogue texts from multiple domains prepared in advance. The results indicated that CoCoA could present more contextual and practical advertisements than other web-retrieval systems. Moreover, CoCoA acquired a higher evaluation in a particular conversation that included many travel topics for which the Core-Queries were designated, implying that it succeeded in adapting the Core-Queries to the specific ongoing context better than the compared method, without any effort on the part of the advertisers. In addition, case studies with users and advertisers revealed that the context-sensitive advertisements generated by CoCoA also had an effect on the content of the ongoing dialogue. Specifically, since pairs unfamiliar with each other referred more frequently to the advertisements CoCoA displayed, the advertisements had an effect on the topics about which the pairs spoke. Moreover, participants in an advertiser role recognized that some of the search queries generated by CoCoA fit the context of a conversation and that CoCoA improved the effect of the advertisement. In particular, they learned how to design a good Core-Query with ease by observing users' responses to the advertisements retrieved with the generated search queries.
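The core trick, completing an advertiser-authored Core-Query with a word predicted from the dialogue context, can be sketched with an off-the-shelf masked language model. The snippet below is a rough approximation under assumed names (the Core-Query text, the dialogue, and the choice of bert-base-uncased are all illustrative), not the authors' implementation.

```python
# Rough sketch of masked-word query expansion in the spirit of CoCoA (not the
# authors' code). A Core-Query carries a [MASK] slot; recent dialogue is prepended
# as context and a masked language model proposes the completing word.
# Requires the `transformers` package; the model choice is an arbitrary assumption.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def expand_core_query(dialogue: str, core_query: str) -> str:
    """Fill the [MASK] slot in `core_query`, using the dialogue as extra context."""
    top = fill_mask(f"{dialogue} {core_query}")[0]   # highest-scoring prediction
    return core_query.replace("[MASK]", top["token_str"].strip())

dialogue = "I finally booked my flights to Kyoto for the autumn holidays."
core_query = "best [MASK] tours and deals"           # hypothetical advertiser Core-Query
print(expand_core_query(dialogue, core_query))       # e.g. "best kyoto tours and deals"
```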
Citations: 0
Effects of AI and Logic-Style Explanations on Users' Decisions under Different Levels of Uncertainty
IF 3.4 | CAS Tier 4 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-03-16 | DOI: 10.1145/3588320
Federico Maria Cau, H. Hauptmann, L. D. Spano, N. Tintarev
Existing eXplainable Artificial Intelligence (XAI) techniques support people in interpreting AI advice. However, while previous work evaluates users' understanding of explanations, factors influencing the decision support are largely overlooked in the literature. This paper addresses this gap by studying, for classification tasks, the impact of user uncertainty, AI correctness, and the interaction between AI uncertainty and explanation logic styles. We conducted two separate studies: one requesting participants to recognise hand-written digits and one to classify the sentiment of reviews. To assess the decision making, we analysed the task performance, agreement with the AI suggestion, and the user's reliance on the XAI interface elements. Participants made their decisions relying on three pieces of information in the XAI interface (the image or text instance, the AI prediction, and the explanation). Each participant was shown one explanation style (between-participants design), corresponding to one of three styles of logical reasoning: inductive, deductive, or abductive. This allowed us to study how different levels of AI uncertainty influence the effectiveness of different explanation styles. The results show that user uncertainty and AI correctness on predictions significantly affected users' classification decisions on the analysed metrics. In both domains (images and text), users relied mainly on the instance to decide. Users were usually overconfident about their choices, and this evidence was more pronounced for text. Furthermore, inductive-style explanations led to over-reliance on the AI advice in both domains; they were the most persuasive, even when the AI was incorrect. The abductive and deductive styles had complex effects depending on the domain and the AI uncertainty levels.
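To make the three logic styles concrete, here is a toy set of explanation templates for a hypothetical digit classifier; the wording is invented for illustration and is not the phrasing used in the experiments.

```python
# Toy templates contrasting the three logic styles of explanation compared in the
# study, for a hypothetical digit classifier. Wording is invented for illustration.
def explain(prediction: str, evidence: str, style: str) -> str:
    templates = {
        "inductive": (f"Previous images with {evidence} were almost always a {prediction}, "
                      f"so this image is probably a {prediction}."),
        "deductive": (f"Every image with {evidence} is a {prediction}; this image has "
                      f"{evidence}, therefore it is a {prediction}."),
        "abductive": (f"This image is a {prediction}; the best explanation is that it "
                      f"shows {evidence}."),
    }
    return templates[style]

for style in ("inductive", "deductive", "abductive"):
    print(f"{style:>9}: {explain('7', 'a single slanted stroke with a top bar', style)}")
```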
Citations: 1