Exploring artificial intelligence-powered virtual assistants to understand their potential to support older adults’ search needs
Emily M. Langston, Varitnan Hattakitjamroen, Mario Hernandez, Hye Soo Lee, Hannah Ç. Mason, Willencia Louis-Charles, Neil Charness, Sara J. Czaja, Wendy A. Rogers, Joseph Sharit, Walter R. Boot
Human Factors in Healthcare, Volume 7, Article 100092. Published 2025-01-18.
DOI: 10.1016/j.hfh.2025.100092
https://www.sciencedirect.com/science/article/pii/S277250142500003X
Abstract
Objective
We investigated the accuracy and amount of information provided by artificial intelligence (AI)-powered virtual assistants in response to queries relevant to aging adults in the domains of Medicare, long-term care insurance, and resource access.
Background
Older adults are faced with complex decisions and must gather and integrate information from diverse sources to help support these decisions (e.g., across various websites and online resources). Information-seeking, integration, and decision-making are cognitively demanding and can be impacted by age-related cognitive changes. Virtual assistants powered by AI have the potential to provide older adults with easy access to information and answers to their queries. However, it is unclear how accurate this information and these answers might be.
Method
Alexa, Google Assistant, Bard, and ChatGPT-4 were queried. Coders assessed the accuracy of the responses and, as a measure of response complexity, the amount of supplemental information each response provided.
Results
Overall, Large Language Model (LLM)-based virtual assistants (Bard, ChatGPT-4) responded more accurately than non-LLM assistants (e.g., 6% inaccurate responses for Bard vs. 60% for Alexa) and provided substantially more supplemental information (79% of responses with high supplemental information for Bard and 37% for ChatGPT-4, vs. 20% or less for the others). We note, however, that responses can vary over time.
Conclusion
Based on their ability to provide largely accurate responses, LLMs may be helpful tools for older adults seeking information related to health, insurance, and available resources. However, the potential for error, high response complexity, and response variability should be considered.
Application
LLM-based virtual assistants may be helpful tools for older adults seeking information to support health and financial decisions.