Natural language processing to evaluate texting conversations between patients and healthcare providers during COVID-19 Home-Based Care in Rwanda at scale

Richard T Lester, Matthew Manson, Muhammed Semakula, Hyeju Jang, Hassan Mugabo, Ali Magzari, Junhong Ma Blackmer, Fanan Fattah, Simon Pierre Niyonsenga, Edson Rwagasore, Charles Ruranga, Eric Remera, Jean Claude S. Ngabonziza, Giuseppe Carenini, Sabin Nsanzimana
medRxiv - Health Informatics. Published 2024-08-31. doi: 10.1101/2024.08.30.24312636 (https://doi.org/10.1101/2024.08.30.24312636)

Abstract

Isolating patients with communicable infectious diseases limits the spread of pathogens but can be difficult to manage outside hospitals. Rwanda deployed a national digital health service to help public health clinicians remotely monitor and support SARS-CoV-2 cases via their mobile phones using daily interactive short message service (SMS) check-ins. We aimed to assess texting patterns and the topics communicated to understand patient experiences. We extracted data on all COVID-19 cases and exposed contacts enrolled in the WelTel text messaging program between March 18, 2020, and March 31, 2022, and linked demographic and clinical data from the national COVID-19 registry. A sample of the text conversation corpus was translated into English and labeled with topics of interest defined by medical experts. Multiple natural language processing (NLP) topic classification models were trained and compared using F1 scores. The best-performing models were applied to classify unlabeled conversations. A total of 33,081 isolated patients (mean age 33·9, range 0-100; 44% female), including 30,398 cases and 2,683 contacts, were registered in WelTel. Registered patients generated 12,119 interactive text conversations in Kinyarwanda (n=8,183, 67%), English (n=3,069, 25%), and other languages. Sufficiently trained large language models (LLMs) were unavailable for Kinyarwanda. Traditional machine learning (ML) models outperformed fine-tuned transformer-architecture language models on the native untranslated corpus; however, the reverse was observed for models trained on English-only data. The most frequently identified topics were symptoms (69%), diagnostics (38%), social issues (19%), prevention (18%), healthcare logistics (16%), and treatment (8·5%). Education, advice, and triage on these topics were provided to patients. Interactive text messaging can be used to remotely support isolated patients in pandemics at scale. NLP can help evaluate the medical and social factors that affect isolated patients, which could ultimately inform precision public health responses to future pandemics.
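
The abstract states that topic classification models were compared using F1 scores but does not say which F1 variant was used. As a minimal sketch only, the following shows how per-topic and macro-averaged F1 could be computed for multi-label conversations (the reported topic percentages sum to more than 100%, which suggests a conversation can carry several topics). The function names, the toy labels, and the choice of macro averaging are assumptions for illustration, not the authors' method or data.

```python
from typing import Dict, List, Set


def f1_per_topic(
    gold: List[Set[str]], pred: List[Set[str]], topics: List[str]
) -> Dict[str, float]:
    """Compute F1 for each topic, treating every topic as a binary label."""
    scores: Dict[str, float] = {}
    for topic in topics:
        # Count true positives, false positives, and false negatives.
        tp = sum(1 for g, p in zip(gold, pred) if topic in g and topic in p)
        fp = sum(1 for g, p in zip(gold, pred) if topic not in g and topic in p)
        fn = sum(1 for g, p in zip(gold, pred) if topic in g and topic not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the topic never occurs.
        scores[topic] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-topic F1 scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: four conversations, three of the topics named above.
topics = ["symptoms", "diagnostics", "social issues"]
gold = [{"symptoms"}, {"symptoms", "diagnostics"}, {"social issues"}, {"diagnostics"}]
pred = [{"symptoms"}, {"symptoms"}, {"social issues", "symptoms"}, {"diagnostics"}]

scores = f1_per_topic(gold, pred, topics)
macro = macro_f1(scores)
print(scores, macro)
```

Macro averaging weights rare topics such as treatment equally with common ones such as symptoms; a micro or weighted average would favor the frequent topics instead, so the choice affects which model appears to perform best.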