Pub Date: 2024-09-02; DOI: 10.1101/2024.09.01.24312881
Selahattin Colakoglu, Mustafa Durmus, Zeynep Pelin Polat, Asli Yildiz, Emre Sezgin
Background: Understanding user engagement with conversational agents (CAs) in mobile health apps is crucial for improving sustained usage. We analyzed CA interactions in a mobile health app to identify usage patterns and potential barriers.
Title: The Use of Conversational Agents in Self-Management: A Retrospective Analysis. Source: medRxiv - Health Informatics.
Pub Date: 2024-09-01; DOI: 10.1101/2024.08.30.24312871
Benjamin G Mittman, Bo Hu, Rebecca Schulte, Phuc Le, Matthew A Pappas, Aaron Hamilton, Michael B Rothberg
Background: Guidelines recommend pharmacological venous thromboembolism (VTE) prophylaxis only for high-risk patients, but the probability of VTE considered “high-risk” is not specified. Our objective was to define an appropriate probability threshold (or range) for VTE risk stratification and corresponding prophylaxis in medical inpatients.
Title: What Constitutes High Risk for Venous Thromboembolism? Comparing Approaches to Determining an Appropriate Threshold. Source: medRxiv - Health Informatics.
Pub Date: 2024-08-31; DOI: 10.1101/2024.08.29.24312810
N. Kenji Taylor, Takashi Nishibayashi
Background: AI-driven symptom checkers (SC) are increasingly adopted in healthcare for their potential to provide users with accessible and immediate preliminary health education. These tools, powered by advanced artificial intelligence algorithms, assist patients in quickly assessing their symptoms. Previous studies using clinical vignette approaches have evaluated SC accuracy, highlighting both strengths and areas for improvement.
Title: Ubie Symptom Checker: A Clinical Vignette Simulation Study. Source: medRxiv - Health Informatics.
Brain networks play a crucial role in the diagnosis of brain disorders by enabling the identification of abnormal patterns and connections in brain activity. Previous studies use Pearson’s correlation coefficient to construct functional brain networks from fMRI data and apply graph learning to diagnose brain diseases. However, correlation-based brain networks are overly dense (often fully connected), which obscures meaningful connections and complicates subsequent analyses. This dense connectivity poses substantial performance challenges for traditional graph transformers, which are primarily designed for sparse graphs, and the result is a notable reduction in diagnostic accuracy. To address this issue, we propose a multifunctional brain graph transformer model for brain disorder diagnosis, namely BrainGT, which constructs multiple functional brain networks from fMRI data rather than a single dense brain network. It fuses self-attention and cross-attention mechanisms to learn important features within and across the multiple functional brain networks. Classification (diagnosis) experiments conducted on three real fMRI datasets (i.e., ADNI, PPMI, and ABIDE) demonstrate the superiority of the proposed BrainGT over state-of-the-art methods.
Title: BrainGT: Multifunctional Brain Graph Transformer for Brain Disorder Diagnosis. Authors: Ahsan Shehzad, Shuo Yu, Dongyu Zhang, Shagufta Abid, Xinrui Cheng, Jingjing Zhou, Feng Xia. Pub Date: 2024-08-31; DOI: 10.1101/2024.08.30.24312819. Source: medRxiv - Health Informatics.
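The density problem described in the BrainGT abstract above can be illustrated with a small sketch. This is not the BrainGT method; it only shows, on synthetic data, why a network built from pairwise Pearson correlations ends up fully connected unless some threshold is applied. The function names (`correlation_network`, `edge_density`), the toy time series, and the 0.6 cut-off are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_network(series):
    """Weighted adjacency matrix: entry (i, j) is the correlation of regions i and j."""
    n = len(series)
    return [[1.0 if i == j else pearson(series[i], series[j]) for j in range(n)]
            for i in range(n)]


def edge_density(adj, threshold=0.0):
    """Fraction of region pairs whose absolute correlation exceeds the threshold."""
    n = len(adj)
    edges = sum(1 for i in range(n) for j in range(i + 1, n)
                if abs(adj[i][j]) > threshold)
    return edges / (n * (n - 1) / 2)


# Hypothetical data: six "regions" sharing one common signal plus independent noise.
random.seed(0)
shared = [random.gauss(0, 1) for _ in range(200)]
series = [[s + random.gauss(0, 1) for s in shared] for _ in range(6)]

adj = correlation_network(series)
dense = edge_density(adj)        # any nonzero correlation counts as an edge
sparse = edge_density(adj, 0.6)  # illustrative cut-off, not taken from the paper
print(f"density without threshold: {dense:.2f}, with |r| > 0.6: {sparse:.2f}")
```

Because sample correlations between noisy signals are essentially never exactly zero, the unthresholded graph has an edge for every pair of regions, which is the "often fully connected" situation the abstract refers to.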
Pub Date: 2024-08-31; DOI: 10.1101/2024.08.30.24312861
Alec B. Chapman, Talia Panadero, Rachel Dalrymple, Alicia Cohen, Nipa Kamdar, Farhana Pethani, Andrea Kalvesmaki, Richard E. Nelson, Jorie Butler
Food insecurity is an important social risk factor that is directly linked to patient health and well-being. The Department of Veterans Affairs (VA) aims to identify and resolve food insecurity through social and clinical interventions. However, evaluating the impact of such interventions is made challenging by the lack of follow-up data on Veteran food insecurity status. One potential solution is to leverage documentation of food insecurity in electronic health records (EHRs). In this paper, we developed and validated a natural language processing system to identify food insecurity status from clinical notes and applied it to study longitudinal trajectories of food insecurity among a large cohort of food-insecure Veterans. Our analyses provide insight into the timing and persistence of Veteran food insecurity; in the future, our methods will be used to evaluate food insecurity interventions and VA policy.
Title: Studying Veteran food insecurity longitudinally using electronic health record data and natural language processing. Source: medRxiv - Health Informatics.
Pub Date: 2024-08-31; DOI: 10.1101/2024.08.30.24312744
Nicholas Fong, Michael S. Lipnick, Ella Behnke, Yu Chou, Seif Elmankabadi, Lily Ortiz, Christopher S. Almond, Isabella Auchus, Garrett W. Burnett, Ronald Bisegerwa, Desireé R Conrad, Carolyn M. Hendrickson, Shubhada Hooli, Robert Kopotic, Gregory Leeb, Daniel Martin, Eric D. McCollum, Ellis P. Monk, Kelvin L. Moore, Leonid Shmuylovich, J. Brady Scott, An-Kwok Ian Wong, Tianyue Zhou, Romain Pirracchio, Philip E. Bickler, John Feiner, Tyler Law
The OpenOximetry Repository is a structured database storing clinical and lab pulse oximetry data, serving as a centralized repository and data model for pulse oximetry initiatives. It supports measurements of arterial oxygen saturation (SaO2) by arterial blood gas co-oximetry and pulse oximetry (SpO2), alongside processed and unprocessed photoplethysmography (PPG) data and other metadata. This includes skin color measurements, finger diameter, vital signs (e.g., arterial blood pressure, end-tidal carbon dioxide), and arterial blood gas parameters (e.g., acid-base balance, hemoglobin concentration).
Title: Open Access Data Repository and Common Data Model for Pulse Oximeter Performance Data. Source: medRxiv - Health Informatics.
Pub Date: 2024-08-31; DOI: 10.1101/2024.08.30.24312636
Richard T Lester, Matthew Manson, Muhammed Semakula, Hyeju Jang, Hassan Mugabo, Ali Magzari, Junhong Ma Blackmer, Fanan Fattah, Simon Pierre Niyonsenga, Edson Rwagasore, Charles Ruranga, Eric Remera, Jean Claude S. Ngabonziza, Giuseppe Carenini, Sabin Nsanzimana
Isolation of patients with communicable infectious diseases limits the spread of pathogens but can be difficult to manage outside hospitals. Rwanda deployed a digital health service nationally to help public health clinicians remotely monitor and support SARS-CoV-2 cases via their mobile phones using daily interactive short message service (SMS) check-ins. We aimed to assess the texting patterns and communicated topics to understand patient experiences. We extracted data on all COVID-19 cases and exposed contacts who were enrolled in the WelTel text messaging program between March 18, 2020, and March 31, 2022, and linked demographic and clinical data from the national COVID-19 registry. A sample of the text conversation corpus was translated into English and labeled with topics of interest defined by medical experts. Multiple natural language processing (NLP) topic classification models were trained and compared using F1 scores. The best-performing models were applied to classify unlabeled conversations. A total of 33,081 isolated patients (mean age 33·9, range 0-100; 44% female), including 30,398 cases and 2,683 contacts, were registered in WelTel. Registered patients generated 12,119 interactive text conversations in Kinyarwanda (n=8,183, 67%), English (n=3,069, 25%), and other languages. Sufficiently trained large language models (LLMs) were unavailable for Kinyarwanda. Traditional machine learning (ML) models outperformed fine-tuned transformer-based language models on the native untranslated corpus; however, the reverse was observed for models trained on English-only data. The most frequently identified topics were symptoms (69%), diagnostics (38%), social issues (19%), prevention (18%), healthcare logistics (16%), and treatment (8·5%). Education, advice, and triage on these topics were provided to patients. Interactive text messaging can be used to remotely support isolated patients in pandemics at scale. NLP can help evaluate the medical and social factors that affect isolated patients, which could ultimately inform precision public health responses to future pandemics.
Title: Natural language processing to evaluate texting conversations between patients and healthcare providers during COVID-19 Home-Based Care in Rwanda at scale. Source: medRxiv - Health Informatics.
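The abstract above says that topic classifiers were compared using F1 scores. As a minimal sketch of what such a comparison involves for a single topic (for example, whether a conversation mentions symptoms), the snippet below computes F1 from binary gold labels and predictions and picks the higher-scoring model. The labels, the predictions, and the model names `model_a` and `model_b` are invented for illustration and do not come from the study.

```python
def f1_score(gold, pred):
    """F1 for one binary topic: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical expert labels for eight conversations (1 = topic present).
gold = [1, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from two candidate classifiers.
predictions = {
    "model_a": [1, 1, 0, 0, 0, 1, 1, 0],  # one false positive, one miss
    "model_b": [1, 0, 0, 0, 0, 0, 1, 0],  # no false positives, two misses
}

scores = {name: f1_score(gold, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

F1 rewards a balance of precision and recall, so the cautious `model_b`, which never raises a false alarm but misses half of the positive conversations, scores lower here than `model_a`.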
Lifestyle factors (LSFs) are increasingly recognized as instrumental in both the development and control of diseases. Despite their importance, there is a lack of methods to extract relations between LSFs and diseases from the literature, a step necessary to consolidate the currently available knowledge into a structured form. As simple co-occurrence-based relation extraction (RE) approaches are unable to distinguish between the different types of LSF-disease relations, context-aware transformer-based models are required to extract and classify these relations into specific relation types. Until now, no comprehensive LSF–disease RE system has existed, primarily due to the lack of a suitable corpus for developing one. We present LSD600, the first corpus specifically designed for LSF-disease RE, comprising 600 abstracts with 1,900 relations of eight distinct types between 5,027 diseases and 6,930 LSF entities. We evaluated LSD600’s quality by training a RoBERTa model on the corpus, achieving an F-score of 68.5% for the multi-label RE task on the held-out test set. We further validated LSD600 by applying the trained model to the Nutrition-Disease and FoodDisease datasets, where it achieved F-scores of 70.7% and 80.7%, respectively. Building on these performance results, LSD600 and the RE system trained on it can be valuable resources to fill the existing gap in this area and pave the way for downstream applications.
Title: LSD600: the first corpus of biomedical abstracts annotated with lifestyle–disease relations. Authors: Esmaeil Nourani, Evangelia-Mantelena Makri, Xiqing Mao, Sampo Pyysalo, Søren Brunak, Katerina Nastou, Lars Juhl Jensen. Pub Date: 2024-08-31; DOI: 10.1101/2024.08.30.24312862. Source: medRxiv - Health Informatics.
Pub Date: 2024-08-30; DOI: 10.1101/2024.08.30.24312824
Marvin Kopka, Markus A. Feufel
Digital health research often relies on case vignettes (descriptions of fictitious or real patients) to navigate ethical and practical challenges. Despite their utility, the quality and lack of standardization of these vignettes have often been criticized, especially in studies on symptom-assessment applications (SAAs) and triage decision-making. To address this, our paper introduces a method to refine an existing set of vignettes, drawing on principles from classical test theory. First, we removed any vignette with an item difficulty of zero and an item-total correlation below zero. Second, we stratified the remaining vignettes to reflect the natural base rates of symptoms that SAAs are typically approached with, selecting those vignettes with the highest item-total correlation in each quota. Although this two-step procedure reduced the size of the original vignette set by 40%, comparing triage performance on the reduced and the original vignette sets, we found a strong correlation (r = 0.747 to r = 0.997, p < .001). This indicates that our refinement method helps identify vignettes with high predictive power for an agent’s triage performance while simultaneously increasing the cost-efficiency of vignette-based evaluation studies. This might ultimately lead to higher research quality and more reliable results.
Title: Statistical refinement of case vignettes for digital health research. Source: medRxiv - Health Informatics.
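The two-step refinement described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the two exclusion criteria are read as separate rules (a vignette nobody solves has no variance, so its correlation is undefined), the item-total correlation is computed against the total score excluding the item itself, and the response matrix, stratum labels, and quotas are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation; None when either sequence has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(vx * vy)


def item_statistics(responses):
    """responses[agent][vignette] is 1 if that agent triaged the vignette correctly."""
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        difficulty = sum(item) / len(item)            # proportion solved
        rest = [t - s for t, s in zip(totals, item)]  # total without this item (assumption)
        stats.append((difficulty, pearson(item, rest)))
    return stats


def refine(responses, strata, quotas):
    stats = item_statistics(responses)
    # Step 1: drop vignettes nobody solved or with a negative item-total correlation.
    kept = [j for j, (diff, r) in enumerate(stats)
            if diff > 0 and not (r is not None and r < 0)]
    # Step 2: within each stratum, keep the vignettes with the highest correlation.
    selected = []
    for stratum, quota in quotas.items():
        candidates = [j for j in kept if strata[j] == stratum]
        candidates.sort(key=lambda j: stats[j][1] if stats[j][1] is not None else 0.0,
                        reverse=True)
        selected.extend(candidates[:quota])
    return sorted(selected)


# Hypothetical data: 5 agents (rows) answering 6 vignettes (columns).
responses = [
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
# Hypothetical strata and quotas standing in for symptom base rates.
strata = ["emergency", "emergency", "self-care",
          "non-emergency", "non-emergency", "self-care"]
quotas = {"emergency": 1, "non-emergency": 1, "self-care": 1}

refined = refine(responses, strata, quotas)
print("vignettes kept:", refined)
```

In this toy set, vignette 1 is dropped because no agent solves it and vignette 2 because it is solved mainly by the weakest agents, while vignette 4 is preferred over vignette 3 within its stratum because it correlates more strongly with overall performance.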
Pub Date: 2024-08-29; DOI: 10.1101/2024.08.28.24312737
Emre Sezgin, Daniel I. Jackson, Kate Kaufman, Micah Skeens, Cynthia A. Gerhardt, Emily L. Moscato
Purpose: This study examined the perceptions of caregivers of young childhood cancer survivors (YCCS) regarding the use of virtual assistant (VA) technology for health information seeking and care management. The aim was to understand how VAs can support caregivers, especially those from underserved communities, in navigating health information related to cancer survivorship.
Title: Perceptions about the Use of Virtual Assistants for Seeking Health Information among Caregivers of Young Childhood Cancer Survivors. Source: medRxiv - Health Informatics.