
AMIA ... Annual Symposium proceedings. AMIA Symposium: latest articles

Technology and Human Support Systems in Decentralized Studies: A Participant-Centered Case Study in Cystic Fibrosis.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Ayana Sarrieddine, Claire Lai, Oliver Bear Don't Walk, Nick F H Reid, Gregory Sawicki, Ariel Berlinski, Margaret Rosenfeld, Andrea L Hartzler

As the integration of informatics into clinical research reshapes the landscape of decentralized studies, optimizing participant experience remains a key challenge. Although prior research has established foundations for decentralized study design, a more comprehensive understanding of participant perspectives is essential to ensure remote methods for data collection meet participant needs. This study contributes to a growing literature in participant-centered decentralized studies through an analysis of OUTREACH, a 3-month home spirometry study among individuals with cystic fibrosis. Through a qualitative analysis of 46 participant exit interviews, we identified three overarching categories that influenced participant experience: motivators, technological infrastructure, and human coordination. Our findings emphasize the value of reliable technology and comprehensive interpersonal support systems. These findings shed light upon the importance of sociotechnical elements for optimizing participant experience, which may enhance the quality of clinical study data through meaningful participant engagement.

Vol. 2024, pp. 1130-1139. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12919454/pdf/
Citations: 0
Identifying Missing IS-A Relations in SNOMED CT with Fine-Tuned Pre-trained Language Models and Non-lattice Subgraphs.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Xubing Hao, Rashmie Abeysinghe, Jay Shi, Guo-Qiang Zhang, Licong Cui

Ensuring the completeness of IS-A relations in SNOMED CT is crucial for maintaining its accuracy in clinical applications. In this study, we propose a hybrid approach leveraging non-lattice subgraphs and pre-trained language models (PLMs) to identify missing IS-A relations in SNOMED CT. We fine-tuned four BERT-based models (BERT, DistilBERT, DeBERTa, and BioClinicalBERT) and four generative large language models (LLMs): BioMistral, Llama3, Gemma2, and Phi-4. Missing IS-A relations were identified through consensus predictions by all eight models. DeBERTa achieved the best performance (precision: 0.96, recall: 0.97, F1-score: 0.965) for IS-A relation prediction. Our approach identified 678 potential missing IS-A relations in SNOMED CT (March 2023 US Edition), of which 100 randomly selected cases were manually reviewed by a domain expert, confirming 93 as valid (93% precision). These results demonstrate the effectiveness of fine-tuned PLMs in detecting missing IS-A relations within non-lattice subgraphs, offering a promising avenue for improving SNOMED CT's quality.
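
The reported F1-score and the consensus step can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy concept pairs are hypothetical, and only the precision and recall values come from the abstract.

```python
from __future__ import annotations


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def consensus(predictions: dict[str, set[tuple[str, str]]]) -> set[tuple[str, str]]:
    """Keep only the candidate (child, parent) pairs that every model predicted."""
    model_outputs = list(predictions.values())
    if not model_outputs:
        return set()
    return set.intersection(*model_outputs)


# Precision and recall reported for the best model in the abstract.
deberta_f1 = f1_score(0.96, 0.97)

# Hypothetical toy predictions from three models (the study used eight).
toy_predictions = {
    "model_a": {("concept_a", "concept_b"), ("concept_c", "concept_d")},
    "model_b": {("concept_a", "concept_b")},
    "model_c": {("concept_a", "concept_b"), ("concept_c", "concept_d")},
}
agreed = consensus(toy_predictions)

print(round(deberta_f1, 3))
print(agreed)
```

Requiring agreement from every model trades recall for precision, which is consistent with the high share of candidates confirmed in the manual review.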

Vol. 2024, pp. 433-442. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12919620/pdf/
Citations: 0
STop Clock for Automated Tracking (STAT) during Time-Critical Medical Work: Evaluating the Accuracy and Usability of an AI-Driven Automated Stop Clock.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Katherine A Zellner, Sifan Yuan, Emily R Ernst, Dylan W Arkowitz, Aaron H Mun, Mary S Kim, Ivan Marsic, Randall S Burd, Aleksandra Sarcevic

Delays and process inefficiencies during trauma resuscitation can contribute to adverse patient outcomes. While tracking elapsed time may improve the trauma team's temporal awareness and reduce delays, reliance on manual activation of stop clocks can introduce variability. To address this limitation, we implemented a computer vision-powered automatic stop clock designed to activate upon patient arrival without requiring manual input. We conducted a retrospective video review of 50 trauma resuscitations to assess how the clock was used in practice, followed by semi-structured interviews with nine trauma team members to elicit their feedback and perceptions. This study contributes to the broader discussion on AI-assisted clinical tools, highlighting the role of automation in supporting trauma teams, reducing variability in time tracking, and improving process efficiency.

Vol. 2024, pp. 1502-1510. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12919504/pdf/
Citations: 0
Contextual Phenotyping of Pediatric Sepsis Cohort Using Large Language Models.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Aditya Nagori, Ayush Gautam, Matthew O Wiens, Vuong Nguyen, Nathan Kenya Mugisha, Jerome Kabakyenga, Niranjan Kissoon, John Mark Ansermino, Rishikesan Kamaleswaran

The clustering of patient subgroups is essential for personalized care and efficient use of resources. Traditional clustering methods struggle with high-dimensional heterogeneous healthcare data and lack contextual understanding. This study evaluates large language model (LLM)-based clustering against classical methods using a pediatric sepsis dataset from a low-income country (LIC), containing 2,686 records with 28 numerical variables and 119 categorical variables. Patient records were serialized into text with and without a clustering objective. Embeddings were generated using quantized LLAMA 3.1 8B, DeepSeek-R1-Distill-Llama-8B with low-rank adaptation (LoRA), and Stella-En-400M-V5 models. K-means clustering was applied to these embeddings. Classical comparisons included K-Medoids clustering on UMAP and FAMD-reduced mixed data. Silhouette scores and statistical tests evaluated the quality and distinctiveness of the clusters. Stella-En-400M-V5 achieved the highest Silhouette Score (0.86). LLAMA 3.1 8B with the clustering objective performed better with a higher number of clusters, identifying subgroups with distinct nutritional, clinical, and socioeconomic profiles. LLM-based methods outperformed classical techniques by capturing richer context and prioritizing key features. These results highlight the potential of LLMs for contextual phenotyping and informed decision making in resource-limited settings.
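
Two steps named in the abstract, record serialization and silhouette scoring, can be sketched in plain Python. This is an illustrative sketch only: the field names, the objective sentence, and the toy 2-D points are assumptions, and the LLM embedding and K-means steps are omitted entirely.

```python
from __future__ import annotations

import math
from typing import Optional, Sequence


def serialize(record: dict, objective: Optional[str] = None) -> str:
    """Turn a tabular record into text, optionally prefixed by a clustering objective."""
    body = "; ".join(f"{key}: {value}" for key, value in record.items())
    return f"{objective} {body}" if objective else body


def silhouette_score(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient using Euclidean distance."""
    clusters: dict[int, list[Sequence[float]]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # The point's zero distance to itself is in the sum, so divide by n - 1.
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # Mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical record with made-up field names.
record = {"age_months": 14, "muac_mm": 118}
plain_text = serialize(record)
guided_text = serialize(record, objective="Group similar patients.")

# Toy stand-in for embeddings: two well-separated groups in 2-D.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
toy_score = silhouette_score(toy_points, toy_labels)
print(plain_text)
print(guided_text)
print(round(toy_score, 2))
```

Scores near 1 indicate tight, well-separated clusters, which is the sense in which the reported 0.86 is read as a strong result.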

Vol. 2024, pp. 929-938. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12919534/pdf/
Citations: 0
Lessons Learned from OpenEMR Implementation in Graduate Health Informatics Curriculum.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Keerthika Sunchu, Megha M Moncy, Saptarshi Purkayastha, Cathy R Fulton

This study examines the integration of OpenEMR, a Meaningful Use-certified open-source electronic health record (EHR) system, into a Health Informatics curriculum. The primary objective was to address the disparity between theoretical knowledge and practical application in health informatics education. The implementation process revealed several significant challenges, including unintended system modifications that compromised functionality, data entry errors that impacted usability, and technical issues that impeded accessibility. To mitigate these challenges, a series of interventions were implemented. These included backend modifications to enhance data entry accuracy, usability improvements such as limiting open tabs to facilitate navigation, and the implementation of proactive measures to expedite the resolution of technical issues. The experiences gained from this integration process highlight three critical aspects of health informatics education: the significance of practical proficiency in EHR systems, the necessity for user-centric interface design, and the importance of adaptability and problem-solving skills. The study proposes several future directions for research and practice. These include fostering global collaboration, developing standardized curricula for EHR education, and establishing robust mechanisms for continuous assessment and improvement. The findings underscore the pivotal role of integrating hands-on EHR experience into health informatics education, emphasizing its potential to equip students with the essential competencies required to navigate the complex and dynamic healthcare landscape.

Vol. 2024, pp. 1079-1088. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099383/pdf/
Citations: 0
RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Gregory Kell, Angus Roberts, Serge Umansky, Yuti Khare, Najma Ahmed, Nikhil Patel, Chloe Simela, Jack Coumbe, Julian Rozario, Ryan-Rhys Griffiths, Iain J Marshall

Clinical question answering systems have the potential to provide clinicians with relevant and timely answers to their questions. Nonetheless, despite the advances that have been made, adoption of these systems in clinical settings has been slow. One issue is a lack of question-answering datasets that reflect the real-world needs of health professionals. In this work, we present RealMedQA, a dataset of realistic clinical questions generated by humans and an LLM. We describe the process for generating and verifying the QA pairs and evaluate several QA models on BioASQ and RealMedQA to assess the relative difficulty of matching answers to questions. We show that the LLM is more cost-efficient for generating "ideal" QA pairs. Additionally, we achieve a lower lexical similarity between questions and answers than BioASQ, which, according to our results, poses an additional challenge to the top two QA models. We release our code and our dataset publicly to encourage further research.
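
The abstract does not say how lexical similarity between questions and answers was measured. As one plausible illustration, the sketch below uses Jaccard overlap of lowercase word sets; the measure, the function names, and the placeholder sentences are all assumptions rather than details from the paper.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(question: str, answer: str) -> float:
    """Shared words divided by all distinct words in the pair."""
    q, a = tokens(question), tokens(answer)
    if not q and not a:
        return 0.0
    return len(q & a) / len(q | a)


# Placeholder pair where the answer reuses the question's wording.
high_overlap = jaccard("Which drug treats condition X?", "Drug Y treats condition X.")

# Placeholder pair that is related in meaning but shares no words.
low_overlap = jaccard(
    "Should therapy be stopped before surgery?",
    "Discontinue the medication prior to the operation.",
)
print(round(high_overlap, 3), low_overlap)
```

A pair like the second one cannot be matched by word overlap alone, which is why lower lexical similarity makes answer matching harder for retrieval-style QA models.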

Vol. 2024, pp. 590-599. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099375/pdf/
Citations: 0
Towards Interpretable End-Stage Renal Disease (ESRD) Prediction: Utilizing Administrative Claims Data with Explainable AI Techniques.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Yubo Li, Saba Al-Sayouri, Rema Padman

This study explores the potential of utilizing administrative claims data, combined with advanced machine learning and deep learning techniques, to predict the progression of Chronic Kidney Disease (CKD) to End-Stage Renal Disease (ESRD). We analyze a comprehensive, 10-year dataset provided by a major health insurance organization to develop prediction models for multiple observation windows using traditional machine learning methods such as Random Forest and XGBoost as well as deep learning approaches such as Long Short-Term Memory (LSTM) networks. Our findings demonstrate that the LSTM model, particularly with a 24-month observation window, exhibits superior performance in predicting ESRD progression, outperforming existing models in the literature. We further apply SHapley Additive exPlanations (SHAP) analysis to enhance interpretability, providing insights into the impact of individual features on predictions at the individual patient level. This study underscores the value of leveraging administrative claims data for CKD management and predicting ESRD progression.
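
The idea of an observation window can be made concrete with a small sketch: only claims dated within a fixed number of months before an index date contribute features. The claim fields, the index date, the placeholder codes, and the three summary features are hypothetical and do not reflect the authors' feature set or models.

```python
from __future__ import annotations

from datetime import date


def months_before(anchor: date, months: int) -> date:
    """Date `months` calendar months before `anchor` (day clamped to 28)."""
    total = anchor.year * 12 + (anchor.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(anchor.day, 28))


def window_features(claims: list[dict], index_date: date, window_months: int) -> dict:
    """Summarize claims in [index_date - window_months, index_date)."""
    start = months_before(index_date, window_months)
    in_window = [c for c in claims if start <= c["service_date"] < index_date]
    return {
        "n_claims": len(in_window),
        "total_cost": sum(c["cost"] for c in in_window),
        "n_distinct_codes": len({c["code"] for c in in_window}),
    }


# Hypothetical claims for one patient; the codes are placeholders.
toy_claims = [
    {"service_date": date(2021, 6, 1), "code": "CODE_A", "cost": 100.0},
    {"service_date": date(2022, 3, 10), "code": "CODE_B", "cost": 250.0},
    {"service_date": date(2023, 11, 2), "code": "CODE_A", "cost": 80.0},
]
index_date = date(2024, 1, 15)

features_24 = window_features(toy_claims, index_date, 24)
features_36 = window_features(toy_claims, index_date, 36)
print(features_24)
print(features_36)
```

A longer window sees more history but requires patients to have been enrolled for longer, which is one reason to compare several window lengths.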

Vol. 2024, pp. 664-673. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099416/pdf/
Citations: 0
Development of a Human Evaluation Framework and Correlation with Automated Metrics for Natural Language Generation of Medical Diagnoses.
Pub Date : 2025-05-22 eCollection Date: 2024-01-01
Emma Croxford, Yanjun Gao, Brian Patterson, Daniel To, Samuel Tesch, Dmitriy Dligach, Anoop Mayampurath, Matthew M Churpek, Majid Afshar

In the evolving landscape of clinical Natural Language Generation (NLG), assessing abstractive text quality remains challenging, as existing methods often overlook generative task complexities. This work aimed to examine the current state of automated evaluation metrics in NLG in healthcare. To have a robust and well-validated baseline with which to examine the alignment of these metrics, we created a comprehensive human evaluation framework. Employing ChatGPT-3.5-turbo generative output, we correlated human judgments with each metric. None of the metrics demonstrated high alignment; however, the SapBERT score, a Unified Medical Language System (UMLS)-based metric, showed the best results. This underscores the importance of incorporating domain-specific knowledge into evaluation efforts. Our work reveals the deficiency in quality evaluations for generated text and introduces our comprehensive human evaluation framework as a baseline. Future efforts should prioritize integrating medical knowledge databases to enhance the alignment of automated metrics, particularly focusing on refining the SapBERT score for improved assessments.
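
Correlating human judgments with an automated metric can be sketched as a rank correlation over paired scores. The abstract does not state which coefficient was used, so Spearman's rho here is an assumption, and the scores are made-up toy values.

```python
from __future__ import annotations

import math


def ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Toy scores for five generated outputs (not data from the study).
human_scores = [1, 2, 3, 4, 5]
metric_aligned = [0.10, 0.15, 0.40, 0.42, 0.90]   # same ordering as humans
metric_reversed = [0.90, 0.42, 0.40, 0.15, 0.10]  # opposite ordering

rho_aligned = spearman(human_scores, metric_aligned)
rho_reversed = spearman(human_scores, metric_reversed)
print(round(rho_aligned, 3), round(rho_reversed, 3))
```

A rank-based coefficient asks only whether the metric orders outputs the way human raters do, which suits ordinal human ratings.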

Vol. 2024, pp. 309-318. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099413/pdf/
Citations: 0
Large Language Models Struggle in Token-Level Clinical Named Entity Recognition.
Pub Date: 2025-05-22. eCollection Date: 2024-01-01
Qiuhao Lu, Rui Li, Andrew Wen, Jinlian Wang, Liwei Wang, Hongfang Liu

Large Language Models (LLMs) have revolutionized various sectors, including healthcare, where they are employed in diverse applications. Their utility is particularly significant in the context of rare diseases, where data scarcity, complexity, and specificity pose considerable challenges. In the clinical domain, Named Entity Recognition (NER) stands out as an essential task, and it plays a crucial role in extracting relevant information from clinical texts. Despite the promise of LLMs, current research mostly concentrates on document-level NER, identifying entities in a more general context across entire documents, without extracting their precise location. Additionally, efforts have been directed towards adapting ChatGPT for token-level NER. However, there is a significant research gap when it comes to employing token-level NER for clinical texts, especially with the use of local open-source LLMs. This study aims to bridge this gap by investigating the effectiveness of both proprietary and local LLMs in token-level clinical NER. Essentially, we delve into the capabilities of these models through a series of experiments involving zero-shot prompting, few-shot prompting, retrieval-augmented generation (RAG), and instruction fine-tuning. Our exploration reveals the inherent challenges LLMs face in token-level NER, particularly in the context of rare diseases, and suggests possible improvements for their application in healthcare. This research contributes to narrowing a significant gap in healthcare informatics and offers insights that could lead to a more refined application of LLMs in the healthcare sector.
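
The distinction the abstract draws between document-level and token-level NER can be illustrated with a small sketch. The sentence, the entity labels (`DISEASE`, `SYMPTOM`), and the helper names below are hypothetical and are not taken from the paper or its datasets; the BIO tagging scheme is a common convention for token-level NER and is assumed here rather than confirmed by the abstract.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start token index, end index exclusive, label)

def bio_tags(tokens: List[str], spans: List[Span]) -> List[str]:
    """Token-level output: one BIO tag per token, so positions are explicit."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags

def document_level(tokens: List[str], spans: List[Span]) -> List[Tuple[str, str]]:
    """Document-level output: which entities occur, with no position information."""
    return sorted({(" ".join(tokens[s:e]), label) for s, e, label in spans})

# Hypothetical example sentence and annotations (illustration only).
tokens = ["Patient", "has", "cystic", "fibrosis", "and", "chronic", "cough", "."]
spans = [(2, 4, "DISEASE"), (5, 7, "SYMPTOM")]

token_output = bio_tags(tokens, spans)
doc_output = document_level(tokens, spans)
print(list(zip(tokens, token_output)))
print(doc_output)
```

A token-level system has to produce the full tag sequence in `token_output`, aligned with the input tokens, whereas a document-level system only needs the unordered mentions in `doc_output`. This alignment requirement is one plausible reason the token-level task is harder for generative models.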

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2024, pp. 748-757. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099373/pdf/
Citations: 0
Enhancement of Fairness in AI for Chest X-ray Classification.
Pub Date: 2025-05-22. eCollection Date: 2024-01-01
Nicholas J Jackson, Chao Yan, Bradley A Malin

The use of artificial intelligence (AI) in medicine has shown promise to improve the quality of healthcare decisions. However, AI can be biased in a manner that produces unfair predictions for certain demographic subgroups. In MIMIC-CXR, a publicly available dataset of over 300,000 chest X-ray images, diagnostic AI has been shown to have a higher false negative rate for racial minorities. We evaluated the capacity of synthetic data augmentation, oversampling, and demographic-based corrections to enhance the fairness of AI predictions. We show that adjusting unfair predictions for demographic attributes, such as race, is ineffective at improving fairness or predictive performance. However, using oversampling and synthetic data augmentation to modify disease prevalence reduced such disparities by 74.7% and 10.6%, respectively. Moreover, such fairness gains were accomplished without reduction in performance (95% CI AUC: [0.816, 0.820] versus [0.810, 0.819] versus [0.817, 0.821] for baseline, oversampling, and augmentation, respectively).
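
The quantities this abstract reports (subgroup false negative rates and a percentage reduction in disparity) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not define its disparity measure, so the gap between the largest and smallest subgroup false negative rate is assumed here, and the labels, predictions, and group names are toy values, not data from MIMIC-CXR or the paper.

```python
from typing import List, Sequence

def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FNR = FN / (FN + TP), computed over the truly positive cases only."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("no positive cases in this group")
    return fn / (fn + tp)

def fnr_gap(y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[str]) -> float:
    """Largest minus smallest subgroup FNR (an assumed disparity measure)."""
    rates: List[float] = []
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        rates.append(false_negative_rate([y_true[i] for i in idx],
                                         [y_pred[i] for i in idx]))
    return max(rates) - min(rates)

def relative_reduction(before: float, after: float) -> float:
    """Percentage by which a disparity shrinks relative to its baseline value."""
    return 100.0 * (before - after) / before

# Toy data (hypothetical): eight positive cases, four in each subgroup.
groups = ["A"] * 4 + ["B"] * 4
y_true = [1] * 8
baseline_pred = [1, 1, 1, 1, 1, 0, 0, 1]   # group B misses 2 of 4 positives
adjusted_pred = [1, 1, 1, 1, 1, 1, 0, 1]   # group B misses 1 of 4 positives

gap_before = fnr_gap(y_true, baseline_pred, groups)
gap_after = fnr_gap(y_true, adjusted_pred, groups)
print(gap_before, gap_after, relative_reduction(gap_before, gap_after))
```

In this toy case the gap falls from 0.5 to 0.25, a 50% relative reduction; the 74.7% and 10.6% figures in the abstract would be read the same way if the authors' measure matches the assumed one.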

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2024, pp. 551-560. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099404/pdf/
Citations: 0