
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science: latest articles

Electronic Phenotyping of Urinary Tract Infections as a Silver Standard Label for Machine Learning.
Stephen P Ma, Ebru Hosgur, Conor K Corbin, Ivan Lopez, Amy Chang, Jonathan H Chen

This study explored the efficacy of electronic phenotyping in data labeling for machine learning with a focus on urinary tract infections (UTIs). We contrasted labels from electronic phenotyping against previously published labels such as urine culture positivity. In comparison, electronic phenotyping showed the potential to enhance specificity in UTI labeling while maintaining similar sensitivity and was easily scaled for application to a large dataset suitable for machine learning, which we used to train and validate a machine learning model. Electronic phenotyping offers a valuable method for machine learning label generation in healthcare, with potential benefits for patient care and antimicrobial stewardship. Further research will expand its application and optimize techniques for increased performance.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141812/pdf/
Citations: 0
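
The comparison described in the UTI labeling abstract above comes down to scoring each candidate label set against a reference label by sensitivity and specificity. A minimal Python sketch of that calculation follows; the function name, the toy label vectors, and the choice of reference are illustrative assumptions, not the authors' code or data.

```python
def sensitivity_specificity(reference, candidate):
    """Return (sensitivity, specificity) of candidate labels against reference labels.

    Both arguments are equal-length sequences of 0/1 labels.
    """
    if len(reference) != len(candidate):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(reference, candidate))
    tp = sum(1 for r, c in pairs if r and c)          # true positives
    tn = sum(1 for r, c in pairs if not r and not c)  # true negatives
    fn = sum(1 for r, c in pairs if r and not c)      # false negatives
    fp = sum(1 for r, c in pairs if not r and c)      # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels (1 = UTI, 0 = no UTI); NOT data from the study.
reference = [1, 1, 1, 0, 0, 0, 0, 0]         # e.g. a manually reviewed label
culture_positive = [1, 1, 1, 1, 1, 0, 0, 0]  # a looser label with more false positives
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]         # a stricter rule-based label

print(sensitivity_specificity(reference, culture_positive))
print(sensitivity_specificity(reference, phenotype))
```

In this toy setup both candidate labels catch every reference-positive case, and the stricter label produces fewer false positives, which is the pattern the abstract reports as similar sensitivity with potentially higher specificity.
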
Effects of Added Emphasis and Pause in Audio Delivery of Health Information.
Arif Ahmed, Gondy Leroy, Stephen A Rains, Philip Harber, David Kauchak, Prosanta Barai

Health literacy is crucial to supporting good health and is a major national goal. Audio delivery of information is becoming more popular for informing oneself. In this study, we evaluate the effect of audio enhancements in the form of information emphasis and pauses with health texts of varying difficulty and we measure health information comprehension and retention. We produced audio snippets from difficult and easy text and conducted the study on Amazon Mechanical Turk (AMT). Our findings suggest that emphasis matters for both information comprehension and retention. When there is no added pause, emphasizing significant information can lower the perceived difficulty for difficult and easy texts. Comprehension is higher (54%) with correctly placed emphasis for the difficult texts compared to not adding emphasis (50%). Adding a pause lowers perceived difficulty and can improve retention but adversely affects information comprehension.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141844/pdf/
Citations: 0
Investigating Cross-Domain Binary Relation Classification in Biomedical Natural Language Processing.
Alberto Purpura, Natasha Mulligan, Uri Kartoun, Eileen Koski, Vibha Anand, Joao Bettencourt-Silva

This paper addresses the challenge of binary relation classification in biomedical Natural Language Processing (NLP), focusing on diverse domains including gene-disease associations, compound-protein interactions, and social determinants of health (SDOH). We evaluate different approaches, including fine-tuning Bidirectional Encoder Representations from Transformers (BERT) models and generative Large Language Models (LLMs), and examine their performance in zero- and few-shot settings. We also introduce a novel dataset of biomedical text annotated with social and clinical entities to facilitate research into relation classification. Our results underscore the continued complexity of this task for both humans and models. BERT-based models trained on domain-specific data excelled in certain domains and achieved comparable performance and generalization power to generative LLMs in others. Despite these encouraging results, these models are still far from achieving human-level performance. We also highlight the importance of high-quality training data and domain-specific fine-tuning for the performance of all the considered models.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141837/pdf/
Citations: 0
Pathophysiological Features in Electronic Medical Records Sustain Model Performance under Temporal Dataset Shift.
Raphael Brosula, Conor K Corbin, Jonathan H Chen

Access to real-world data streams like electronic medical records (EMRs) has accelerated the development of supervised machine learning (ML) models for clinical applications. However, few studies investigate the differential impact of particular features in the EMR on model performance under temporal dataset shift. To explain how features in the EMR impact models over time, this study aggregates features into feature groups by their source (e.g. medication orders, diagnosis codes and lab results) and feature categories based on their reflection of patient pathophysiology or healthcare processes. We adapt Shapley values to explain feature groups' and feature categories' marginal contribution to initial and sustained model performance. We investigate three standard clinical prediction tasks and find that while feature contributions to initial performance differ across tasks, pathophysiological features help mitigate temporal discrimination deterioration. These results provide interpretable insights on how specific feature groups contribute to model performance and robustness to temporal dataset shift.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141811/pdf/
Citations: 0
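
The temporal dataset shift abstract above attributes model performance to feature groups using Shapley values. The abstract does not spell out how the authors adapted the method, so the following is only a sketch of the standard exact Shapley computation over a handful of groups, assuming a performance score is available for every subset of groups. The group names and AUROC numbers are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, value):
    """Exact Shapley value of each feature group.

    `groups` is a list of group names; `value` is a dict mapping a frozenset
    of group names to the model's performance when trained on those groups.
    """
    n = len(groups)
    shapley = {}
    for g in groups:
        others = [h for h in groups if h != g]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1 drawn from the other groups
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal gain from adding group g to coalition s.
                total += weight * (value[s | {g}] - value[s])
        shapley[g] = total
    return shapley


# Hypothetical AUROC for a model trained on each subset of three feature groups.
groups = ["labs", "medications", "diagnoses"]
auroc = {
    frozenset(): 0.50,
    frozenset({"labs"}): 0.70,
    frozenset({"medications"}): 0.60,
    frozenset({"diagnoses"}): 0.65,
    frozenset({"labs", "medications"}): 0.74,
    frozenset({"labs", "diagnoses"}): 0.78,
    frozenset({"medications", "diagnoses"}): 0.70,
    frozenset({"labs", "medications", "diagnoses"}): 0.80,
}

phi = group_shapley(groups, auroc)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.4f}")
```

The contributions sum to the gap between the full model and the empty baseline, which is the property that makes Shapley values convenient for splitting performance among groups. Exact enumeration needs one score per subset, so it is only practical when the number of groups is small.
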
Selection and Implementation of Virtual Scribe Solutions to Reduce Documentation Burden: A Mixed Methods Pilot.
Carly Hudelson, Melissa A Gunderson, Debbie Pestka, Tori Christiaansen, Bret Stotka, Lynn Kissock, Rebecca Markowitz, Sameer Badlani, Genevieve B Melton

Electronic health record (EHR) documentation is a leading reason for clinician burnout. While technology-enabled solutions like virtual and digital scribes aim to improve this, there is limited evidence of their effectiveness and minimal guidance for healthcare systems around solution selection and implementation. A transdisciplinary approach, informed by clinician interviews and other considerations, was used to evaluate and select a virtual scribe solution to pilot in a rapid iterative sprint over 12 weeks. Surveys, interviews, and EHR metadata were analyzed over a staggered 30-day implementation with live and asynchronous virtual scribe solutions. Among 16 pilot clinicians, documentation burden metrics decreased for some but not all. Some clinicians had highly positive comments, and others had concerns regarding scribe training and quality. Our findings demonstrate that virtual scribes may reduce documentation burden for some clinicians and describe a method for a collaborative and iterative technology selection process for digital tools in practice.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141854/pdf/
Citations: 0
Towards Predicting Smoking Events for Just-in-time Interventions.
Hang Yu, Michael Kotlyar, Paul Thuras, Sheena Dufresne, Serguei Vs Pakhomov

Consumer-grade heart rate (HR) sensors are widely used for tracking physical and mental health status. We explore the feasibility of using the Polar H10 electrocardiogram (ECG) sensor to detect and predict cigarette smoking events in naturalistic settings with several machine learning approaches. We have collected and analyzed data for 28 participants observed over a two-week period. We found that using bidirectional long short-term memory (BiLSTM) with ECG-derived and GPS location input features yielded the highest mean accuracy of 69% for smoking event detection. For predicting smoking events, the highest accuracy of 67% was achieved using the fine-tuned LSTM approach. We also found a significant correlation between accuracy and the number of smoking events available from each participant. Our findings indicate that both detection and prediction of smoking events are feasible but require an individualized approach to training the models, particularly for prediction.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141818/pdf/
Citations: 0
Utilizing Large Language Models to Generate Synthetic Data to Increase the Performance of BERT-Based Neural Networks.
Chancellor R Woolsey, Prakash Bisht, Joshua Rothman, Gondy Leroy

An important problem impacting healthcare is the lack of available experts. Machine learning (ML) models may help resolve this by aiding in screening and diagnosing patients. However, creating large, representative datasets to train models is expensive. We evaluated large language models (LLMs) for data creation. Using Autism Spectrum Disorders (ASD), we prompted GPT-3.5 and GPT-4 to generate 4,200 synthetic examples of behaviors to augment existing medical observations. Our goal is to label behaviors corresponding to autism criteria and improve model accuracy with synthetic training data. We used a BERT classifier pretrained on biomedical literature to assess differences in performance between models. A random sample (N=140) from the LLM-generated data was also evaluated by a clinician and found to contain 83% correct behavioral example-label pairs. Augmenting the dataset increased recall by 13% but decreased precision by 16%. Future work will investigate how different synthetic data characteristics affect ML outcomes.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141799/pdf/
Citations: 0
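
The synthetic data abstract above reports that augmentation raised recall while lowering precision. The sketch below shows how the two metrics are computed and how they can move in opposite directions. It assumes a simplified single-label binary task, and the label vectors are hypothetical toy values rather than the study's predictions.

```python
def precision_recall(true_labels, predicted_labels):
    """Return (precision, recall) for equal-length sequences of 0/1 labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)      # correctly flagged behaviors
    fp = sum(1 for t, p in pairs if not t and p)  # flagged but not true
    fn = sum(1 for t, p in pairs if t and not p)  # true but missed
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical toy predictions; NOT results from the study.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
baseline_pred = [1, 1, 0, 0, 0, 0, 0, 0]   # conservative model: misses positives
augmented_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # finds more positives, adds a false alarm

print(precision_recall(true_labels, baseline_pred))
print(precision_recall(true_labels, augmented_pred))
```

A model that flags more examples after augmentation tends to recover more true cases (higher recall) at the cost of more false alarms (lower precision), which matches the direction of the trade-off described in the abstract.
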
Automatic Population of the Case Report Forms for an International Multifactorial Adaptive Platform Trial Amid the COVID-19 Pandemic.
Andrew J King, Lisa Higgins, Carly Au, Salim Malakouti, Edvin Music, Kyle Kalchthaler, Gilles Clermont, William Garrard, David T Huang, Bryan J McVerry, Christopher W Seymour, Kelsey Linstrum, Amanda McNamara, Cameron Green, India Loar, Tracey Roberts, Oscar Marroquin, Derek C Angus, Christopher M Horvat

Objectives: To automatically populate the case report forms (CRFs) for an international, pragmatic, multifactorial, response-adaptive, Bayesian COVID-19 platform trial.

Methods: The locations of focus included 27 hospitals and 2 large electronic health record (EHR) instances (1 Cerner Millennium and 1 Epic) that are part of the same health system in the United States. This paper describes our efforts to use EHR data to automatically populate four of the trial's forms: baseline, daily, discharge, and response-adaptive randomization.

Results: Between April 2020 and May 2022, 417 patients from the UPMC health system were enrolled in the trial. A MySQL-based extract, transform, and load pipeline automatically populated 499 of 526 CRF variables. The populated forms were statistically and manually reviewed and then reported to the trial's international data coordinating center.

Conclusions: We accomplished automatic population of CRFs in a large platform trial and made recommendations for improving this process for future trials.

Published: 2024-05-31. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141839/pdf/
Citations: 0
Designing a Consumer-centric Care Management Program by Prioritizing Interventions Using Deep Learning Causal Inference.
Tianhao Li, Haoyun Feng, Vikram Bandugula, Ying Ding

Care management is a team-based and patient-centered approach to reduce health risks and improve outcomes for managed populations. Post Discharge Management (PDM) is an important care management program at Elevance Health, which is aimed at reducing 30-day readmission risk for recently discharged patients. The current PDM program suffers from low engagement. When assigning interventions to patients, case managers choose the interventions to be conducted in each call only based on their limited personal experiences. In this work, we use deep learning causal inference to analyze the impact of interventions conducted on the first call on consumer engagement in the PDM program, which provides a reliable reference for case managers to select interventions to promote consumer engagement. With three experiments cross-validating the results, our results show that consumers will engage more in the program if the case manager conducts interventions that require more nurse-patient interactions on the first call. On the other hand, conducting less interactive and more technical interventions on the first call leads to relatively poor consumer engagement. These findings correspond to the clinical sense of experienced nurses and are consistent with previous findings in patient engagement in hospital settings.

Published: 2024-05-31. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141861/pdf/
Citations: 0
Development of a Study Protocol for Evaluation of a Novel Measure to Incorporate Information Freshness into Network Analysis of Online Resources for COVID-19.
Meredith Abrams, Audrey Wong, Hanae El Kholti, Yunro Chung, Lisa Armitige, Dongwen Wang

We proposed a novel measure, Degree of Connectivity with Integration of Freshness (DCIF), to incorporate information freshness into the analysis of online resource networks. We conducted a pilot study applying this measure to a dataset of online information resources related to COVID-19 risk assessment. Across the 52 nodes, we found a statistically significant difference between the numerical values of DCIF and those of the traditional structural measure, Degree of Connectivity (DC). Manual reviews of 18 selected nodes showed that DCIF outperformed DC in 11 of them, suggesting that the proposed measure is promising. We finalized the manual review protocol based on the pilot and started a full-scale study. The proposed measure has the potential to provide a quantitative assessment of information freshness for timely and effective dissemination of clinical evidence. Further research is required to address the limitations of this pilot study and to examine the generalizability of the findings.

Published: 2024-05-31. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141794/pdf/
Citations: 0