Yiye Zhang, Rochelle Joly, Ashley N Beecy, Samen Principe, Sujit Satpathy, Anatoly Gore, Tom Reilly, Mitchel Lang, Nagi Sathi, Carlos Uy, Matt Adams, Mark Israel
This study describes the deployment process of an AI-driven clinical decision support (CDS) system to support postpartum depression (PPD) prevention, diagnosis, and management. Central to this CDS is an L2-regularized logistic regression model trained on electronic health record (EHR) data at an academic medical center, and subsequently refined through a broader dataset from a consortium to ensure its generalizability and fairness. The deployment architecture leveraged Microsoft Azure to facilitate a scalable, secure, and efficient operational framework. We used Fast Healthcare Interoperability Resources (FHIR) for data extraction and ingestion between the two systems. Continuous Integration/Continuous Deployment pipelines automated the deployment and ongoing maintenance, ensuring the system's adaptability to evolving clinical data. Alongside the technical preparation, we focused on a seamless integration of the CDS within the clinical workflow, presenting risk assessment directly within the clinician schedule and providing options for subsequent actions. The developed CDS is expected to drive a PPD clinical pathway to enable efficient PPD risk management.
{"title":"Implementation of a Machine Learning Risk Prediction Model for Postpartum Depression in the Electronic Health Records.","authors":"Yiye Zhang, Rochelle Joly, Ashley N Beecy, Samen Principe, Sujit Satpathy, Anatoly Gore, Tom Reilly, Mitchel Lang, Nagi Sathi, Carlos Uy, Matt Adams, Mark Israel","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"1057-1066"},"publicationDate":"2024-10-21","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11497630/pdf/","platform":"Semanticscholar","paperid":"142513963","PMCID":"OA"}
Extracting valuable insights from unstructured clinical narrative reports is a challenging yet crucial task in the healthcare domain, as it allows healthcare workers to treat patients more efficiently and improves the overall standard of care. We employ ChatGPT, a large language model (LLM), and compare its performance to that of manual reviewers. The review focuses on four key conditions: family history of heart disease, depression, heavy smoking, and cancer. The evaluation of a diverse sample of History and Physical (H&P) notes demonstrates ChatGPT's remarkable capabilities. Notably, it exhibits exemplary results in sensitivity for depression and heavy smoking and in specificity for cancer. We identify areas for improvement as well, particularly in capturing nuanced semantic information related to family history of heart disease and cancer. With further investigation, ChatGPT holds substantial potential for advancements in medical information extraction.
{"title":"Large Language Models for Efficient Medical Information Extraction.","authors":"Navya Bhagat, Olivia Mackey, Adam Wilcox","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"509-514"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141860/pdf/","platform":"Semanticscholar","paperid":"141201184","PMCID":"OA"}
The purpose of this study was to characterize opioid and antimicrobial prescribing among uninsured patients seeking emergency medical care and to build predictive machine learning models. Uninsured patients were less likely to receive an opioid medication, more likely to receive non-opioid alternatives, and less likely to receive an antimicrobial prescription. The most impactful contributing factors were housing status, comorbidities, and recidivism.
{"title":"Opioid and Antimicrobial Prescription Patterns During Emergency Medicine Encounters Among Uninsured Patients.","authors":"Michael A Grasso, Anantaa Kotal, Anupam Joshi","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"190"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141801/pdf/","platform":"Semanticscholar","paperid":"141201194","PMCID":"OA"}
Victor M Murcia, Vinod Aggarwal, Nikhil Pesaladinne, Ram Thammineni, Nhan Do, Gil Alterovitz, Rafael B Fricks
Clinical trials are critical to many medical advances; however, recruiting patients remains a persistent obstacle. Automated clinical trial matching could expedite recruitment across all trial phases. We detail our initial efforts towards automating the matching process by linking realistic synthetic electronic health records to clinical trial eligibility criteria using natural language processing methods. We also demonstrate how the Sørensen-Dice Index can be adapted to quantify match quality between a patient and a clinical trial.
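As a rough illustration of the match-quality idea in this abstract, the sketch below computes the standard Sørensen-Dice index between two sets. The abstract does not say how the authors adapted the index, so the set-based form, the function name `dice_index`, and the toy patient and trial features are all assumptions made for illustration only.

```python
def dice_index(a: set, b: set) -> float:
    """Standard Sørensen-Dice index: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        # Convention chosen here: two empty sets count as a perfect match.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical features extracted from a synthetic patient record
patient = {"type 2 diabetes", "hypertension", "age>=18"}
# Hypothetical features extracted from a trial's eligibility criteria
criteria = {"type 2 diabetes", "age>=18", "hba1c>7"}

score = dice_index(patient, criteria)
print(f"match quality: {score:.3f}")
```

A score of 1 means the two feature sets are identical and 0 means they share nothing; a real matcher would also need to treat exclusion criteria differently from inclusion criteria, which this sketch does not attempt.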
{"title":"Automating Clinical Trial Matches Via Natural Language Processing of Synthetic Electronic Health Records and Clinical Trial Eligibility Criteria.","authors":"Victor M Murcia, Vinod Aggarwal, Nikhil Pesaladinne, Ram Thammineni, Nhan Do, Gil Alterovitz, Rafael B Fricks","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"125-134"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141802/pdf/","platform":"Semanticscholar","paperid":"141201516","PMCID":"OA"}
Researchers estimate that the number of dementia patients will triple by 2050 [1]. Dementia seldom occurs in isolation; it is frequently accompanied by other health conditions [2]. The coexistence of conditions further complicates the management of dementia. In this study, we applied association rule mining to analyze National Alzheimer's Coordinating Center (NACC) data. First, we completed a literature review on the use of association rules, heatmaps, and network analysis to detect and visualize comorbidities. Then, we conducted a secondary data analysis on the NACC data using association rule mining. This algorithm uncovers associations among comorbidities that are diagnosed together in patients who have Alzheimer's disease and related dementias (ADRD). For these patients, the algorithm also provides the probability of a patient developing another comorbidity given the diagnosis of an associated comorbidity. These findings can enhance treatment planning, advance research on high-association diseases, and ultimately improve healthcare for dementia patients.
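To make the association-rule terminology concrete, here is a minimal sketch that derives pairwise rules with their support and confidence from a handful of made-up patient records. It is not the authors' pipeline: the function name `pairwise_rules`, the thresholds, and the toy diagnoses are assumptions, and a full analysis would normally use an Apriori-style search over larger itemsets.

```python
from itertools import combinations


def pairwise_rules(records, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for condition pairs."""
    n = len(records)

    def support(items):
        # Fraction of patients whose record contains every condition in `items`.
        return sum(items <= r for r in records) / n

    conditions = sorted(set().union(*records))
    rules = []
    for a, b in combinations(conditions, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # Confidence estimates P(consequent | antecedent) from co-occurrence.
            confidence = pair_support / support({antecedent})
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


# Hypothetical records, one set of diagnosed conditions per patient
records = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "depression"},
    {"hypertension"},
    {"depression"},
    {"hypertension", "diabetes"},
]

for antecedent, consequent, sup, conf in pairwise_rules(records):
    print(f"{antecedent} -> {consequent}: support={sup:.2f}, confidence={conf:.2f}")
```

Confidence here is a co-occurrence frequency in cross-sectional data, so reading it as the probability of later developing the second condition requires additional assumptions about timing.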
{"title":"Detecting Multimorbidity Patterns with Association Rule Mining in Patients with Alzheimer's Disease and Related Dementias.","authors":"Razan A El Khalifa, Pui Ying Yew, Chih-Lin Chi","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"525-534"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141815/pdf/","platform":"Semanticscholar","paperid":"141200313","PMCID":"OA"}
Jiancheng Ye, Jiarui Hai, Jiacheng Song, Zidan Wang
This study proposes a novel approach for enhancing clinical prediction models by combining structured and unstructured data through multimodal data fusion. We present a comprehensive framework that integrates multimodal data sources, including textual clinical notes, structured electronic health records (EHRs), and relevant clinical data from National Electronic Injury Surveillance System (NEISS) datasets. We propose a novel hybrid fusion method, which incorporates a state-of-the-art pre-trained language model, to integrate unstructured clinical text with structured EHR data and other multimodal sources, thereby capturing a more comprehensive representation of patient information. The experimental results demonstrate that the hybrid fusion approach significantly improved the performance of clinical prediction models compared to traditional fusion frameworks and unimodal models that rely on structured data or text alone. The proposed hybrid fusion system with a RoBERTa language encoder achieved the best prediction of the Top 1 injury with an accuracy of 75.00% and of the Top 3 injuries with an accuracy of 93.54%. Our study highlights the potential of integrating natural language processing (NLP) techniques with multimodal data fusion to enhance the performance of clinical prediction models. By leveraging the rich information present in clinical text and combining it with structured EHR data, the proposed approach can improve the accuracy and robustness of predictive models. The approach has the potential to advance clinical decision support systems, enable personalized medicine, and facilitate evidence-based health care practices. Future research can further explore the application of this hybrid fusion approach in real-world clinical settings and investigate its impact on improving patient outcomes.
{"title":"Multimodal Data Hybrid Fusion and Natural Language Processing for Clinical Prediction Models.","authors":"Jiancheng Ye, Jiarui Hai, Jiacheng Song, Zidan Wang","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"191-200"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141806/pdf/","platform":"Semanticscholar","paperid":"141201192","PMCID":"OA"}
In lung cancer treatment, where genetic heterogeneity presents formidable challenges, precision oncology demands an exacting approach to identify and hierarchically sort clinically significant somatic mutations. Current Next-Generation Sequencing (NGS) data filtering pipelines, while utilizing various external databases for mutation screening, often fall short of the comprehensive integration and flexibility needed to keep pace with the evolving landscape of clinical data. Our study introduces an NGS data filtering system that not only aggregates but also combines diverse data sources, encompassing genetic variants, gene functions, clinical evidence, and an extensive body of literature. The system is distinguished by an algorithm that performs a rigorous, multi-tiered filtration process. This allows for the efficient prioritization of 420 genes and 1,193 variants from large datasets, with a particular focus on 80 variants demonstrating high clinical actionability. These variants have been aligned with FDA approvals, NCCN guidelines, and thoroughly reviewed literature, thereby equipping oncologists with a refined set of options for targeted therapy decisions. The innovation of our system lies in its dynamic integration framework and its algorithm, tailored to emphasize clinical utility and actionability, a nuanced approach often lacking in existing methodologies. Our validation on real-world lung adenocarcinoma NGS datasets has shown not only enhanced efficiency in identifying genetic targets but also the potential to streamline clinical workflows, thus advancing precision oncology. Planned future enhancements include expanding the range of integrated data types and developing a user-friendly interface, aiming to facilitate easier access to data and promote collaborative efforts in tailoring cancer treatments.
{"title":"Prioritizing Clinically Significant Lung Cancer Somatic Mutations for Targeted Therapy Through Efficient NGS Data Filtering System.","authors":"Jinlian Wang, Hui Li, Hongfang Liu","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"305-313"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141846/pdf/","platform":"Semanticscholar","paperid":"141201206","PMCID":"OA"}
Joseph Finkelstein, Wanting Cui, Jeffrey P Ferraro, Kensaku Kawamoto
The goal of this study was to analyze diagnostic discrepancies between emergency department (ED) and hospital discharge diagnoses in patients with congestive heart failure admitted to the ED. Using a synthetic dataset from the Department of Veterans Affairs, the patients' primary diagnoses were compared at two levels: diagnostic category and body system. With 12,621 patients and 24,235 admission cases, the study found a 58% mismatch rate at the category level, which was reduced to 30% at the body system level. Diagnostic categories associated with higher levels of mismatch included aplastic anemia, pneumonia, and bacterial infections. In contrast, diagnostic categories associated with lower levels of mismatch included alcohol-related disorders, COVID-19, cardiac dysrhythmias, and gastrointestinal hemorrhage. Further investigation revealed that diagnostic mismatches are associated with longer hospital stays and higher mortality rates. These findings highlight the importance of reducing diagnostic uncertainty, particularly in specific diagnostic categories and body systems, to improve patient care following ED admission.
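The two-level comparison described in this abstract can be sketched as follows. This is only an illustration of why the mismatch rate falls when diagnoses are grouped more coarsely: the `BODY_SYSTEM` mapping, the four toy admissions, and the function name `mismatch_rate` are hypothetical and are not taken from the study's data or coding scheme.

```python
# Hypothetical mapping from diagnostic category to body system
BODY_SYSTEM = {
    "pneumonia": "respiratory",
    "copd": "respiratory",
    "heart failure": "circulatory",
    "cardiac dysrhythmias": "circulatory",
}

# Hypothetical admissions: (ED primary diagnosis, discharge primary diagnosis)
admissions = [
    ("pneumonia", "heart failure"),
    ("copd", "pneumonia"),
    ("cardiac dysrhythmias", "cardiac dysrhythmias"),
    ("heart failure", "cardiac dysrhythmias"),
]


def mismatch_rate(pairs):
    """Fraction of admissions whose ED and discharge labels differ."""
    return sum(ed != discharge for ed, discharge in pairs) / len(pairs)


category_rate = mismatch_rate(admissions)
system_rate = mismatch_rate(
    [(BODY_SYSTEM[ed], BODY_SYSTEM[dc]) for ed, dc in admissions]
)
print(f"category-level mismatch: {category_rate:.0%}")
print(f"body-system-level mismatch: {system_rate:.0%}")
```

Two of the three category-level mismatches in this toy data stay within one body system, so they disappear at the coarser level, which mirrors the direction of the 58% versus 30% result reported above.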
{"title":"Association of Diagnostic Discrepancy with Length of Stay and Mortality in Congestive Heart Failure Patients Admitted to the Emergency Department.","authors":"Joseph Finkelstein, Wanting Cui, Jeffrey P Ferraro, Kensaku Kawamoto","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"155-161"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141848/pdf/","platform":"Semanticscholar","paperid":"141201478","PMCID":"OA"}
Yasmmin C Martins, Praphulla Ms Bhawsar, Jeya B Balasubramanian, Daniel Russ, Wendy Sw Wong, Wolfgang Maass, Jonas S Almeida
Motivation: The proliferation of genetic testing and consumer genomics represents a logistic challenge to the personalized use of GWAS data in VCF format. Specifically, the challenge is retrieving target genetic variation from large compressed files filled with unrelated variation information. Compounding the data traversal challenge, privacy-sensitive VCF files are typically managed as large stand-alone single files (no companion index file) composed of variable-sized compressed chunks, hosted in consumer-facing environments with no native support for hosted execution. Results: A portable JavaScript module was developed to support in-browser fetching of partial content using byte-range requests. This includes on-the-fly decompression of irregularly positioned compressed chunks, coupled with a binary search algorithm that iteratively identifies chromosome-position ranges. The in-browser zero-footprint solution (no downloads, no installations) enables the interoperability, reusability, and user-facing governance advanced by the FAIR principles for stewardship of scientific data. Availability - https://episphere.github.io/vcf, including supplementary material.
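The combination of byte-range reads and binary search can be illustrated with a deliberately simplified sketch. It is written in Python rather than the module's JavaScript, it simulates range requests against an in-memory byte string instead of issuing HTTP requests, and it assumes fixed-width uncompressed records, whereas the real files consist of variable-sized compressed chunks. `RangeReader`, `make_file`, and `find_first_at_or_after` are hypothetical names, not part of the published module.

```python
RECORD = 8  # hypothetical fixed record width in bytes


def make_file(positions):
    """Build a toy 'remote file' holding sorted positions as 8-byte integers."""
    return b"".join(p.to_bytes(RECORD, "big") for p in positions)


class RangeReader:
    """Stands in for a server that answers 'Range: bytes=start-end' requests."""

    def __init__(self, blob):
        self.blob = blob
        self.requests = 0

    def read(self, start, end):
        # Both ends are inclusive, as in an HTTP Range header.
        self.requests += 1
        return self.blob[start:end + 1]


def find_first_at_or_after(reader, n_records, target):
    """Binary search for the first record whose position is >= target."""
    lo, hi = 0, n_records
    while lo < hi:
        mid = (lo + hi) // 2
        raw = reader.read(mid * RECORD, mid * RECORD + RECORD - 1)
        if int.from_bytes(raw, "big") < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


positions = [100, 250, 400, 900, 1500, 3200, 4100, 8000]
reader = RangeReader(make_file(positions))
index = find_first_at_or_after(reader, len(positions), 1000)
print(f"first record at or after 1000: index {index}, "
      f"found with {reader.requests} range reads")
```

The point of the design is that only a logarithmic number of small reads is needed to locate a position, so the bulk of a large file never has to leave the host.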
{"title":"FAIR privacy-preserving operation of large genomic variant calling format (VCF) data without download or installation.","authors":"Yasmmin C Martins, Praphulla Ms Bhawsar, Jeya B Balasubramanian, Daniel Russ, Wendy Sw Wong, Wolfgang Maass, Jonas S Almeida","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"65-74"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141832/pdf/","platform":"Semanticscholar","paperid":"141200821","PMCID":"OA"}
Niloufar Eghbali, Chad Klochko, Perra Razoky, Prateek Chintalapati, Efan Jawad, Zaid Mahdi, Joseph Craig, Mohammad M Ghassemi
Radiology imaging plays a pivotal role in medical diagnostics, providing clinicians with insights into patient health and guiding the next steps in treatment. The true value of a radiological image lies in the accuracy of its accompanying report. To ensure the reliability of these reports, they are often cross-referenced with operative findings. The conventional method of manually comparing radiology and operative reports is labor-intensive and demands specialized knowledge. This study explores the potential of a large language model (LLM) to simplify the radiology evaluation process by automatically extracting pertinent details from these reports, focusing especially on the shoulder's primary anatomical structures. A fine-tuned LLM identifies mentions of the supraspinatus tendon, infraspinatus tendon, subscapularis tendon, biceps tendon, and glenoid labrum in lengthy radiology and operative documents. Initial findings emphasize the model's capability to pinpoint relevant data, suggesting a transformative approach to the typical evaluation methods in radiology.
{"title":"Improving Automating Quality Control in Radiology: Leveraging Large Language Models to Extract Correlative Findings in Radiology and Operative Reports.","authors":"Niloufar Eghbali, Chad Klochko, Perra Razoky, Prateek Chintalapati, Efan Jawad, Zaid Mahdi, Joseph Craig, Mohammad M Ghassemi","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024","pages":"135-144"},"publicationDate":"2024-05-31","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141845/pdf/","platform":"Semanticscholar","paperid":"141201053","PMCID":"OA"}