Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-103123-095814
Zhiping Xiao, Bin Feng, Junwei Yang, Gongbo Sun, Yuxi Shen, Shengyuan Xu, Lina Yang, Hanwen Xu, Ming Zhang, Sheng Wang
The rapid development of artificial intelligence (AI) has had a significant impact on medical research, introducing new possibilities for pathology studies. There is a recent trend of applying large-scale AI models to many fields, and this trend has given rise to the pathology foundation models and pathology ensemble models. Large models in pathology are not standalone innovations; they build upon a legacy where AI has consistently played a vital role in pathology studies long before their advent. Numerous pathology datasets and AI models have been developed to support advancements in the field, with these combined efforts paving the way for the emergence of large models in pathology. AI greatly enhances pathology studies, yet its widespread use in sensitive applications also raises significant ethical concerns, including privacy risks. In this review, we summarize the datasets and models that are useful to pathology studies, with a particular focus on how they illuminate the path toward large-scale applications.
Title: Artificial Intelligence in Pathology: Advancing Large Models for Scalable Applications. Journal: Annual Review of Biomedical Data Science, vol. 8, no. 1, pp. 149-171.
Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-111224-124530
Su Golder, Karen O'Connor, Guillermo Lopez-Garcia, Nicholas P Tatonetti, Graciela Gonzalez-Hernandez
Adverse drug events (ADEs) in pediatric populations pose significant public health challenges, yet research on their detection and monitoring remains limited. This scoping review evaluates the use of unstructured data from electronic health records (EHRs) to identify ADEs in children. We searched six databases, including MEDLINE, Embase, and IEEE Xplore, in September 2024. From 984 records, only nine studies met our inclusion criteria, indicating a significant gap in research toward identifying ADEs in children. We found that unstructured data in EHRs can indeed be of value and enhance pediatric pharmacovigilance, although their use has been so far very limited. Traditional natural language processing methods have been employed to extract ADEs, but the approaches utilized face challenges in generalizability and context interpretation. These challenges could be addressed with recent advances in transformer-based models and large language models, unlocking the use of EHR data at scale for pediatric pharmacovigilance.
Title: Leveraging Unstructured Data in Electronic Health Records to Detect Adverse Events from Pediatric Drug Use: A Scoping Review. Journal: Annual Review of Biomedical Data Science, vol. 8, no. 1, pp. 227-250.
The increasing accumulation of medical data brings the hope of data-driven medical decision-making, but data's increasing complexity, as text or images in electronic health records, calls for complex models, such as machine learning. Here, we review how machine learning can be used to inform decisions for individualized interventions, a causal question. Going from prediction to causal effects is challenging, as no individual is seen as both treated and not. We detail how some data can support some causal claims and how to build causal estimators with machine learning. Beyond variable selection to adjust for confounding bias, we cover the broader notions of study design that make or break causal inference. As the problems span across diverse scientific communities, we use didactic yet statistically precise formulations to bridge machine learning to epidemiology.
Title: From Prediction to Prescription: Machine Learning and Causal Inference for the Heterogeneous Treatment Effect. Authors: Judith Abécassis, Élise Dumas, Julie Alberge, Gaël Varoquaux. Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-103123-095750. Journal: Annual Review of Biomedical Data Science, pp. 381-404.
Pub Date: 2025-08-01 Epub Date: 2025-04-09 DOI: 10.1146/annurev-biodatasci-103123-094756
William Hersh
Generative artificial intelligence (AI) has had a profound impact on biomedicine and health, both in professional work and in education. Based on large language models (LLMs), generative AI has been found to perform as well as humans in simulated situations taking medical board exams, answering clinical questions, solving clinical cases, applying clinical reasoning, and summarizing information. Generative AI is also being used widely in education, performing well in academic courses and their assessments. This review summarizes the successes of LLMs and highlights some of their challenges in the context of education, most notably aspects that may undermine the acquisition of knowledge and skills for professional work. It then provides recommendations for best practices to overcome the shortcomings of LLM use in education. Although there are challenges for the use of generative AI in education, all students and faculty, in biomedicine and health and beyond, must have an understanding of it and be competent in its use.
Title: Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education. Journal: Annual Review of Biomedical Data Science, pp. 355-380.
Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-120924-091033
Yuehua Zhu, Weiguang Mao, Rezwan Hosseini, Maria Chikina
DNA methylation, a covalent modification, fundamentally shapes mammalian gene regulation and cellular identity. This review examines methylation's biochemical underpinnings, genomic distribution patterns, and analytical approaches. We highlight three distinctive aspects that separate methylation from other epigenetic marks: its remarkable stability as a silencing mechanism, its capacity to maintain distinct states independently of DNA sequence, and its effectiveness as a quantitative trait linking genotype to disease risk. We also explore the phenomenon of methylation clocks and their biological significance. The review addresses technical considerations across major assay types (both array-based technologies and sequencing approaches), with emphasis on data normalization, quality control, cell proportion inference, and the specialized statistical models required for next-generation sequencing analysis.
Title: Methylation Data Analysis and Interpretation. Journal: Annual Review of Biomedical Data Science, vol. 8, no. 1, pp. 605-632.
Large language models (LLMs) have become powerful tools for biomedical applications, offering potential to transform healthcare and medical research. Since the release of ChatGPT in 2022, there has been a surge in LLMs for diverse biomedical applications. This review examines the landscape of text-based biomedical LLM development, analyzing model characteristics (e.g., architecture), development processes (e.g., training strategy), and applications (e.g., chatbots). Following PRISMA guidelines, 82 articles were selected out of 5,512 articles since 2022 that met our rigorous criteria, including the requirement of using biomedical data when training LLMs. Findings highlight the predominant use of decoder-only architectures such as Llama 7B, prevalence of task-specific fine-tuning, and reliance on biomedical literature for training. Challenges persist in balancing data openness with privacy concerns and detailing model development, including computational resources used. Future efforts would benefit from multimodal integration, LLMs for specialized medical applications, and improved data sharing and model accessibility.
Title: The Development Landscape of Large Language Models for Biomedical Applications. Authors: Zhiyuan Cao, Vipina K Keloth, Qianqian Xie, Lingfei Qian, Yuntian Liu, Yan Wang, Rui Shi, Weipeng Zhou, Gui Yang, Jeffrey Zhang, Xueqing Peng, Ethan Zhen, Ruey-Ling Weng, Qingyu Chen, Hua Xu. Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-102224-074736. Journal: Annual Review of Biomedical Data Science, pp. 251-274. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12372014/pdf/
Pub Date: 2025-08-01 Epub Date: 2025-04-09 DOI: 10.1146/annurev-biodatasci-090624-022951
Christine Y Yeh, Dennis P Wall, Karen Matthys, Chiara Sabatti, Julia A Palacios
In recent decades, there has been an explosion of data streams spanning the entire spectrum of biomedicine, opening novel opportunities to tackle biological and medical research questions, increasing our ability to provide effective and efficient health care. In parallel, augmented computational power has allowed the development and deployment of quantitative approaches at unprecedented scales. To effectively take advantage of this progress, it is important to invest in the training of a new generation of biomedical data scientists. Designing a graduate curriculum in the backdrop of a rapidly changing landscape of data, methods, and computing power demands flexibility and openness to adaptation. At the same time, we strive to ensure that the students acquire foundational competencies that might fuel productive and evolving careers, without being constrained to and defined by a niche trendy topic. We offer here a view of graduate training in biomedical data science from the standpoint of our experience at Stanford University. We conclude with a series of open challenges, the answers to which we believe will shape training in biomedical data science.
Title: Curriculum Design in an Evolving Field: Perspectives on Biomedical Data Science from Stanford. Journal: Annual Review of Biomedical Data Science, pp. 341-354.
Pub Date: 2025-08-01 Epub Date: 2025-02-19 DOI: 10.1146/annurev-biodatasci-103123-095824
Gary E Weissman
Artificial intelligence (AI) methods were first developed nearly seven decades ago. Only in recent years have they demonstrated their potential to improve clinical care at the bedside. AI systems are now capable of interpreting, predicting, and even generating important medical information. AI medical devices share many similarities with traditional medical devices but also diverge from them in important ways. Despite widespread optimism and enthusiasm surrounding the use of such devices to improve care processes, patient outcomes, and the healthcare experience for patients, caregivers, and clinicians alike, little evidence exists so far for their effectiveness in practice. Even less is known about the safety or equity of AI medical devices. As with any new technology, this exciting time is accompanied by appropriate questions regarding if, how much, when, and who such AI systems really help. Different stakeholders, ranging from patients to clinicians to industry device developers, may have divergent preferences or assessments of risk and benefits, warranting an informed public discussion to guide emerging regulatory efforts. This review summarizes the rapidly evolving recent efforts and evidence related to the regulation and evaluation of AI medical devices and highlights opportunities for future work to ensure their effectiveness, safety, and equity.
Title: Evaluation and Regulation of Artificial Intelligence Medical Devices for Clinical Decision Support. Journal: Annual Review of Biomedical Data Science, pp. 81-99. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12339208/pdf/
Spatial transcriptomics (ST) brings new dimensions to the analysis of single-cell data. While some methods for data analysis can be ported over without major modifications, they are the exception rather than the rule. Trajectory inference (TI) methods in particular can suffer from significant challenges due to spatial batch effects in ST data. These can add independent sources of noise to each time point. Pioneering methods for TI on ST data have focused primarily on addressing the batch effects in physical arrangement, i.e., where tissues are deformed in different ways at different time points. However, other challenges arise due to the measurement granularity of ST technologies, as well as a bias from slicing. In this review, we examine the sources of these challenges, and we explore how they are addressed with current state-of-the-art STTI methods. We conclude by highlighting some opportunities for future method development.
Title: Spatial Transcriptomics Brings New Challenges and Opportunities for Trajectory Inference. Authors: Matthieu Heitz, Yujia Ma, Sharvaj Kubal, Geoffrey Schiebinger. Pub Date: 2025-08-01 DOI: 10.1146/annurev-biodatasci-040324-030052. Journal: Annual Review of Biomedical Data Science, pp. 1-19.
Pub Date: 2025-08-01 Epub Date: 2025-03-18 DOI: 10.1146/annurev-biodatasci-103123-095332
Divya Shanmugam, Monica Agrawal, Rajiv Movva, Irene Y Chen, Marzyeh Ghassemi, Maia Jacobs, Emma Pierson
The increased capabilities of generative artificial intelligence (AI) have dramatically expanded its possible use cases in medicine. We provide a comprehensive overview of generative AI use cases for clinicians, patients, clinical trial organizers, researchers, and trainees. We then discuss the many challenges (including maintaining privacy and security, improving transparency and interpretability, upholding equity, and rigorously evaluating models) that must be overcome to realize this potential, as well as the open research directions they give rise to.
Title: Generative Artificial Intelligence in Medicine. Journal: Annual Review of Biomedical Data Science, pp. 199-226.