
Annual Review of Biomedical Data Science: Latest Articles

The Expanding Landscape of Neural Architectures and Their Impact in Biomedicine.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 DOI: 10.1146/annurev-biodatasci-103023-050856
Zijun Frank Zhang, Huixin Zhan, Tinghui Wu, Robert Burns, Jasreet Hundal, Helio A Costa

Deep learning and artificial intelligence (AI) have seen explosive growth and success in biomedical applications in the last decade, largely due to the rapid development of deep neural networks and their underlying neural network (NN) architectures. Here, we explore biomedical deep learning and AI from the specific perspective of NN architectures. We discuss widely varying design principles of NN architectures, their use in particular biomedical applications, and the assumptions (often hidden) built into them. We explore neural architecture search techniques that automate the design of NN topology to optimize task performance. Advanced neural architectures are being developed for both molecular and healthcare applications, employing elements of graph networks, transformers, and interpretable NNs, and we discuss and summarize the design considerations and unique advantages of each architecture. Future advances will include the employment of multimodal language models and smaller highly focused mechanistic models that build on the success of today's large models.

Annual Review of Biomedical Data Science, vol. 8, issue 1, pp. 101-124. Journal Article.
Citations: 0
Biomedical Natural Language Processing in the Era of Large Language Models.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-04-17 DOI: 10.1146/annurev-biodatasci-103123-095406
Naoto Usuyama, Cliff Wong, Sheng Zhang, Tristan Naumann, Hoifung Poon

Biomedicine has rapidly digitized over recent decades, from genomic sequencing to electronic medical records. Now, the rise of large language models (LLMs) is driving a generative artificial intelligence (AI) revolution in natural language processing (NLP). Together, these trends create unprecedented possibilities to optimize patient care and accelerate biomedical discovery. Biomedical NLP already boosts productivity by automating labor-intensive tasks such as knowledge extraction and medical abstraction. Emerging approaches promise creativity gain, surpassing standard healthcare practices and uncovering emergent capabilities through Web-scale biomedical knowledge and population-level patient data. However, LLMs remain prone to hallucinations and omissions, and ensuring compliance and safety is vital in order to do no harm. Incorporating diverse modalities such as imaging and genomics is also essential for comprehensive solutions. We review these challenges and opportunities in biomedical NLP, offering historical context, surveying the current state of the art, and exploring frontiers for AI researchers and biomedical practitioners.

Annual Review of Biomedical Data Science, pp. 471-490. Journal Article.
Citations: 0
Beyond Multiple-Choice Accuracy: Real-World Challenges of Implementing Large Language Models in Healthcare.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-04-08 DOI: 10.1146/annurev-biodatasci-103123-094851
Yifan Yang, Qiao Jin, Qingqing Zhu, Zhizheng Wang, Francisco Erramuspe Álvarez, Nicholas Wan, Benjamin Hou, Zhiyong Lu

Large language models (LLMs) have gained significant attention in the medical domain for their human-level capabilities, leading to increased efforts to explore their potential in various healthcare applications. However, despite such a promising future, there are multiple challenges and obstacles that remain for their real-world uses in practical settings. This work discusses key challenges for LLMs in medical applications from four unique aspects: operational vulnerabilities, ethical and social considerations, performance and assessment difficulties, and legal and regulatory compliance. Addressing these challenges is crucial for leveraging LLMs to their full potential and ensuring their responsible integration into healthcare.

Annual Review of Biomedical Data Science, pp. 305-316. Journal Article.
Citations: 0
Algorithm-Based Clinical Decision Support: Evolving Regulatory Landscape and Best Practices for Local Oversight.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-04-21 DOI: 10.1146/annurev-biodatasci-103123-094601
Anthony L Lin, Amanda B Parrish, Michael Cary, Christina Silcox, Suresh Balu, J Eric Jelovsek, Cara O'Brien, Michael Pencina, Eric Poon, Nicoleta J Economou-Zavlanos

The potential of algorithm-based clinical decision support (CDS) in healthcare continues to increase with the growing field of artificial intelligence (AI)-enabled CDS. The use of these technologies to support clinicians, patients, and health systems is still quite new, and to date, implementors and regulators are still identifying the best processes and practices to ensure the effective, safe, and equitable use of these technology solutions. To assist individuals and organizations interested in implementation of algorithm-based CDS and AI-enabled CDS in healthcare, this article reviews the important regulatory decisions that form the landscape within which algorithm-based CDS has emerged, modern governance frameworks used to oversee these CDS systems, nuances in evaluation and monitoring throughout the CDS life cycle, best practices for real-world implementation, safety and equity considerations, and avenues for future collaboration and innovation.

Annual Review of Biomedical Data Science, pp. 491-507. Journal Article.
Citations: 0
Revisiting Technical Bias Mitigation Strategies.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-04-08 DOI: 10.1146/annurev-biodatasci-103123-095737
Abdoul Jalil Djiberou Mahamadou, Artem A Trotsyuk

Efforts to mitigate bias and enhance fairness in the artificial intelligence (AI) community have predominantly focused on technical solutions. While numerous reviews have addressed bias in AI, this review uniquely focuses on the practical limitations of technical solutions in healthcare settings, providing a structured analysis across five key dimensions affecting their real-world implementation: who defines bias and fairness, which mitigation strategy to use and prioritize among dozens that are inconsistent and incompatible, when in the AI development stages the solutions are most effective, for which populations, and the context for which the solutions are designed. We illustrate each limitation with empirical studies focusing on healthcare and biomedical applications. Moreover, we discuss how value-sensitive AI, a framework derived from technology design, can engage stakeholders and ensure that their values are embodied in bias and fairness mitigation solutions. Finally, we discuss areas that require further investigation and provide practical recommendations to address the limitations covered in the study.

Annual Review of Biomedical Data Science, pp. 287-303. Journal Article.
Citations: 0
Genetic Studies Through the Lens of Gene Networks.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-02-20 DOI: 10.1146/annurev-biodatasci-103123-095355
Marc Subirana-Granés, Jill Hoffman, Haoyu Zhang, Christina Akirtava, Sutanu Nandi, Kevin Fotso, Milton Pividori

Understanding the genetic basis of complex traits is a longstanding challenge in the field of genomics. Genome-wide association studies have identified thousands of variant-trait associations, but most of these variants are located in noncoding regions, making the link to biological function elusive. While traditional approaches, such as transcriptome-wide association studies (TWAS), have advanced our understanding by linking genetic variants to gene expression, they often overlook gene-gene interactions. Here, we review current approaches to integrate different molecular data, leveraging machine learning methods to identify gene modules based on coexpression and functional relationships. These integrative approaches, such as PhenoPLIER, combine TWAS and drug-induced transcriptional profiles to effectively capture biologically meaningful gene networks. This integration provides a context-specific understanding of disease processes while highlighting both core and peripheral genes. These insights pave the way for novel therapeutic targets and enhance the interpretability of genetic studies in personalized medicine.

Annual Review of Biomedical Data Science, pp. 125-147. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310179/pdf/
Citations: 0
Embedding Methods for Electronic Health Record Research.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-05-01 DOI: 10.1146/annurev-biodatasci-103123-094729
Justin Kauffman, Riccardo Miotto, Eyal Klang, Anthony Costa, Beau Norgeot, Marinka Zitnik, Shameer Khader, Fei Wang, Girish N Nadkarni, Benjamin S Glicksberg

This review aims to elucidate the role and impact of embedding techniques in the analysis and utilization of electronic health record data for research. By integrating multidimensional, incongruent, and often unstructured medical data for machine learning models, embeddings provide a powerful tool for enhancing data utility, especially under certain conditions and for asking certain questions. We explore a variety of embedding methods, including but not limited to word embeddings, graph embeddings, and other deep learning models. We highlight key applications of embeddings that are representative of a variety of areas of research, including predictive modeling, patient stratification, clinical decision support, and beyond. Finally, we show how to evaluate the impact and quality of embeddings in real-world clinical settings, assessing their performance against traditional models and noting areas where they deliver substantial improvements or fall short.

Annual Review of Biomedical Data Science, pp. 563-590. Journal Article.
Citations: 0
Quantum Computing for Photosensitizer Design in Photodynamic Therapy.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-08-01 Epub Date: 2025-05-01 DOI: 10.1146/annurev-biodatasci-103123-095644
Hope Zehr, Alberto Baiardi, Francesco Tacchino, Anthony Gandon, Laurin E Fischer, Yue Xu, Frank P DiFilippo, Leonardo Guidoni, Pi A B Haase, Walter N Talarico, Martina Stella, Fabio Tarocco, Anton Nykänen, Aaron Fitzpatrick, Aaron Miller, Leander Thiessen, Stefan Knecht, Elsi-Mari Borrelli, Sabrina Maniscalco, Fabijan Pavošević, Ivano Tavernelli, Edward Maytin, Vijay Krishna

Use of light in healthcare is evolving with increasing applications of photodynamic therapy (PDT) for treating various cancers. PDT utilizes light-activated molecules called photosensitizers (PSs) that generate reactive oxygen species (ROSs) to induce tumor cell apoptosis and necrosis. However, the use of PDT is limited by the availability of PSs that can be activated by deep tissue-penetrating near-infrared light, exhibit low dark toxicity, and produce ROSs efficiently. Here we review the different categories of PS currently used in clinical or preclinical trials and highlight the significance of advanced computational methods, including density functional and wave function-based quantum chemistry, for understanding the molecular mechanisms involved in PS activation. Despite advancements in classical computational techniques, the complexities of excited state dynamics in highly correlated molecular systems demand innovative simulation approaches such as quantum computing. We propose that quantum computing holds promise for accurately modeling the excited-state properties of PSs to optimize their design and broaden clinical applications.

Annual Review of Biomedical Data Science, pp. 509-536. Journal Article.
Citations: 0
Centralized and Federated Models for the Analysis of Clinical Data.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-122220-115746
Ruowang Li, Joseph D Romano, Yong Chen, Jason H Moore

The progress of precision medicine research hinges on the gathering and analysis of extensive and diverse clinical datasets. With the continued expansion of modalities, scales, and sources of clinical datasets, it becomes imperative to devise methods for aggregating information from these varied sources to achieve a comprehensive understanding of diseases. In this review, we describe two important approaches for the analysis of diverse clinical datasets, namely the centralized model and federated model. We compare and contrast the strengths and weaknesses inherent in each model and present recent progress in methodologies and their associated challenges. Finally, we present an outlook on the opportunities that both models hold for the future analysis of clinical data.

精准医学研究的进展取决于对广泛而多样的临床数据集的收集和分析。随着临床数据集的模式、规模和来源的不断扩大,当务之急是设计出从这些不同来源汇总信息的方法,以实现对疾病的全面了解。在这篇综述中,我们介绍了分析多样化临床数据集的两种重要方法,即集中模式和联合模式。我们比较和对比了每种模式固有的优缺点,并介绍了方法论的最新进展及其相关挑战。最后,我们展望了这两种模式为未来临床数据分析带来的机遇。
{"title":"Centralized and Federated Models for the Analysis of Clinical Data.","authors":"Ruowang Li, Joseph D Romano, Yong Chen, Jason H Moore","doi":"10.1146/annurev-biodatasci-122220-115746","DOIUrl":"10.1146/annurev-biodatasci-122220-115746","url":null,"abstract":"<p><p>The progress of precision medicine research hinges on the gathering and analysis of extensive and diverse clinical datasets. With the continued expansion of modalities, scales, and sources of clinical datasets, it becomes imperative to devise methods for aggregating information from these varied sources to achieve a comprehensive understanding of diseases. In this review, we describe two important approaches for the analysis of diverse clinical datasets, namely the centralized model and federated model. We compare and contrast the strengths and weaknesses inherent in each model and present recent progress in methodologies and their associated challenges. Finally, we present an outlook on the opportunities that both models hold for the future analysis of clinical data.</p>","PeriodicalId":29775,"journal":{"name":"Annual Review of Biomedical Data Science","volume":" ","pages":"179-199"},"PeriodicalIF":6.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11571052/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140899793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Evolutionary Interplay of Somatic and Germline Mutation Rates.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102523-104225
Annabel C Beichman, Luke Zhu, Kelley Harris

Novel sequencing technologies are making it increasingly possible to measure the mutation rates of somatic cell lineages. Accurate germline mutation rate measurement technologies have also been available for a decade, making it possible to assess how this fundamental evolutionary parameter varies across the tree of life. Here, we review some classical theories about germline and somatic mutation rate evolution that were formulated using principles of population genetics and the biology of aging and cancer. We find that somatic mutation rate measurements, while still limited in phylogenetic diversity, seem consistent with the theory that selection to preserve the soma is proportional to life span. However, germline and somatic theories make conflicting predictions regarding which species should have the most accurate DNA repair. Resolving this conflict will require carefully measuring how mutation rates scale with time and cell division and achieving a better understanding of mutation rate pleiotropy among cell types.

Annual Review of Biomedical Data Science, pages 83-105. Publication date: 2024-08-01. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12254932/pdf/
Citations: 0