Annual Review of Biomedical Data Science: Latest Articles

Genetic Studies Through the Lens of Gene Networks.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-02-20 DOI: 10.1146/annurev-biodatasci-103123-095355
Marc Subirana-Granés, Jill Hoffman, Haoyu Zhang, Christina Akirtava, Sutanu Nandi, Kevin Fotso, Milton Pividori

Understanding the genetic basis of complex traits is a longstanding challenge in the field of genomics. Genome-wide association studies have identified thousands of variant-trait associations, but most of these variants are located in noncoding regions, making the link to biological function elusive. While traditional approaches, such as transcriptome-wide association studies (TWAS), have advanced our understanding by linking genetic variants to gene expression, they often overlook gene-gene interactions. Here, we review current approaches to integrate different molecular data, leveraging machine learning methods to identify gene modules based on coexpression and functional relationships. These integrative approaches, such as PhenoPLIER, combine TWAS and drug-induced transcriptional profiles to effectively capture biologically meaningful gene networks. This integration provides a context-specific understanding of disease processes while highlighting both core and peripheral genes. These insights pave the way for novel therapeutic targets and enhance the interpretability of genetic studies in personalized medicine.

Citations: 0
Evaluation and Regulation of Artificial Intelligence Medical Devices for Clinical Decision Support.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-02-19 DOI: 10.1146/annurev-biodatasci-103123-095824
Gary E Weissman

Artificial intelligence (AI) methods were first developed nearly seven decades ago. Only in recent years have they demonstrated their potential to improve clinical care at the bedside. AI systems are now capable of interpreting, predicting, and even generating important medical information. AI medical devices share many similarities with traditional medical devices but also diverge from them in important ways. Despite widespread optimism and enthusiasm surrounding the use of such devices to improve care processes, patient outcomes, and the healthcare experience for patients, caregivers, and clinicians alike, little evidence exists so far for their effectiveness in practice. Even less is known about the safety or equity of AI medical devices. As with any new technology, this exciting time is accompanied by appropriate questions regarding if, how much, when, and who such AI systems really help. Different stakeholders, ranging from patients to clinicians to industry device developers, may have divergent preferences or assessments of risk and benefits, warranting an informed public discussion to guide emerging regulatory efforts. This review summarizes the rapidly evolving recent efforts and evidence related to the regulation and evaluation of AI medical devices and highlights opportunities for future work to ensure their effectiveness, safety, and equity.

Citations: 0
Foundation Models for Translational Cancer Biology.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-01-29 DOI: 10.1146/annurev-biodatasci-103123-095633
Kevin K Tsang, Sophia Kivelson, Jose M Acitores Cortina, Aditi Kuchi, Jacob S Berkowitz, Hongyu Liu, Apoorva Srinivasan, Nadine A Friedrich, Yasaman Fatapour, Nicholas P Tatonetti

Cancer remains a leading cause of death globally. The complexity and diversity of cancer-related datasets across different specialties pose challenges in refining precision medicine for oncology. Foundation models offer a promising solution. Trained on vast amounts of data, these models develop a broad understanding across a wide range of tasks. We examine the role of foundation models in domains relevant to cancer research, including natural language processing, computer vision, molecular biology, and cheminformatics. Through a review of state-of-the-art methods, we explore how these models have already advanced translational cancer research goals such as precision tumor classification and artificial intelligence-assisted surgery. We also discuss prospective advances in areas like early tumor detection, personalized cancer treatment, and drug discovery. This review provides researchers with a curated set of resources and methodologies, offers practitioners a deeper understanding of how these models enhance cancer care, and points to opportunities for future applications of foundation models in cancer research.

Citations: 0
Conditional Generative Models for Synthetic Tabular Data: Applications for Precision Medicine and Diverse Representations.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-01-14 DOI: 10.1146/annurev-biodatasci-103123-094844
Kara Liu, Russ B Altman

Tabular medical datasets, like electronic health records (EHRs), biobanks, and structured clinical trial data, are rich sources of information with the potential to advance precision medicine and optimize patient care. However, real-world medical datasets have limited patient diversity and cannot simulate hypothetical outcomes, both of which are necessary for equitable and effective medical research. Fueled by recent advancements in machine learning, generative models offer a promising solution to these data limitations by generating enhanced synthetic data. This review highlights the potential of conditional generative models (CGMs) to create patient-specific synthetic data for a variety of precision medicine applications. We survey CGM approaches that tackle two medical applications: correcting for data representation biases and simulating digital health twins. We additionally explore how the surveyed methods handle modeling tabular medical data and briefly discuss evaluation criteria. Finally, we summarize the technical, medical, and ethical challenges that must be addressed before CGMs can be effectively and safely deployed in the medical field.

Citations: 0
Spatial Transcriptomics Brings New Challenges and Opportunities for Trajectory Inference.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-14 DOI: 10.1146/annurev-biodatasci-040324-030052
Matthieu Heitz, Yujia Ma, Sharvaj Kubal, Geoffrey Schiebinger

Spatial transcriptomics (ST) brings new dimensions to the analysis of single-cell data. While some methods for data analysis can be ported over without major modifications, they are the exception rather than the rule. Trajectory inference (TI) methods in particular can suffer from significant challenges due to spatial batch effects in ST data. These can add independent sources of noise to each time point. Pioneering methods for TI on ST data have focused primarily on addressing the batch effects in physical arrangement, i.e., where tissues are deformed in different ways at different time points. However, other challenges arise due to the measurement granularity of ST technologies, as well as a bias from slicing. In this review, we examine the sources of these challenges, and we explore how they are addressed with current state-of-the-art STTI methods. We conclude by highlighting some opportunities for future method development.

Citations: 0
Centralized and Federated Models for the Analysis of Clinical Data.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-122220-115746
Ruowang Li, Joseph D Romano, Yong Chen, Jason H Moore

The progress of precision medicine research hinges on the gathering and analysis of extensive and diverse clinical datasets. With the continued expansion of modalities, scales, and sources of clinical datasets, it becomes imperative to devise methods for aggregating information from these varied sources to achieve a comprehensive understanding of diseases. In this review, we describe two important approaches for the analysis of diverse clinical datasets, namely the centralized model and federated model. We compare and contrast the strengths and weaknesses inherent in each model and present recent progress in methodologies and their associated challenges. Finally, we present an outlook on the opportunities that both models hold for the future analysis of clinical data.

Pages: 179-199. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11571052/pdf/
Citations: 0
The Evolutionary Interplay of Somatic and Germline Mutation Rates.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102523-104225
Annabel C Beichman, Luke Zhu, Kelley Harris

Novel sequencing technologies are making it increasingly possible to measure the mutation rates of somatic cell lineages. Accurate germline mutation rate measurement technologies have also been available for a decade, making it possible to assess how this fundamental evolutionary parameter varies across the tree of life. Here, we review some classical theories about germline and somatic mutation rate evolution that were formulated using principles of population genetics and the biology of aging and cancer. We find that somatic mutation rate measurements, while still limited in phylogenetic diversity, seem consistent with the theory that selection to preserve the soma is proportional to life span. However, germline and somatic theories make conflicting predictions regarding which species should have the most accurate DNA repair. Resolving this conflict will require carefully measuring how mutation rates scale with time and cell division and achieving a better understanding of mutation rate pleiotropy among cell types.

Pages: 83-105.
Citations: 0
Mapping the Multiscale Proteomic Organization of Cellular and Disease Phenotypes.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102423-113534
Anthony Cesnik, Leah V Schaffer, Ishan Gaur, Mayank Jain, Trey Ideker, Emma Lundberg

While the primary sequences of human proteins have been cataloged for over a decade, determining how these are organized into a dynamic collection of multiprotein assemblies, with structures and functions spanning biological scales, is an ongoing venture. Systematic and data-driven analyses of these higher-order structures are emerging, facilitating the discovery and understanding of cellular phenotypes. At present, knowledge of protein localization and function has been primarily derived from manual annotation and curation in resources such as the Gene Ontology, which are biased toward richly annotated genes in the literature. Here, we envision a future powered by data-driven mapping of protein assemblies. These maps can capture and decode cellular functions through the integration of protein expression, localization, and interaction data across length scales and timescales. In this review, we focus on progress toward constructing integrated cell maps that accelerate the life sciences and translational research.

Pages: 369-389. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11343683/pdf/
Citations: 0
Employing Informatics Strategies in Alzheimer's Disease Research: A Review from Genetics, Multiomics, and Biomarkers to Clinical Outcomes.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102423-121021
Jingxuan Bao, Brian N Lee, Junhao Wen, Mansu Kim, Shizhuo Mu, Shu Yang, Christos Davatzikos, Qi Long, Marylyn D Ritchie, Li Shen

Alzheimer's disease (AD) is a critical national concern, affecting 5.8 million people and costing more than $250 billion annually. However, there is no available cure. Thus, effective strategies are in urgent need to discover AD biomarkers for disease early detection and drug development. In this review, we study AD from a biomedical data scientist perspective to discuss the four fundamental components in AD research: genetics (G), molecular multiomics (M), multimodal imaging biomarkers (B), and clinical outcomes (O) (collectively referred to as the GMBO framework). We provide a comprehensive review of common statistical and informatics methodologies for each component within the GMBO framework, accompanied by the major findings from landmark AD studies. Our review highlights the potential of multimodal biobank data in addressing key challenges in AD, such as early diagnosis, disease heterogeneity, and therapeutic development. We identify major hurdles in AD research, including data scarcity and complexity, and advocate for enhanced collaboration, data harmonization, and advanced modeling techniques. This review aims to be an essential guide for understanding current biomedical data science strategies in AD research, emphasizing the need for integrated, multidisciplinary approaches to advance our understanding and management of AD.

Pages: 391-418. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525791/pdf/
Citations: 0
Spatially Resolved Single-Cell Omics: Methods, Challenges, and Future Perspectives.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102523-103640
Felipe Segato Dezem, Wani Arjumand, Hannah DuBose, Natalia Silva Morosini, Jasmine Plummer

Overlaying omics data onto spatial biological dimensions has been a promising technology to provide high-resolution insights into the interactome and cellular heterogeneity relative to the organization of the molecular microenvironment of tissue samples in normal and disease states. Spatial omics can be categorized into three major modalities: (a) next-generation sequencing-based assays, (b) imaging-based spatially resolved transcriptomics approaches including in situ hybridization/in situ sequencing, and (c) imaging-based spatial proteomics. These modalities allow assessment of transcripts and proteins at a cellular level, generating large and computationally challenging datasets. The lack of standardized computational pipelines to analyze and integrate these nonuniform structured data has made it necessary to apply artificial intelligence and machine learning strategies to best visualize and translate their complexity. In this review, we summarize the currently available techniques and computational strategies, highlight their advantages and limitations, and discuss their future prospects in the scientific field.

Annual Review of Biomedical Data Science, pp. 131-153.
Citations: 0