Annual Review of Biomedical Data Science: Latest Articles

Data Science Methods for Real-World Evidence Generation in Real-World Data.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102423-113220
Fang Liu

In the healthcare landscape, data science (DS) methods have emerged as indispensable tools to harness real-world data (RWD) from various data sources such as electronic health records, claims and registry data, and data gathered from digital health technologies. Real-world evidence (RWE) generated from RWD empowers researchers, clinicians, and policymakers with a more comprehensive understanding of real-world patient outcomes. Nevertheless, persistent challenges in RWD (e.g., messiness, voluminousness, heterogeneity, multimodality) and a growing awareness of the need for trustworthy and reliable RWE demand innovative, robust, and valid DS methods for analyzing RWD. In this article, I review some common current DS methods for extracting RWE and valuable insights from complex and diverse RWD. This article encompasses the entire RWE-generation pipeline, from study design with RWD to data preprocessing, exploratory analysis, methods for analyzing RWD, and trustworthiness and reliability guarantees, along with data ethics considerations and open-source tools. This review, tailored for an audience that may not be experts in DS, aspires to offer a systematic review of DS methods and to assist readers in selecting suitable DS methods and enhancing the process of RWE generation for addressing their specific challenges.

Citations: 0
The Value Proposition of Coordinated Population Cohorts Across Africa.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 DOI: 10.1146/annurev-biodatasci-020722-015026
Michèle Ramsay, Amelia C Crampin, Ayaga A Bawah, Evelyn Gitau, Kobus Herbst

Building longitudinal population cohorts in Africa for coordinated research and surveillance can influence the setting of national health priorities, lead to the introduction of appropriate interventions, and provide evidence for targeted treatment, leading to better health across the continent. However, compared to cohorts from the global north, longitudinal continental African population cohorts remain scarce, are relatively small in size, and lack data complexity. As infections and noncommunicable diseases disproportionately affect Africa's approximately 1.4 billion inhabitants, African cohorts present a unique opportunity for research and surveillance. High genetic diversity in African populations and multiomic research studies, together with detailed phenotyping and clinical profiling, will be a treasure trove for discovery. The outcomes, including novel drug targets, biological pathways for disease, and gene-environment interactions, will boost precision medicine approaches, not only in Africa but across the globe.

Citations: 0
Graph Artificial Intelligence in Medicine.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-110723-024625
Ruth Johnson, Michelle M Li, Ayush Noori, Owen Queen, Marinka Zitnik

In clinical artificial intelligence (AI), graph representation learning, mainly through graph neural networks and graph transformer architectures, stands out for its capability to capture intricate relationships and structures within clinical datasets. With diverse data, from patient records to imaging, graph AI models process data holistically by viewing modalities and entities within them as nodes interconnected by their relationships. Graph AI facilitates model transfer across clinical tasks, enabling models to generalize across patient populations without additional parameters and with minimal to no retraining. However, the importance of human-centered design and model interpretability in clinical decision-making cannot be overstated. Since graph AI models capture information through localized neural transformations defined on relational datasets, they offer both an opportunity and a challenge in elucidating model rationale. Knowledge graphs can enhance interpretability by aligning model-driven insights with medical knowledge. Emerging graph AI models integrate diverse data modalities through pretraining, facilitate interactive feedback loops, and foster human-AI collaboration, paving the way toward clinically meaningful predictions.

Citations: 0
Computational Methods for Predicting Key Interactions in T Cell-Mediated Adaptive Immunity.
IF 7 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-08-01 Epub Date: 2024-07-24 DOI: 10.1146/annurev-biodatasci-102423-122741
Ryan Ehrlich, Eric Glynn, Mona Singh, Dario Ghersi

The adaptive immune system recognizes pathogen- and cancer-specific features and is endowed with memory, enabling it to respond quickly and efficiently to repeated encounters with the same antigens. T cells play a central role in the adaptive immune system by directly targeting intracellular pathogens and helping to activate B cells to secrete antibodies. Several fundamental protein interactions (including those between major histocompatibility complex (MHC) proteins and antigen-derived peptides, as well as between T cell receptors and peptide-MHC complexes) underlie the ability of T cells to recognize antigens with great precision. Computational approaches to predict these interactions are increasingly being used for medically relevant applications, including vaccine design and prediction of patient response to cancer immunotherapies. We provide computational researchers with an accessible introduction to the adaptive immune system, review computational approaches to predict the key protein interactions underlying T cell-mediated adaptive immunity, and highlight remaining challenges.

Citations: 0
Addressing the Challenge of Biomedical Data Inequality: An Artificial Intelligence Perspective.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 Epub Date: 2023-04-27 DOI: 10.1146/annurev-biodatasci-020722-020704
Yan Gao, Teena Sharma, Yan Cui

Artificial intelligence (AI) and other data-driven technologies hold great promise to transform healthcare and confer the predictive power essential to precision medicine. However, the existing biomedical data, which are a vital resource and foundation for developing medical AI models, do not reflect the diversity of the human population. The low representation in biomedical data has become a significant health risk for non-European populations, and the growing application of AI opens a new pathway for this health risk to manifest and amplify. Here we review the current status of biomedical data inequality and present a conceptual framework for understanding its impacts on machine learning. We also discuss the recent advances in algorithmic interventions for mitigating health disparities arising from biomedical data inequality. Finally, we briefly discuss the newly identified disparity in data quality among ethnic groups and its potential impacts on machine learning.

Citations: 3
Virus-Derived Small RNAs and microRNAs in Health and Disease.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 DOI: 10.1146/annurev-biodatasci-122220-111429
Vasileios Gouzouasis, Spyros Tastsoglou, Antonis Giannakakis, Artemis G Hatzigeorgiou

MicroRNAs (miRNAs) are short noncoding RNAs that can regulate all steps of gene expression (induction, transcription, and translation). Several virus families, primarily double-stranded DNA viruses, encode small RNAs (sRNAs), including miRNAs. These virus-derived miRNAs (v-miRNAs) help the virus evade the host's innate and adaptive immune system and maintain an environment of chronic latent infection. In this review, the functions of the sRNA-mediated virus-host interactions are highlighted, delineating their implication in chronic stress, inflammation, immunopathology, and disease. We provide insights into the latest viral RNA-based research, including in silico approaches for functional characterization of v-miRNAs and other RNA types. The latest research can assist in identifying therapeutic targets to combat viral infections.

Citations: 0
Computational Methods for Single-Cell Proteomics.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 Epub Date: 2023-04-11 DOI: 10.1146/annurev-biodatasci-020422-050255
Sophia M Guldberg, Trine Line Hauge Okholm, Elizabeth E McCarthy, Matthew H Spitzer

Advances in single-cell proteomics technologies have resulted in high-dimensional datasets comprising millions of cells that are capable of answering key questions about biology and disease. The advent of these technologies has prompted the development of computational tools to process and visualize the complex data. In this review, we outline the steps of single-cell and spatial proteomics analysis pipelines. In addition to describing available methods, we highlight benchmarking studies that have identified advantages and pitfalls of the currently available computational toolkits. As these technologies continue to advance, robust analysis tools should be developed in tandem to take full advantage of the potential biological insights provided by these data.

Citations: 0
Gene Interactions in Human Disease Studies-Evidence Is Mounting.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 DOI: 10.1146/annurev-biodatasci-102022-120818
Pankhuri Singhal, Shefali Setia Verma, Marylyn D Ritchie

Despite monumental advances in molecular technology to generate genome sequence data at scale, there is still a considerable proportion of heritability in most complex diseases that remains unexplained. Because many of the discoveries have been single-nucleotide variants with small to moderate effects on disease, the functional implication of many of the variants is still unknown and, thus, we have limited new drug targets and therapeutics. We, and many others, posit that one primary factor limiting our ability to identify novel drug targets from genome-wide association studies may be gene interactions (epistasis), gene-environment interactions, network/pathway effects, or multiomic relationships. We propose that many of these complex models explain much of the underlying genetic architecture of complex disease. In this review, we discuss the evidence from multiple research avenues, ranging from pairs of alleles to multiomic integration studies and pharmacogenomics, that supports the need for further investigation of gene interactions (or epistasis) in genetic and genomic studies of human disease. Our goal is to catalog the mounting evidence for epistasis in genetic studies and the connections between genetic interactions and human health and disease that could enable precision medicine of the future.

Citations: 3
Importance of Diversity in Precision Medicine: Generalizability of Genetic Associations Across Ancestry Groups Toward Better Identification of Disease Susceptibility Variants.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 Epub Date: 2023-05-17 DOI: 10.1146/annurev-biodatasci-122220-113250
Lauren A Cruz, Jessica N Cooke Bailey, Dana C Crawford

Genome-wide association studies (GWAS) revolutionized our understanding of common genetic variation and its impact on common human disease and traits. Developed and adopted in the mid-2000s, GWAS led to searchable genotype-phenotype catalogs and genome-wide datasets available for further data mining and analysis for the eventual development of translational applications. The GWAS revolution was swift and specific, including almost exclusively populations of European descent, to the neglect of the majority of the world's genetic diversity. In this narrative review, we recount the GWAS landscape of the early years that established a genotype-phenotype catalog that is now universally understood to be inadequate for a complete understanding of complex human genetics. We then describe approaches taken to augment the genotype-phenotype catalog, including the study populations, collaborative consortia, and study design approaches aimed to generalize and then ultimately discover genome-wide associations in non-European descent populations. The collaborations and data resources established in the efforts to diversify genomic findings undoubtedly provide the foundations of the next chapters of genetic association studies with the advent of budget-friendly whole-genome sequencing.

Human Microbiomes and Disease for the Biomedical Data Scientist.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 DOI: 10.1146/annurev-biodatasci-020722-043017
Jonathan L Golob

The human microbiome is complex, variable from person to person, essential for health, and related to both the risk for disease and the efficacy of our treatments. There are robust techniques to describe microbiota with high-throughput sequencing, and there are hundreds of thousands of already-sequenced specimens in public archives. The promise remains to use the microbiome both as a prognostic factor and as a target for precision medicine. However, when used as an input in biomedical data science modeling, the microbiome presents unique challenges. Here, we review the most common techniques used to describe microbial communities, explore these unique challenges, and discuss the more successful approaches for biomedical data scientists seeking to use the microbiome as an input in their studies.

Jonathan L Golob. Human Microbiomes and Disease for the Biomedical Data Scientist. Annual Review of Biomedical Data Science, vol. 6, pp. 259-273. Pub Date: 2023-08-10. DOI: 10.1146/annurev-biodatasci-020722-043017.