
Annual Review of Biomedical Data Science: Latest Articles

Integration of Protein Structure and Population-Scale DNA Sequence Data for Disease Gene Discovery and Variant Interpretation.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-05-04 DOI: 10.1146/annurev-biodatasci-122220-112147
Bian Li, Bowen Jin, J. Capra, W. Bush
The experimental and computational techniques for capturing information about protein structures and genetic variation within the human genome have advanced dramatically in the past 20 years, generating extensive new data resources. In this review, we discuss these advances, along with new approaches for determining the impact a genetic variant has on protein function. We focus on the potential of new methods that integrate human genetic variation into protein structures to discover relationships to disease, including the discovery of mutational hotspots in cancer-related proteins, the localization of protein-altering variants within protein regions for common complex diseases, and the assessment of variants of unknown significance for Mendelian traits. We expect that approaches that integrate these data sources will play increasingly important roles in disease gene discovery and variant interpretation. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Citations: 0
Functional Characterization of Genetic Variant Effects on Expression.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-04-28 DOI: 10.1146/annurev-biodatasci-122120-010010
Elise D. Flynn, T. Lappalainen
Thousands of common genetic variants in the human population have been associated with disease risk and phenotypic variation by genome-wide association studies (GWAS). However, the majority of GWAS variants fall into noncoding regions of the genome, complicating our understanding of their regulatory functions, and few molecular mechanisms of GWAS variant effects have been clearly elucidated. Here, we set out to review genetic variant effects, focusing on expression quantitative trait loci (eQTLs), including their utility in interpreting GWAS variant mechanisms. We discuss the interrelated challenges and opportunities for eQTL analysis, covering determining causal variants, elucidating molecular mechanisms of action, and understanding context variability. Addressing these questions can enable better functional characterization of disease-associated loci and provide insights into fundamental biological questions of the noncoding genetic regulatory code and its control of gene expression.
Citations: 5
Open Structural Data in Precision Medicine.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-04-28 DOI: 10.1146/annurev-biodatasci-122220-012951
R. Nussinov, Hyunbum Jang, G. Nir, Chung-Jung Tsai, F. Cheng
Three-dimensional protein structural data at the molecular level are pivotal for successful precision medicine. Such data are crucial not only for discovering drugs that act to block the active site of the target mutant protein but also for clarifying to the patient and the clinician how the mutations harbored by the patient work. The relative paucity of structural data reflects their cost, challenges in their interpretation, and lack of clinical guidelines for their utilization. Rapid technological advancements in experimental high-resolution structural determination increasingly generate structures. Computationally, modeling algorithms, including molecular dynamics simulations, are becoming more powerful, as are compute-intensive hardware, particularly graphics processing units, overlapping with the inception of the exascale era. Accessible, freely available, and detailed structural and dynamical data can be merged with big data to powerfully transform personalized pharmacology. Here we review protein and emerging genome high-resolution data, along with means, applications, and examples underscoring their usefulness in precision medicine.
Citations: 7
Machine Learning in Chemoinformatics and Medicinal Chemistry.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-04-19 DOI: 10.1146/annurev-biodatasci-122120-124216
Raquel Rodríguez-Pérez, Filip Miljković, J. Bajorath
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains.
Citations: 6
Static and Motion Facial Analysis for Craniofacial Assessment and Diagnosing Diseases.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-04-19 DOI: 10.1146/annurev-biodatasci-122120-111413
H. Matthews, G. de Jong, T. Maal, P. Claes
Deviation from a normal facial shape and symmetry can arise from numerous sources, including physical injury and congenital birth defects. Such abnormalities can have important aesthetic and functional consequences. Furthermore, in clinical genetics distinctive facial appearances are often associated with clinical or genetic diagnoses; the recognition of a characteristic facial appearance can substantially narrow the search space of potential diagnoses for the clinician. Unusual patterns of facial movement and expression can indicate disturbances to normal mechanical functioning or emotional affect. Computational analyses of static and moving 2D and 3D images can serve clinicians and researchers by detecting and describing facial structural, mechanical, and affective abnormalities objectively. In this review we survey traditional and emerging methods of facial analysis, including statistical shape modeling, syndrome classification, modeling clinical face phenotype spaces, and analysis of facial motion and affect.
Citations: 3
The Cell Physiome: What Do We Need in a Computational Physiology Framework for Predicting Single-Cell Biology?
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-02-27 DOI: 10.1146/annurev-biodatasci-072018-021246
V. Rajagopal, S. Arumugam, Peter J. Hunter, A. Khadangi, Joshua Chung, Michael Pan
Modern biology and biomedicine are undergoing a big data explosion, needing advanced computational algorithms to extract mechanistic insights on the physiological state of living cells. We present the motivation for the Cell Physiome project: a framework and approach for creating, sharing, and using biophysics-based computational models of single-cell physiology. Using examples in calcium signaling, bioenergetics, and endosomal trafficking, we highlight the need for spatially detailed, biophysics-based computational models to uncover new mechanisms underlying cell biology. We review progress and challenges to date toward creating cell physiome models. We then introduce bond graphs as an efficient way to create cell physiome models that integrate chemical, mechanical, electromagnetic, and thermal processes while maintaining mass and energy balance. Bond graphs enhance modularization and reusability of computational models of cells at scale. We conclude with a look forward at steps that will help fully realize this exciting new field of mechanistic biomedical data science.
Citations: 5
Best Practices on Big Data Analytics to Address Sex-Specific Biases in our Understanding of the Etiology, Diagnosis and Prognosis of Diseases
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2022-02-06 DOI: 10.1101/2022.01.31.22270183
S. Golder, K. O’Connor, Yunwen Wang, R. Stevens, G. Gonzalez-Hernandez
A bias in health research to favor understanding of diseases as they present in men can have a grave impact on the health of women. This paper reports on a conceptual review of the literature that used machine learning or NLP techniques to interrogate big data for identifying sex-specific health disparities. We searched Ovid MEDLINE, Embase, and PsycINFO in October 2021 using synonyms and indexing terms for (1) "women" or "men" or "sex," (2) "big data" or "artificial intelligence" or "NLP," and (3) "disparities" or "differences." From 902 records, 22 studies met the inclusion criteria and were analyzed. Results demonstrate that the inclusion by sex is inconsistent and often unreported, although the inclusion of men in the included studies is disproportionately less than women. Even though AI and NLP techniques are widely applied in health research, few studies use them to take advantage of unstructured text to investigate sex-related differences or disparities. Researchers are increasingly aware of sex-based data bias, but the process towards correction is slow. We reflected on what would be the best practices on using big data analytics to address sex-specific biases in understanding the etiology, diagnosis, and prognosis of diseases.
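The three-part search described in this abstract follows a common pattern: synonyms within a concept are joined with OR, and the concepts are joined with AND. The sketch below illustrates only that pattern. The abstract does not give the authors' actual search strings, so the term lists are limited to the words it quotes, the function names are hypothetical, and the output is a generic Boolean string rather than Ovid syntax.

```python
# Concept groups limited to the terms quoted in the abstract (assumption:
# the real search also used further synonyms and indexing terms).
CONCEPT_GROUPS = [
    ["women", "men", "sex"],
    ["big data", "artificial intelligence", "NLP"],
    ["disparities", "differences"],
]


def build_query(groups):
    """OR the terms inside each group, then AND the groups together."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


def matches(text, groups):
    """Naive substring check: True if every group has at least one term in text."""
    lowered = text.lower()
    return all(any(term.lower() in lowered for term in terms) for terms in groups)


if __name__ == "__main__":
    print(build_query(CONCEPT_GROUPS))
```

The `matches` helper is a rough screening aid only: plain substring matching treats "men" as present in "women", so a real screen would need word-boundary handling and controlled-vocabulary terms.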
Citations: 1
Single-Cell Analysis for Whole-Organism Datasets.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2021-07-20 Epub Date: 2021-05-11 DOI: 10.1146/annurev-biodatasci-092820-031008
Angela Oliveira Pisco, Bruno Tojo, Aaron McGeever

Cell atlases are essential companions to the genome as they elucidate how genes are used in a cell type-specific manner or how the usage of genes changes over the lifetime of an organism. This review explores recent advances in whole-organism single-cell atlases, which enable understanding of cell heterogeneity and tissue and cell fate, both in health and disease. Here we provide an overview of recent efforts to build cell atlases across species and discuss the challenges that the field is currently facing. Moreover, we propose the concept of having a knowledgebase that can scale with the number of experiments and computational approaches and a new feedback loop for development and benchmarking of computational methods that includes contributions from the users. These two aspects are key for community efforts in single-cell biology that will help produce a comprehensive annotated map of cell types and states with unparalleled resolution.

Pages: 207-226
Citations: 4
The 3D Genome Structure of Single Cells.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2021-07-20 Epub Date: 2021-04-23 DOI: 10.1146/annurev-biodatasci-020121-084709
Tianming Zhou, Ruochi Zhang, Jian Ma

The spatial organization of the genome in the cell nucleus is pivotal to cell function. However, how the 3D genome organization and its dynamics influence cellular phenotypes remains poorly understood. The very recent development of single-cell technologies for probing the 3D genome, especially single-cell Hi-C (scHi-C), has ushered in a new era of unveiling cell-to-cell variability of 3D genome features at an unprecedented resolution. Here, we review recent developments in computational approaches to the analysis of scHi-C, including data processing, dimensionality reduction, imputation for enhancing data quality, and the revealing of 3D genome features at single-cell resolution. While much progress has been made in computational method development to analyze single-cell 3D genomes, substantial future work is needed to improve data interpretation and multimodal data integration, which are critical to reveal fundamental connections between genome structure and function among heterogeneous cell populations in various biological contexts.

Citations: 29
Integration of Multimodal Data for Deciphering Brain Disorders.
IF 6 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2021-07-20 Epub Date: 2021-04-23 DOI: 10.1146/annurev-biodatasci-092820-020354
Jingqi Chen, Guiying Dong, Liting Song, Xingzhong Zhao, Jixin Cao, Xiaohui Luo, Jianfeng Feng, Xing-Ming Zhao

The accumulation of vast amounts of multimodal data for the human brain, in both normal and disease conditions, has provided unprecedented opportunities for understanding why and how brain disorders arise. Compared with traditional analyses of single datasets, the integration of multimodal datasets covering different types of data (i.e., genomics, transcriptomics, imaging, etc.) has shed light on the mechanisms underlying brain disorders in greater detail across both the microscopic and macroscopic levels. In this review, we first briefly introduce the popular large datasets for the brain. Then, we discuss in detail how integration of multimodal human brain datasets can reveal the genetic predispositions and the abnormal molecular pathways of brain disorders. Finally, we present an outlook on how future data integration efforts may advance the diagnosis and treatment of brain disorders.

Citations: 3