Frontiers in bioinformatics: latest articles

AUTO-TUNE: selecting the distance threshold for inferring HIV transmission clusters.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-07-10 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1400003
Steven Weaver, Vanessa M Dávila Conn, Daniel Ji, Hannah Verdonk, Santiago Ávila-Ríos, Andrew J Leigh Brown, Joel O Wertheim, Sergei L Kosakovsky Pond

Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
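
The abstract describes the heuristic only in outline: prefer thresholds that yield many clusters while keeping the largest cluster from dwarfing the second largest. The following is a minimal Python sketch of that idea, not the AUTO-TUNE implementation or the HIV-TRACE API. The function names, the toy pairwise distances, and the specific score (number of clusters divided by the largest-to-second-largest size ratio) are all assumptions made for illustration.

```python
from collections import Counter

def cluster_sizes(n, pairs, threshold):
    """Sizes (descending) of connected components with at least two members,
    linking sequences whose distance is at or below `threshold`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, dist in pairs:
        if dist <= threshold:
            parent[find(i)] = find(j)

    counts = Counter(find(i) for i in range(n))
    return sorted((s for s in counts.values() if s >= 2), reverse=True)

def toy_score(sizes):
    """Hypothetical score: reward many clusters, penalize a dominant largest one."""
    if not sizes:
        return 0.0
    largest = sizes[0]
    second = sizes[1] if len(sizes) > 1 else 1
    return len(sizes) / (largest / second)

def pick_threshold(n, pairs, candidates):
    """Return the candidate threshold with the highest toy score."""
    return max(candidates, key=lambda t: toy_score(cluster_sizes(n, pairs, t)))

# Toy data (hypothetical): six sequences, distances in substitutions/site.
TOY_PAIRS = [(0, 1, 0.004), (2, 3, 0.008), (4, 5, 0.012),
             (1, 2, 0.020), (3, 4, 0.020)]
CANDIDATES = [0.005, 0.010, 0.015, 0.030]

if __name__ == "__main__":
    for t in CANDIDATES:
        sizes = cluster_sizes(6, TOY_PAIRS, t)
        print(t, sizes, round(toy_score(sizes), 3))
    print("selected:", pick_threshold(6, TOY_PAIRS, CANDIDATES))
```

On this toy input the largest candidate merges everything into one giant cluster and scores lowest, which is the behavior the abstract says the scoring approach is meant to avoid.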

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11289888/pdf/
Citations: 0
A systematic overview of single-cell transcriptomics databases, their use cases, and limitations.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-07-08 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1417428
Mahnoor N Gondal, Saad Ur Rehman Shah, Arul M Chinnaiyan, Marcin Cieslik

Rapid advancements in high-throughput single-cell RNA-seq (scRNA-seq) technologies and experimental protocols have led to the generation of vast amounts of transcriptomic data that populates several online databases and repositories. Here, we systematically examined large-scale scRNA-seq databases, categorizing them based on their scope and purpose such as general, tissue-specific databases, disease-specific databases, cancer-focused databases, and cell type-focused databases. Next, we discuss the technical and methodological challenges associated with curating large-scale scRNA-seq databases, along with current computational solutions. We argue that understanding scRNA-seq databases, including their limitations and assumptions, is crucial for effectively utilizing this data to make robust discoveries and identify novel biological insights. Such platforms can help bridge the gap between computational and wet lab scientists through user-friendly web-based interfaces needed for democratizing access to single-cell data. These platforms would facilitate interdisciplinary research, enabling researchers from various disciplines to collaborate effectively. This review underscores the importance of leveraging computational approaches to unravel the complexities of single-cell data and offers a promising direction for future research in the field.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11260681/pdf/
Citations: 0
DNA structural features and variability of complete MHC locus sequences.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-07-03 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1392613
Trudy M Wassenaar, Terry Harville, Jonathan Chastain, Visanu Wanchai, David W Ussery

The major histocompatibility (MHC) locus, also known as the Human Leukocyte Antigen (HLA) genes, is located on the short arm of chromosome 6, and contains three regions (Class I, Class II and Class III). This 5 Mbp locus is one of the most variable regions of the human genome, yet it also encodes a set of highly conserved and important proteins related to immunological response. Genetic variations in this region are responsible for more diseases than in the entire rest of the human genome. However, information on local structural features of the DNA is largely ignored. With recent advances in long-read sequencing technology, it is now becoming possible to sequence the entire 5 Mbp MHC locus, producing complete diploid haplotypes of the whole region. Here, we describe structural maps based on the complete sequences from six different homozygous HLA cell lines. We find long-range structural variability in the different sequences for DNA stacking energy, position preference and curvature, variation in repeats, as well as more local changes in regions forming open chromatin structures, likely to influence gene expression levels. These structural maps can be useful in visualizing large scale structural variation across HLA types, in particular when this can be complemented with epigenetic signals.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11251971/pdf/
Citations: 0
Maximum-scoring path sets on pangenome graphs of constant treewidth.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-07-01 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1391086
Broňa Brejová, Travis Gagie, Eva Herencsárová, Tomáš Vinař

We generalize a problem of finding maximum-scoring segment sets, previously studied by Csűrös (IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1, 139-150), from sequences to graphs. Namely, given a vertex-weighted graph G and a non-negative startup penalty c, we can find a set of vertex-disjoint paths in G with maximum total score when each path's score is its vertices' total weight minus c. We call this new problem maximum-scoring path sets (MSPS). We present an algorithm that has a linear-time complexity for graphs with a constant treewidth. Generalization from sequences to graphs allows the algorithm to be used on pangenome graphs representing several related genomes and can be seen as a common abstraction for several biological problems on pangenomes, including searching for CpG islands, ChIP-seq data analysis, analysis of region enrichment for functional elements, or simple chaining problems.
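
To make the objective concrete, here is a small Python sketch of the sequence special case, where the graph is a single path and the chosen paths are disjoint segments. It is not the constant-treewidth algorithm from the paper; the function name and the dynamic-programming formulation are assumptions made for illustration. Each chosen segment scores its total weight minus the startup penalty c, and the goal is to maximize the sum over all chosen segments.

```python
def max_scoring_segments(weights, c):
    """Maximum total score of disjoint segments of `weights`, where each
    segment contributes (sum of its weights) - c, with c >= 0.

    Two states are tracked while scanning left to right:
      inside  -- best score with a segment ending at the current position
      outside -- best score with the current position left uncovered
    """
    if c < 0:
        raise ValueError("startup penalty c must be non-negative")
    inside = float("-inf")  # no segment can be open before the first element
    outside = 0
    for w in weights:
        # Either extend the open segment, or start a new one and pay c.
        new_inside = max(inside + w, outside + w - c)
        # Leave this position uncovered, closing any open segment.
        new_outside = max(outside, inside)
        inside, outside = new_inside, new_outside
    return max(inside, outside)


if __name__ == "__main__":
    # Segments [2, -1, 3] and [4] score (4 - 2) + (4 - 2) = 4.
    print(max_scoring_segments([2, -1, 3, -5, 4], 2))
```

Starting a new segment immediately after another is never better than extending it when c is non-negative, so the sketch does not model that transition separately. The scan runs in time linear in the sequence length.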

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11246863/pdf/
Citations: 0
The evolution of mammalian Rem2: unraveling the impact of purifying selection and coevolution on protein function, and implications for human disorders.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-06-24 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1381540
Alexander G Lucaci, William E Brew, Jason Lamanna, Avery Selberg, Vincenzo Carnevale, Anna R Moore, Sergei L Kosakovsky Pond

Rad And Gem-Like GTP-Binding Protein 2 (Rem2), a member of the RGK family of Ras-like GTPases, is implicated in Huntington's disease and Long QT Syndrome and is highly expressed in the brain and endocrine cells. We examine the evolutionary history of Rem2 identified in various mammalian species, focusing on the role of purifying selection and coevolution in shaping its sequence and protein structural constraints. Our analysis of Rem2 sequences across 175 mammalian species found evidence for strong purifying selection in 70% of non-invariant codon sites which is characteristic of essential proteins that play critical roles in biological processes and is consistent with Rem2's role in the regulation of neuronal development and function. We inferred epistatic effects in 50 pairs of codon sites in Rem2, some of which are predicted to have deleterious effects on human health. Additionally, we reconstructed the ancestral evolutionary history of mammalian Rem2 using protein structure prediction of extinct and extant sequences which revealed the dynamics of how substitutions that change the gene sequence of Rem2 can impact protein structure in variable regions while maintaining core functional mechanisms. By understanding the selective pressures, protein- and gene - interactions that have shaped the sequence and structure of the Rem2 protein, we gain a stronger understanding of its biological and functional constraints.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11228553/pdf/
Citations: 0
Bioinformatics proficiency among African students.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-06-20 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1328714
Ashraf Akintayo Akintola, Abdullahi Tunde Aborode, Muhammed Taofiq Hamza, Augustine Amakiri, Benjamin Moore, Suliat Abdulai, Oluyinka Ajibola Iyiola, Lateef Adegboyega Sulaimon, Effiong Effiong, Adedeji Ogunyemi, Boluwatife Dosunmu, Abdulkadir Yusif Maigoro, Opeyemi Lawal, Kayode Raheem, Ui Wook Hwang

Bioinformatics, the interdisciplinary field that combines biology, computer science, and data analysis, plays a pivotal role in advancing our understanding of life sciences. In the African context, where the diversity of biological resources and healthcare challenges is substantial, fostering bioinformatics literacy and proficiency among students is important. This perspective provides an overview of the state of bioinformatics literacy among African students, highlighting the significance, challenges, and potential solutions in addressing this critical educational gap. It proposes various strategies to enhance bioinformatics literacy among African students. These include expanding educational resources, fostering collaboration between institutions, and engaging students in research projects. By addressing the current challenges and implementing comprehensive strategies, African students can harness the power of bioinformatics to contribute to innovative solutions in healthcare, agriculture, and biodiversity conservation, ultimately advancing the continent's scientific capabilities and improving the quality of life for her people. In conclusion, promoting bioinformatics literacy among African students is imperative for the continent's scientific development and advancing frontiers of biological research.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11222312/pdf/
Citations: 0
A comprehensive multi-omics analysis reveals unique signatures to predict Alzheimer's disease.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-06-19 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1390607
Michael Vacher, Rodrigo Canovas, Simon M Laws, James D Doecke

Background: Complex disorders, such as Alzheimer's disease (AD), result from the combined influence of multiple biological and environmental factors. The integration of high-throughput data from multiple omics platforms can provide system overviews, improving our understanding of complex biological processes underlying human disease. In this study, integrated data from four omics platforms were used to characterise biological signatures of AD.

Method: The study cohort consists of 455 participants (Control:148, Cases:307) from the Religious Orders Study and Memory and Aging Project (ROSMAP). Genotype (SNP), methylation (CpG), RNA and proteomics data were collected, quality-controlled and pre-processed (SNP = 130; CpG = 83; RNA = 91; Proteomics = 119). Using a diagnosis of Mild Cognitive Impairment (MCI)/AD combined as the target phenotype, we first used Partial Least Squares Regression as an unsupervised classification framework to assess the prediction capabilities for each omics dataset individually. We then used a variation of the sparse generalized canonical correlation analysis (sGCCA) to assess predictions of the combined datasets and identify multi-omics signatures characterising each group of participants.

Results: Analysing datasets individually we found methylation data provided the best predictions with an accuracy of 0.63 (95%CI = [0.54-0.71]), followed by RNA, 0.61 (95%CI = [0.52-0.69]), SNP, 0.59 (95%CI = [0.51-0.68]) and proteomics, 0.58 (95%CI = [0.51-0.67]). After integration of the four datasets, predictions were dramatically improved with a resulting accuracy of 0.95 (95% CI = [0.89-0.98]).

Conclusion: The integration of data from multiple platforms is a powerful approach to explore biological systems and better characterise the biological signatures of AD. The results suggest that integrative methods can identify biomarker panels with improved predictive performance compared to individual platforms alone. Further validation in independent cohorts is required to validate and refine the results presented in this study.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11219798/pdf/
Citations: 0
Effects of highly active antiretroviral therapy initiation on epigenomic DNA methylation in persons living with HIV
Pub Date : 2024-05-24 DOI: 10.3389/fbinf.2024.1357889
Joshua Zhang, Mary E. Sehl, Roger Shih, Elizabeth Crabb Breen, Fengxue Li, Ake T. Lu, Jay H. Bream, Priya Duggal, Jeremy Martinson, Steven M. Wolinsky, Otoniel Martinez-Maza, Christina M. Ramirez, Steve Horvath, Beth D. Jamieson
Introduction: Highly active antiretroviral therapy (HAART) helps improve some measures of accelerated epigenetic aging in persons living with HIV (PLWH), but its overall impact on the epigenome is not fully understood.
Methods: In this study, we analyzed the DNA methylation profiles of PLWH (n = 187) shortly before and approximately 2–3 years after they started HAART, as well as matched seronegative (SN) controls (n = 187), taken at two time intervals. Our aim was to identify specific CpGs and biologic pathways associated with HIV infection and initiation of HAART. Additionally, we attempted to identify epigenetic changes associated with HAART initiation that were independent of HIV-associated changes, using matched HIV seronegative (SN) controls (matched on age, hepatitis C status, and interval between visits) to identify CpGs that did not differ between PLWH and SN pre-HAART but were significantly associated with HAART initiation while being unrelated to HIV viral load. Epigenome-wide association studies (EWAS) on >850,000 CpG sites were performed using pre- and post-HAART samples from PLWH. The results were then annotated using the Genomic Regions Enrichment of Annotations Tool (GREAT).
Results: When only pre- and post-HAART visits in PLWH were compared, gene ontologies related to immune function and diseases related to immune function were significant, though with less significance for PLWH with detectable HIV viral loads (>50 copies/mL) at the post-HAART visit. To specifically elucidate the effects of HAART separately from HIV-induced methylation changes, we performed EWAS of HAART while also controlling for HIV viral load, and found gene ontologies associated with transplant rejection, transplant-related diseases, and other immunologic signatures. Additionally, we performed a more focused analysis that examined CpGs reaching genome-wide significance (p < 1 × 10⁻⁷) from the viral load-controlled EWAS that did not differ between all PLWH and matched SN controls pre-HAART. These CpGs were found to be near genes that play a role in retroviral drug metabolism, diffuse large B cell lymphoma proliferation, and gastric cancer metastasis.
Discussion: Overall, this study provides insight into potential biological functions associated with DNA methylation changes induced by HAART initiation in persons living with HIV.
Citations: 0
Human cytokine and coronavirus nucleocapsid protein interactivity using large-scale virtual screens.
Pub Date : 2024-05-24 eCollection Date: 2024-01-01 DOI: 10.3389/fbinf.2024.1397968
Phillip J Tomezsko, Colby T Ford, Avery E Meyer, Adam M Michaleas, Rafael Jaimes

Understanding the interactions between SARS-CoV-2 and the human immune system is paramount to the characterization of novel variants as the virus co-evolves with the human host. In this study, we employed state-of-the-art molecular docking tools to conduct large-scale virtual screens, predicting the binding affinities between 64 human cytokines and 17 nucleocapsid proteins from six betacoronaviruses. Our comprehensive in silico analyses reveal specific changes in cytokine-nucleocapsid protein interactions, shedding light on potential modulators of the host immune response during infection. These findings offer valuable insights into the molecular mechanisms underlying viral pathogenesis and may guide the future development of targeted interventions. This manuscript also compares the deep learning-based AlphaFold2-Multimer with the semi-physicochemical HADDOCK for protein-protein docking. We show that the two methods are complementary in their predictive capabilities. We also introduce a novel algorithm for rapidly assessing the binding interface of protein-protein docks using graph edit distance: the graph-based interface residue assessment function (GIRAF). The high-performance computational framework presented here will not only aid in accelerating the discovery of effective interventions against emerging viral threats but also extend to other applications of high-throughput protein-protein screens.

Citations: 0
Decreased but persistent epigenetic age acceleration is associated with changes in T-cell subsets after initiation of highly active antiretroviral therapy in persons living with HIV
Pub Date : 2024-05-24 DOI: 10.3389/fbinf.2024.1356509
Mary E. Sehl, Elizabeth Crabb Breen, Roger Shih, Fengxue Li, Joshua Zhang, Peter Langfelder, Steve Horvath, Jay H. Bream, Priya Duggal, Jeremy Martinson, Steven M. Wolinsky, Otoniel Martinez-Maza, Christina M. Ramirez, Beth D. Jamieson
Persons living with HIV (PLWH) experience the early onset of age-related illnesses, even in the setting of successful human immunodeficiency virus (HIV) suppression with highly active antiretroviral therapy (HAART). HIV infection is associated with accelerated epigenetic aging as measured using DNA methylation (DNAm)-based estimates of biological age and of telomere length (TL).
DNAm levels (Infinium MethylationEPIC BeadChip) from peripheral blood mononuclear cells from 200 PLWH and 199 HIV-seronegative (SN) participants matched on chronologic age, hepatitis C virus, and time intervals were used to calculate epigenetic age acceleration, expressed as age-adjusted acceleration residuals from 4 epigenetic clocks [Horvath’s pan-tissue age acceleration residual (AAR), extrinsic epigenetic age acceleration (EEAA), phenotypic epigenetic age acceleration (PEAA), and grim epigenetic age acceleration (GEAA)] plus age-adjusted DNAm-based TL (aaDNAmTL). Epigenetic age acceleration was compared for PLWH and SN participants at two visits: up to 1.5 years before and 2–3 years after HAART initiation (or equivalent visits). Flow cytometry was performed in PLWH and SN participants at both visits to evaluate T-cell subsets.
Epigenetic age acceleration in PLWH decreased after the initiation of HAART but remained greater post-HAART than that in age-matched SN participants, with differences in medians of 6.6, 9.1, and 7.7 years for AAR, EEAA, and PEAA, respectively, and 0.39 units of aaDNAmTL shortening (all p < 0.001). Cumulative HIV viral load after HAART initiation was associated with some epigenetic acceleration (EEAA, PEAA, and aaDNAmTL), but even PLWH with undetectable HIV post-HAART showed persistent epigenetic age acceleration compared to SN participants (p < 0.001). AAR, EEAA, and aaDNAmTL showed significant associations with total, naïve, and senescent CD8 T-cell counts; the total CD4 T-cell counts were associated with AAR, EEAA, and PEAA (p = 0.04 to <0.001). In an epigenome-wide analysis using weighted gene co-methylation network analyses, 11 modules demonstrated significant DNAm differences pre- to post-HAART initiation. Of these, nine were previously identified as significantly different from pre- to post-HIV infection but in the opposite direction.
In this large longitudinal study, we demonstrated that, although the magnitude of the difference decreases with HAART and is associated with the cumulative viral load, PLWH are persistently epigenetically older than age-matched SN participants even after the successful initiation of HAART, and these changes are associated with changes in T-cell subsets.
Citations: 1