Frontiers in bioinformatics: latest articles

New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-10-12 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1227193
Kejue Jia, Mesih Kilinc, Robert L Jernigan

Understanding protein sequences and how they relate to protein function is extremely important. Sequence alignment is one of the most basic operations in bioinformatics, and usually the first thing learned from an alignment is which positions are most conserved; these are often critical parts of the structure, such as enzyme active-site residues. In addition, the contact pairs in a protein usually correspond closely to the correlations between residue positions in the multiple sequence alignment, and these pairs usually change in a systematic and coordinated way: if one position changes, the other member of the pair also changes to compensate. In the present work, these correlated pairs are taken as anchor points for a new type of sequence alignment. The main advantage of the method is that it combines the remote homolog detection of our method PROST with the pairwise sequence substitutions of the rigorous method of Kleinjung et al. We show a few examples of the resulting sequence alignments and how they can lead to improved alignments for function, even for a disordered protein.
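
The abstract does not say how the correlations between residue positions are measured. As a minimal sketch of the general idea only, the snippet below scores column pairs of a toy alignment with mutual information, a common covariation measure that is assumed here and is not necessarily what the authors use. The function names and the four-sequence `toy_msa` are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(column(msa, i))
    count_j = Counter(column(msa, j))
    count_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 1 and 3 change together (K/E vs R/D),
# while columns 0 and 2 are fully conserved.
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD"]

covarying = mutual_information(toy_msa, 1, 3)   # perfectly coupled columns
conserved = mutual_information(toy_msa, 0, 2)   # invariant columns carry no signal
```

High-scoring column pairs are the kind of coupled positions that could serve as candidate anchors. Note that fully conserved columns score zero under this measure, so conservation and covariation are complementary signals.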

Citations: 0
VariantSurvival: a tool to identify genotype-treatment response.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-10-11 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1277923
Thomas Krannich, Marina Herrera Sarrias, Hiba Ben Aribi, Moustafa Shokrof, Alfredo Iacoangeli, Ammar Al-Chalabi, Fritz J Sedlazeck, Ben Busby, Ahmad Al Khleifat

Motivation: For a number of neurological diseases, such as Alzheimer's disease, amyotrophic lateral sclerosis, and many others, certain genes are known to be involved in the disease mechanism. A common question is whether a structural variant in any such gene may be related to drug response in clinical trials and how this relationship can contribute to the lifecycle of drug development. Results: To this end, we introduce VariantSurvival, a tool that identifies changes in survival relative to structural variants within target genes. VariantSurvival matches annotated structural variants with genes that are clinically relevant to neurological diseases. A Cox regression model determines the change in survival between the placebo and clinical trial groups with respect to the number of structural variants in the drug target genes. We demonstrate the functionality of our approach with the exemplary case of the SETX gene. VariantSurvival has a user-friendly and lightweight graphical user interface built on the shiny web application package.
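
To make the Cox regression step concrete, here is a minimal pure-Python sketch of the partial log-likelihood for a single covariate (the structural-variant count). It is an illustration under simplifying assumptions, not VariantSurvival's implementation: there are no tied event times, no treatment-group term, and the coefficient is found by a crude grid search. All names and the toy values are hypothetical and made up for the example.

```python
from math import exp, log


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied times."""
    ll = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        ll += beta * x[i] - log(risk)
    return ll


# Made-up toy data: follow-up time, event flag (1 = event, 0 = censored),
# and number of structural variants in a target gene.
times = [2, 3, 5, 7, 8, 11]
events = [1, 1, 0, 1, 1, 0]
sv_count = [2, 1, 2, 0, 1, 0]

# Crude grid search for the coefficient that maximizes the partial likelihood.
grid = [k / 10 for k in range(-30, 31)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, times, events, sv_count))
hazard_ratio = exp(beta_hat)  # multiplicative change in hazard per extra variant
```

In practice a dedicated survival-analysis package with proper optimization, tie handling, and confidence intervals would be used; the sketch only shows what quantity such a model maximizes.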

Citations: 0
DeepRaccess: high-speed RNA accessibility prediction using deep learning.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-10-10 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1275787
Kaisei Hara, Natsuki Iwano, Tsukasa Fukunaga, Michiaki Hamada

RNA accessibility is a useful RNA secondary structural feature for predicting RNA-RNA interactions and translation efficiency in prokaryotes. However, conventional accessibility calculation tools, such as Raccess, are computationally expensive and require considerable computational time to perform transcriptome-scale analysis. In this study, we developed DeepRaccess, which predicts RNA accessibility based on deep learning methods. DeepRaccess was trained to take artificial RNA sequences as input and to predict the accessibility of these sequences as calculated by Raccess. Simulation and empirical dataset analyses showed that the accessibility predicted by DeepRaccess was highly correlated with the accessibility calculated by Raccess. In addition, we confirmed that DeepRaccess could predict protein abundance in E. coli with moderate accuracy from the sequences around the start codon. We also demonstrated that DeepRaccess achieved a software speed-up of tens to hundreds of times in a GPU environment. The source code and the trained models of DeepRaccess are freely available at https://github.com/hmdlab/DeepRaccess.

Citations: 0
Artificial intelligence in imaging flow cytometry.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-10-09 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1229052
Paolo Pozzi, Alessia Candeo, Petra Paiè, Francesca Bragheri, Andrea Bassi
Citations: 0
NeighborNet: improved algorithms and implementation.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-20 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1178600
David Bryant, Daniel H Huson

NeighborNet constructs phylogenetic networks to visualize distance data. It is a popular method used in a wide range of applications. While several studies have investigated its mathematical features, here we focus on computational aspects. The algorithm operates in three steps. We present a new simplified formulation of the first step, which aims at computing a circular ordering. We provide the first technical description of the second step, the estimation of split weights. We review the third step, constructing and drawing the network. Finally, we discuss how the networks might best be interpreted, review related approaches, and present some open questions.
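
The link between the first two steps can be sketched as follows: once a circular ordering of the taxa is fixed, the candidate splits are those that cut the circle into two contiguous arcs, and the second step assigns a weight to each of them. The snippet below only enumerates those candidate splits; it does not compute the ordering or the weights, and the function name and five-taxon ordering are hypothetical.

```python
def circular_splits(ordering):
    """Enumerate the splits compatible with a circular ordering.

    Each split is represented by the side that does not contain ordering[0],
    so every split appears exactly once. Trivial splits (a single taxon
    against the rest) are included.
    """
    n = len(ordering)
    splits = []
    for start in range(1, n):
        for end in range(start + 1, n + 1):
            splits.append(frozenset(ordering[start:end]))
    return splits


# Hypothetical circular ordering of five taxa.
ordering = ["A", "B", "C", "D", "E"]
splits = circular_splits(ordering)
n_splits = len(splits)  # n * (n - 1) / 2 candidate splits for n taxa
```

Because only contiguous arcs qualify, a split such as {B, D} versus {A, C, E} is excluded under this ordering, which is what keeps the number of candidate splits quadratic rather than exponential in the number of taxa.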

Citations: 0
VariBench, new variation benchmark categories and data sets.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-19 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1248732
Niloofar Shirvanizadeh, Mauno Vihinen
Citations: 0
Corrigendum: A review on deep learning applications in highly multiplexed tissue imaging data analysis.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-13 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1287407
Mohammed Zidane, Ahmad Makky, Matthias Bruhns, Alexander Rochwarger, Sepideh Babaei, Manfred Claassen, Christian M Schürch

[This corrects the article DOI: 10.3389/fbinf.2023.1159381.].

Citations: 0
Editorial: Recent advances in peptide informatics: challenges and opportunities.
Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-12 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1271932
Rahul Kumar, Kumardeep Chaudhary, Sandeep Kumar Dhanda
Peptide informatics is a rapidly growing field that is at the intersection of bioinformatics, chemistry, and biology. Peptides are short chains of amino acids that play important roles in a wide variety of biological processes, such as protein folding, signal transduction, and immune function. Peptide informatics is the use of computational methods to study peptides and their sequence, structure, function, and interactions. Recent advances in peptide informatics have led to a number of new discoveries and applications. For example, new methods have been developed to predict the structure of peptides, which can be used to design new drugs and therapies. New methods for identifying peptide-protein interactions have also been introduced, which can be used to understand the molecular basis of disease.
Citations: 0
The origin of eukaryotes and rise in complexity were synchronous with the rise in oxygen.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-09-01 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1233281
Jack M Craig, Sudhir Kumar, S Blair Hedges

The origin of eukaryotes was among the most important events in the history of life, spawning a new evolutionary lineage that led to all complex multicellular organisms. However, the timing of this event, crucial for understanding its environmental context, has been difficult to establish. The fossil and biomarker records are sparse and molecular clocks have thus far not reached a consensus, with dates spanning 2.1-0.91 billion years ago (Ga) for critical nodes. Notably, molecular time estimates for the last common ancestor of eukaryotes are typically hundreds of millions of years younger than the Great Oxidation Event (GOE, 2.43-2.22 Ga), leading researchers to question the presumptive link between eukaryotes and oxygen. We obtained a new time estimate for the origin of eukaryotes using genetic data of both archaeal and bacterial origin, the latter rarely used in past studies. We also avoided potential calibration biases that may have affected earlier studies. We obtained a conservative interval of 2.2-1.5 Ga, with an even narrower core interval of 2.0-1.8 Ga, for the origin of eukaryotes, a period closely aligned with the rise in oxygen. We further reconstructed the history of biological complexity across the tree of life using three universal measures: cell types, genes, and genome size. We found that the rise in complexity was temporally consistent with and followed a pattern similar to the rise in oxygen. This suggests a causal relationship stemming from the increased energy needs of complex life fulfilled by oxygen.

Citations: 0
Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2023-08-10 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1211819
Wanxin Li, Jules Mirone, Ashok Prasad, Nina Miolane, Carine Legrand, Khanh Dao Duc

Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, that allows orthogonal outliers to be detected and dimensionality to be subsequently reduced. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization.
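
The geometric intuition behind using simplices can be illustrated in the simplest case, where the data lie on a line and the simplex is a triangle. The sketch below is not the DeCOr-MDS algorithm; it is an assumed toy version of the idea. It computes, from pairwise distances alone, the height of each point above the line through two other points, and uses the median height over all such pairs as an outlier score. The function names and the six toy points are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt
from statistics import median


def height_above_line(p, a, b):
    """Height of p above the line through a and b, using only distances."""
    d_ab, d_pa, d_pb = dist(a, b), dist(p, a), dist(p, b)
    s = (d_ab + d_pa + d_pb) / 2
    # Heron's formula for the squared triangle area, clipped at zero
    # to guard against tiny negative values from rounding.
    area_sq = max(s * (s - d_ab) * (s - d_pa) * (s - d_pb), 0.0)
    return 2 * sqrt(area_sq) / d_ab


def median_height(points, i):
    """Median height of point i over all pairs of the remaining points."""
    others = [q for k, q in enumerate(points) if k != i]
    return median(height_above_line(points[i], a, b)
                  for a, b in combinations(others, 2))


# Five points on a line in 3D plus one point displaced orthogonally to it.
points = [(float(x), 0.0, 0.0) for x in range(5)] + [(1.5, 0.0, 2.0)]
scores = [median_height(points, i) for i in range(len(points))]
outlier_index = max(range(len(points)), key=lambda i: scores[i])
```

Taking the median rather than the mean keeps the inliers' scores near zero even though some of their triangles include the outlier. The same construction extends to higher-dimensional simplices when the underlying subspace has more than one dimension.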

Citations: 0