Frontiers in Bioinformatics: latest articles

Protein-lipid interactions and protein anchoring modulate the modes of association of the globular domain of the Prion protein and Doppel protein to model membrane patches
Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1321287
Patricia Soto, Davis T. Thalhuber, Frank Luceri, Jamie Janos, Mason R. Borgman, Noah M. Greenwood, Sofia Acosta, Hunter Stoffel
The Prion protein is the molecular hallmark of the incurable prion diseases affecting mammals, including humans. The protein-only hypothesis states that the misfolding, accumulation, and deposition of the Prion protein play a critical role in toxicity. The cellular Prion protein (PrPC) anchors to the extracellular leaflet of the plasma membrane and prefers cholesterol- and sphingomyelin-rich membrane domains. Conformational Prion protein conversion into the pathological isoform happens on the cell surface. In vitro and in vivo experiments indicate that Prion protein misfolding, aggregation, and toxicity are sensitive to the lipid composition of plasma membranes and vesicles. A picture of the underlying biophysical driving forces that explain the effect of Prion protein - lipid interactions in physiological conditions is needed to develop a structural model of Prion protein conformational conversion. To this end, we use molecular dynamics simulations that mimic the interactions between the globular domain of PrPC anchored to model membrane patches. In addition, we also simulate the Doppel protein anchored to such membrane patches. The Doppel protein is the closest in the phylogenetic tree to PrPC, localizes in an extracellular milieu similar to that of PrPC, and exhibits a similar topology to PrPC even if the amino acid sequence is only 25% identical. Our simulations show that specific protein-lipid interactions and conformational constraints imposed by GPI anchoring together favor specific binding sites in globular PrPC but not in Doppel. Interestingly, the binding sites we found in PrPC correspond to prion protein loops, which are critical in aggregation and prion disease transmission barrier (β2-α2 loop) and in initial spontaneous misfolding (α2-α3 loop). We also found that the membrane re-arranges locally to accommodate protein residues inserted in the membrane surface as a response to protein binding.
Citations: 0
RxNorm for drug name normalization: a case study of prescription opioids in the FDA adverse events reporting system
Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1328613
Huyen Le, Ru Chen, Stephen Harris, Hong Fang, Beverly Lyn-Cook, H. Hong, W. Ge, Paul Rogers, Weida Tong, Wen Zou
Numerous studies have been conducted on the US Food and Drug Administration (FDA) Adverse Events Reporting System (FAERS) database to assess post-marketing reporting rates for drug safety review and risk assessment. However, the drug names in the adverse event (AE) reports from FAERS were heterogeneous due to a lack of uniformity of information submitted mandatorily by pharmaceutical companies and voluntarily by patients, healthcare professionals, and the public. Studies using FAERS and other spontaneous-reporting AE databases without drug name normalization may collect AE reports incompletely because of non-standard drug names, and the accuracy of their results may suffer. In this study, we demonstrated the applicability of RxNorm, developed by the National Library of Medicine, for drug name normalization in FAERS. Using prescription opioids as a case study, we used the RxNorm application programming interface (API) to map all FDA-approved prescription opioids described in FAERS AE reports to their equivalent RxNorm Concept Unique Identifiers (RxCUIs) and RxNorm names. The different names of the opioids were then extracted, and their usage frequencies were calculated in a collection of more than 14.9 million AE reports for 13 FDA-approved prescription opioid classes, reported over 17 years. The results showed that a significant number of different names were consistently used for opioids in FAERS reports, with 2,086 different names (out of 7,892) used at least three times and 842 different names used at least ten times for each of the 92 RxNorm names of FDA-approved opioids. Our method of using RxNorm API mapping was confirmed to be efficient and accurate and capable of reducing the heterogeneity of prescription opioid names significantly in the AE reports in FAERS; meanwhile, it is expected to have a broad application to different sets of drug names from any database where drug names are diverse and unnormalized.
It is expected to be able to automatically standardize and link different representations of the same drugs to build an intact and high-quality database for diverse research, particularly postmarketing data analysis in pharmacovigilance initiatives.
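The grouping-and-counting step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: `resolve` is a hypothetical callable standing in for an RxNorm API lookup (returning an RxCUI and RxNorm name, or `None`), and the toy lookup table and its identifiers are invented placeholders.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical resolver type: verbatim drug name -> (RxCUI, RxNorm name) or None.
Resolver = Callable[[str], Optional[Tuple[str, str]]]


def group_name_variants(reported_names: Iterable[str], resolve: Resolver):
    """Group verbatim drug names by the normalized concept they resolve to.

    Returns {(rxcui, rxnorm_name): Counter({verbatim_name: frequency})}.
    Names that cannot be resolved are collected under the key None.
    """
    variants = defaultdict(Counter)
    for raw in reported_names:
        # Trim, lowercase, and collapse repeated whitespace before lookup.
        cleaned = " ".join(raw.strip().lower().split())
        variants[resolve(cleaned)][cleaned] += 1
    return dict(variants)


def names_used_at_least(variants, min_count: int) -> int:
    """Count distinct resolved verbatim names that occur at least min_count times."""
    return sum(
        1
        for concept, counts in variants.items()
        if concept is not None
        for freq in counts.values()
        if freq >= min_count
    )


# Toy stand-in for an API call; "CUI-A"/"CUI-B" are placeholders, not real RxCUIs.
_TOY_TABLE = {
    "oxycodone": ("CUI-A", "oxycodone"),
    "oxycodone hcl": ("CUI-A", "oxycodone"),
    "oxycontin": ("CUI-A", "oxycodone"),
    "morphine sulfate": ("CUI-B", "morphine"),
}


def toy_resolve(name: str):
    return _TOY_TABLE.get(name)


reports = ["Oxycodone", "OXYCODONE HCL", "oxycodone  hcl",
           "OxyContin", "Morphine Sulfate", "unknown drug"]
grouped = group_name_variants(reports, toy_resolve)
```

Passing the resolver in as a parameter keeps the counting logic testable offline; in practice it would wrap whatever RxNorm lookup is actually used, ideally with caching so each distinct name is resolved once.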
Citations: 0
Editorial: Expert opinions in protein bioinformatics: 2022
Pub Date : 2024-01-05 DOI: 10.3389/fbinf.2023.1338560
Daisuke Kihara
Citations: 0
Genomic risk prediction of cardiovascular diseases among type 2 diabetes patients in the UK Biobank
Pub Date : 2024-01-04 DOI: 10.3389/fbinf.2023.1320748
Yixuan Ye, Jiaqi Hu, Fuyuan Pang, Can Cui, Hongyu Zhao
Background: Polygenic risk score (PRS) has proved useful in predicting the risk of cardiovascular diseases (CVD) based on the genotypes of an individual, but most analyses have focused on disease onset in the general population. The usefulness of PRS to predict CVD risk among type 2 diabetes (T2D) patients remains unclear. Methods: We built a meta-PRSCVD upon the candidate PRSs developed from state-of-the-art PRS methods for three CVD subtypes of significant importance: coronary artery disease (CAD), ischemic stroke (IS), and heart failure (HF). To evaluate the prediction performance of the meta-PRSCVD, we restricted our analysis to 21,092 white British T2D patients in the UK Biobank, among which 4,015 had CVD events. Results: Results showed that the meta-PRSCVD was significantly associated with CVD risk with a hazard ratio per standard deviation increase of 1.28 (95% CI: 1.23–1.33). The meta-PRSCVD alone predicted the CVD incidence with an area under the receiver operating characteristic curve (AUC) of 0.57 (95% CI: 0.54–0.59). When restricted to the early-onset patients (onset age ≤ 55), the AUC was further increased to 0.61 (95% CI 0.56–0.67). Conclusion: Our results highlight the potential role of genomic screening for secondary preventions of CVD among T2D patients, especially among early-onset patients.
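To make the two reported metrics concrete, here is a small sketch of what they measure. It is illustrative only: the scores and labels are made up, the function names are hypothetical, and reading the per-SD hazard ratio multiplicatively assumes a log-linear (Cox-type) effect, which the abstract does not spell out.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def hazard_ratio_for_shift(hr_per_sd: float, n_sd: float) -> float:
    """Hazard ratio implied by an n_sd shift in the score (log-linear assumption)."""
    return hr_per_sd ** n_sd


# Made-up risk scores for two non-cases (label 0) and two cases (label 1).
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Under the log-linear assumption, a 2-SD higher score with HR 1.28 per SD.
two_sd_hr = hazard_ratio_for_shift(1.28, 2)
```

An AUC of 0.5 corresponds to random ranking, which puts the reported 0.57 and 0.61 in context as modest discrimination from the genetic score alone.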
Citations: 0
Identifying epigenetic aging moderators using the epigenetic pacemaker
Pub Date : 2024-01-03 DOI: 10.3389/fbinf.2023.1308680
Colin Farrell, Chanyue Hu, Kalsuda Lapborisuth, Kyle Pu, S. Snir, Matteo Pellegrini
Epigenetic clocks are DNA methylation-based chronological age prediction models that are commonly employed to study age-related biology. The difference between the predicted and observed age is often interpreted as a form of biological age acceleration, and many studies have measured the impact of environmental and disease-associated factors on epigenetic age. Most epigenetic clocks are fit using approaches that minimize the error between the predicted and observed chronological age, and as a result, they may not accurately model the impact of factors that moderate the relationship between the actual and epigenetic age. Here, we compare epigenetic clocks that are constructed using penalized regression methods to an evolutionary framework of epigenetic aging with the epigenetic pacemaker (EPM), which directly models DNA methylation as a function of a time-dependent epigenetic state. In simulations, we show that the value of the epigenetic state is impacted by factors such as age, sex, and cell-type composition. Next, in a dataset aggregated from previous studies, we show that the epigenetic state is also moderated by sex and the cell type. Finally, we demonstrate that the epigenetic state is also moderated by toxins in a study on polybrominated biphenyl exposure. Thus, we find that the pacemaker provides a robust framework for the study of factors that impact epigenetic age acceleration and that the effect of these factors may be obscured in traditional clocks based on linear regression models.
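The contrast with regression clocks is easier to see with a toy version of the pacemaker idea. The sketch below assumes a linear form, methylation at site i in individual j ≈ m0_i + r_i · s_j, and shows only the step that estimates each individual's epigenetic state s_j by least squares with the site parameters held fixed. The linear form, the function name, and the toy numbers are assumptions of this sketch, not details taken from the abstract.

```python
from typing import List, Sequence


def estimate_states(
    meth: Sequence[Sequence[float]],  # meth[i][j]: methylation at site i, individual j
    rates: Sequence[float],           # r_i: per-site rate of change
    intercepts: Sequence[float],      # m0_i: per-site starting methylation
) -> List[float]:
    """Least-squares epigenetic state s_j per individual, site parameters fixed.

    Minimizes sum_i (meth[i][j] - m0_i - r_i * s_j)^2 over s_j, which gives
    s_j = sum_i r_i * (meth[i][j] - m0_i) / sum_i r_i^2.
    """
    denom = sum(r * r for r in rates)
    if denom == 0:
        raise ValueError("at least one site must have a non-zero rate")
    n_individuals = len(meth[0])
    return [
        sum(r * (row[j] - m0) for row, r, m0 in zip(meth, rates, intercepts)) / denom
        for j in range(n_individuals)
    ]


# Noise-free toy data: three sites, two individuals with states 2.0 and 5.0.
rates = [0.01, -0.02, 0.005]
intercepts = [0.2, 0.9, 0.5]
true_states = [2.0, 5.0]
meth = [[m0 + r * s for s in true_states] for r, m0 in zip(rates, intercepts)]
states = estimate_states(meth, rates, intercepts)
```

Because the state is fit to the methylation data rather than to chronological age, it is free to differ from age, which is what allows moderating factors such as sex or exposure to show up in it.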
Citations: 0
Improving fluorescence lifetime imaging microscopy phasor accuracy using convolutional neural networks
Pub Date : 2023-12-22 DOI: 10.3389/fbinf.2023.1335413
Varun Mannam, Jacob P. Brandt, Cody J. Smith, Xiaotong Yuan, S. Howard
Introduction: Although a powerful biological imaging technique, fluorescence lifetime imaging microscopy (FLIM) faces challenges such as a slow acquisition rate, a low signal-to-noise ratio (SNR), and high cost and complexity. To address the fundamental problem of low SNR in FLIM images, we demonstrate how to use pre-trained convolutional neural networks (CNNs) to reduce noise in FLIM measurements. Methods: Our approach uses pre-learned models that have been previously validated on large datasets with different distributions than the training datasets, such as sample structures, noise distributions, and microscopy modalities in fluorescence microscopy, to eliminate the need to train a neural network from scratch or to acquire a large training dataset to denoise FLIM data. In addition, we are using the pre-trained networks in the inference stage, where the computation time is in milliseconds and accuracy is better than traditional denoising methods. To separate different fluorophores in lifetime images, the denoised images are then run through an unsupervised machine learning technique named “K-means clustering”. Results and Discussion: The results of the experiments carried out on in vivo mouse kidney tissue, Bovine pulmonary artery endothelial (BPAE) fixed cells that have been fluorescently labeled, and mouse kidney fixed samples that have been fluorescently labeled show that our demonstrated method can effectively remove noise from FLIM images and improve segmentation accuracy. Additionally, the performance of our method on out-of-distribution highly scattering in vivo plant samples shows that it can also improve SNR in challenging imaging conditions. Our proposed method provides a fast and accurate way to segment fluorescence lifetime images captured using any FLIM system. It is especially effective for separating fluorophores in noisy FLIM images, which is common in in vivo imaging where averaging is not applicable.
Our approach significantly improves the identification of vital biologically relevant structures in biomedical imaging applications.
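For readers unfamiliar with the phasor representation named in the title, the sketch below computes the standard first-harmonic phasor coordinates (g, s) of a time-domain decay histogram and compares them with the analytic values for a single-exponential decay. It is a generic illustration under simple assumptions (one laser period per histogram, no instrument response, no noise); the function names and numbers are hypothetical and it says nothing about the denoising networks themselves.

```python
import math
from typing import Sequence, Tuple


def phasor(decay: Sequence[float], period_ns: float) -> Tuple[float, float]:
    """First-harmonic phasor (g, s) of a decay histogram spanning one laser period."""
    total = sum(decay)
    if total <= 0:
        raise ValueError("decay histogram has no photons")
    omega = 2.0 * math.pi / period_ns  # angular repetition frequency
    dt = period_ns / len(decay)        # width of one time bin
    g = sum(c * math.cos(omega * k * dt) for k, c in enumerate(decay)) / total
    s = sum(c * math.sin(omega * k * dt) for k, c in enumerate(decay)) / total
    return g, s


def ideal_phasor(tau_ns: float, period_ns: float) -> Tuple[float, float]:
    """Analytic phasor of a single-exponential decay with lifetime tau_ns."""
    wt = 2.0 * math.pi * tau_ns / period_ns
    return 1.0 / (1.0 + wt * wt), wt / (1.0 + wt * wt)


# Noise-free single-exponential decay: 2.5 ns lifetime, 12.5 ns period, 1000 bins.
period = 12.5
tau = 2.5
bins = 1000
decay = [math.exp(-(k * period / bins) / tau) for k in range(bins)]
g, s = phasor(decay, period)
g_ref, s_ref = ideal_phasor(tau, period)
```

With few photons per pixel the measured (g, s) points scatter around their ideal positions, so pixels from different fluorophores overlap in phasor space; reducing that scatter is what makes a subsequent clustering step such as K-means more reliable.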
Citations: 0
Attention network for predicting T-cell receptor-peptide binding can associate attention with interpretable protein structural properties.
Pub Date : 2023-12-18 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1274599
Kyohei Koyama, Kosuke Hashimoto, Chioko Nagao, Kenji Mizuguchi

Understanding how a T-cell receptor (TCR) recognizes its specific ligand peptide is crucial for gaining an insight into biological functions and disease mechanisms. Despite its importance, experimentally determining TCR-peptide-major histocompatibility complex (TCR-pMHC) interactions is expensive and time-consuming. To address this challenge, computational methods have been proposed, but they are typically evaluated by internal retrospective validation only, and few researchers have incorporated and tested an attention layer from language models into structural information. Therefore, in this study, we developed a machine learning model based on a modified version of Transformer, a source-target attention neural network, to predict the TCR-pMHC interaction solely from the amino acid sequences of the TCR complementarity-determining region (CDR) 3 and the peptide. This model achieved competitive performance on a benchmark dataset of the TCR-pMHC interaction, as well as on a truly new external dataset. Additionally, by analyzing the results of binding predictions, we associated the neural network weights with protein structural properties. By classifying the residues into large- and small-attention groups, we identified statistically significant properties associated with the largely attended residues such as hydrogen bonds within CDR3. The dataset that we created and the ability of our model to provide an interpretable prediction of TCR-peptide binding should increase our knowledge about molecular recognition and pave the way for designing new therapeutics.
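The idea of reading residue-level attention out of a source-target attention layer can be illustrated with plain scaled dot-product attention. The sketch below is generic, not the authors' modified Transformer: which sequence supplies queries and which supplies keys, the two-dimensional toy embeddings, and the function names are all assumptions made for the example.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def cross_attention_weights(
    queries: Sequence[Vector], keys: Sequence[Vector]
) -> List[List[float]]:
    """Scaled dot-product attention weights; row q is query q's distribution over keys."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


def attention_received(weights: Sequence[Sequence[float]]) -> List[float]:
    """Total attention each key position receives, summed over all queries."""
    return [sum(row[j] for row in weights) for j in range(len(weights[0]))]


# Toy embeddings: two peptide positions as queries, three CDR3 positions as keys.
peptide = [[1.0, 0.0], [0.0, 1.0]]
cdr3 = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
weights = cross_attention_weights(peptide, cdr3)
received = attention_received(weights)
```

Splitting positions into large- and small-attention groups then amounts to thresholding a per-residue summary such as `received`, after which structural properties of the two groups can be compared.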

Citations: 0
Efficient semi-supervised semantic segmentation of electron microscopy cancer images with sparse annotations.
Pub Date : 2023-12-15 eCollection Date: 2023-01-01 DOI: 10.3389/fbinf.2023.1308707
Lucas Pagano, Guillaume Thibault, Walid Bousselham, Jessica L Riesterer, Xubo Song, Joe W Gray

Electron microscopy (EM) enables imaging at a resolution of nanometers and can shed light on how cancer evolves to develop resistance to therapy. Acquiring these images has become a routine task. However, analyzing them is now a bottleneck, as manual structure identification is very time-consuming and can take up to several months for a single sample. Deep learning approaches offer a suitable solution to speed up the analysis. In this work, we present a study of several state-of-the-art deep learning models for the task of segmenting nuclei and nucleoli in volumes from tumor biopsies. We compared previous results obtained with the ResUNet architecture to the more recent UNet++, FracTALResNet, SenFormer, and CEECNet models. In addition, we explored the utilization of unlabeled images through semi-supervised learning with Cross Pseudo Supervision. We have trained and evaluated all of the models on sparse manual labels from three fully annotated in-house datasets that we have made available on demand, demonstrating improvements in terms of 3D Dice score. From the analysis of these results, we drew conclusions on the relative gains of using more complex models and semi-supervised learning, as well as the next steps for the mitigation of the manual segmentation bottleneck.
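
The 3D Dice score used as the evaluation metric above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `flatten` and `dice_score`, the nested-list volume layout (z, y, x), and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from itertools import chain


def flatten(volume):
    """Flatten a 3D nested list indexed as (z, y, x) into a flat list of voxels."""
    return list(chain.from_iterable(chain.from_iterable(volume)))


def dice_score(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|), computed over all voxels of two binary volumes."""
    p, t = flatten(pred), flatten(truth)
    if len(p) != len(t):
        raise ValueError("volumes must have the same number of voxels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x2x2 volumes (hypothetical data, not from the study).
pred = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
truth = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
score = dice_score(pred, truth)
```

In the toy example, each mask has three foreground voxels and they share two, so the score is 2 * 2 / (3 + 3). With sparse labels, one would presumably restrict the computation to the annotated slices only; that restriction is not shown here.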

Citations: 0
Segmentation of cellular ultrastructures on sparsely labeled 3D electron microscopy images using deep learning
Pub Date : 2023-12-15 DOI: 10.3389/fbinf.2023.1308708
Archana Machireddy, Guillaume Thibault, Kevin G. Loftis, Kevin Stoltz, Cecilia Bueno, Hannah R. Smith, J. Riesterer, Joe W. Gray, Xubo Song
Focused ion beam-scanning electron microscopy (FIB-SEM) images can provide a detailed view of the cellular ultrastructure of tumor cells. A deeper understanding of their organization and interactions can shed light on cancer mechanisms and progression. However, the bottleneck in the analysis is the delineation of the cellular structures to enable quantitative measurements and analysis. We mitigated this limitation using deep learning to segment cells and subcellular ultrastructure in 3D FIB-SEM images of tumor biopsies obtained from patients with metastatic breast and pancreatic cancers. The ultrastructures, such as nuclei, nucleoli, mitochondria, endosomes, and lysosomes, are relatively better defined than their surroundings and can be segmented with high accuracy using a neural network trained with sparse manual labels. Cell segmentation, on the other hand, is much more challenging due to the lack of clear boundaries separating cells in the tissue. We adopted a multi-pronged approach combining detection, boundary propagation, and tracking for cell segmentation. Specifically, a neural network was employed to detect the intracellular space; optical flow was used to propagate cell boundaries across the z-stack from the nearest ground truth image in order to facilitate the separation of individual cells; finally, the filopodium-like protrusions were tracked to the main cells by calculating the intersection over union measure for all regions detected in consecutive images along the z-stack and connecting regions with maximum overlap. The proposed cell segmentation methodology resulted in an average Dice score of 0.93. For nuclei, nucleoli, and mitochondria, the segmentation achieved Dice scores of 0.99, 0.98, and 0.86, respectively. The segmentation of FIB-SEM images will enable interpretative rendering and provide quantitative image features to be associated with relevant clinical variables.
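
The tracking step described above (linking regions in consecutive z-slices by maximum intersection over union) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: regions are represented as sets of (y, x) pixel coordinates, and the names `iou`, `link_regions`, and the region ids are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two sets of (y, x) pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def link_regions(prev_regions, next_regions):
    """Map each region id in the next slice to the previous-slice region id
    with the highest IoU, or to None when no region overlaps it."""
    links = {}
    for next_id, next_pixels in next_regions.items():
        best_id, best_iou = None, 0.0
        for prev_id, prev_pixels in prev_regions.items():
            score = iou(prev_pixels, next_pixels)
            if score > best_iou:
                best_id, best_iou = prev_id, score
        links[next_id] = best_id
    return links


# Toy regions in two consecutive slices (hypothetical data).
slice_z = {
    "cell_A": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "cell_B": {(5, 5), (5, 6)},
}
slice_z_plus_1 = {
    "r1": {(1, 1), (1, 2)},          # small protrusion touching cell_A
    "r2": {(5, 5), (5, 6), (6, 6)},  # mostly overlaps cell_B
    "r3": {(9, 9)},                  # overlaps nothing
}
links = link_regions(slice_z, slice_z_plus_1)
```

Here `r1` is attached to `cell_A` (IoU 1/5), `r2` to `cell_B` (IoU 2/3), and `r3` stays unlinked. Applying the same linking slice by slice would follow a protrusion back to its main cell body.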
Citations: 0
Completing a molecular timetree of apes and monkeys
Pub Date : 2023-12-15 DOI: 10.3389/fbinf.2023.1284744
Jack M Craig, Grace L. Bamba, Jose Barba-Montoya, S. Hedges, Sudhir Kumar, Sankar Subramanian, Yuanning Li, Gagandeep Singh
The primate infraorder Simiiformes, comprising Old and New World monkeys and apes, includes the most well-studied species on earth. Their most comprehensive molecular timetree, assembled from thousands of published studies, is found in the TimeTree database and contains 268 simiiform species. It is, however, missing 38 out of 306 named species in the NCBI taxonomy for which at least one molecular sequence exists in the NCBI GenBank. We developed a three-pronged approach to expanding the timetree of Simiiformes to contain 306 species. First, molecular divergence times were searched and found for 21 missing species in timetrees published across 15 studies. Second, untimed molecular phylogenies were searched and scaled to time using relaxed clocks to add four more species. Third, we reconstructed ten new timetrees from genetic data in GenBank, allowing us to incorporate 13 more species. Finally, we assembled the most comprehensive molecular timetree of Simiiformes containing all 306 species for which any molecular data exists. We compared the species divergence times with those previously imputed using statistical approaches in the absence of molecular data. The latter data-less imputed times were not significantly correlated with those derived from the molecular data. Also, using phylogenies containing imputed times produced different trends of evolutionary distinctiveness and speciation rates over time than those produced using the molecular timetree. These results demonstrate that more complete clade-specific timetrees can be produced by analyzing existing information, which we hope will encourage future efforts to fill in the missing taxa in the global timetree of life.
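
The species counts quoted in the abstract can be checked with a short piece of arithmetic. The figures come from the abstract; the variable names and dictionary labels are only illustrative.

```python
# Species already present in the TimeTree database, as stated in the abstract.
in_timetree_db = 268

# Species added by each of the three approaches, as stated in the abstract.
added = {
    "divergence times from published timetrees": 21,
    "untimed phylogenies scaled with relaxed clocks": 4,
    "new timetrees reconstructed from GenBank data": 13,
}

missing = sum(added.values())     # species that were absent from the database
total = in_timetree_db + missing  # species in the completed timetree
```

The three approaches together account for the 38 missing species, and adding them to the 268 existing ones gives the 306 species reported for the completed timetree.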
Citations: 0