Bioinformatics: latest articles

Correction to: "Retraction of: DeepCRISTL: deep transfer learning to predict CRISPR/Cas9 functional and endogenous on-target editing efficiency".
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad562
This is a correction to “Retraction of: DeepCRISTL: deep transfer learning to predict CRISPR/Cas9 functional and endogenous on-target editing efficiency”, Bioinformatics, Volume 39, Issue 7, July 2023, https://doi.org/10.1093/bioinformatics/btad412. The retraction notice text has been updated because we subsequently discovered that the authors did not receive the journal’s communications asking them to address the flaws. This correction does not change the outcome or the decision to retract.
Citations: 0
RNA 3D structure modeling by fragment assembly with small-angle X-ray scattering restraints.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad527
Grzegorz Chojnowski, Rafał Zaborowski, Marcin Magnus, Sunandan Mukherjee, Janusz M Bujnicki

Summary: Structure determination is a key step in the functional characterization of many non-coding RNA molecules. High-resolution RNA 3D structure determination efforts, however, are not keeping pace with the discovery of new non-coding RNA sequences. This increases the importance of computational approaches and low-resolution experimental data, such as those from small-angle X-ray scattering (SAXS) experiments. We present RNA Masonry, a computer program and web service for fully automated modeling of RNA 3D structures. It assembles RNA fragments into geometrically plausible models that meet user-provided secondary structure constraints, restraints on tertiary contacts, and SAXS data. We describe the method, present detailed benchmarks, and apply it to structural studies of viral RNAs with SAXS restraints.

Availability and implementation: The program web server is available at http://iimcb.genesilico.pl/rnamasonry. The source code is available at https://gitlab.com/gchojnowski/rnamasonry.

Citations: 0
Genome-wide multimediator analyses using the generalized Berk-Jones statistics with the composite test.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad544
En-Yu Lai, Yen-Tsung Huang

Motivation: Mediation analysis evaluates the effects of a hypothetical causal mechanism that marks the progression from an exposure, through mediators, to an outcome. In the age of high-throughput technologies, it has become routine to assess numerous potential mechanisms at the genome or proteome scale, which in turn makes it necessary to address multiple testing. In a sparse scenario where only a few genes or proteins are causally involved, conventional methods for assessing mediation effects lose statistical power because the composite null distribution behind this experiment cannot be attained. The power loss in turn reduces the number of true mechanisms identified after multiple testing corrections. To fairly delineate a uniform distribution under the composite null, Huang (Genome-wide analyses of sparse mediation effects under composite null hypotheses. Ann Appl Stat 2019a;13:60-84) proposed the composite test to provide adjusted P-values for single-mediator analyses.
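
The power loss described above can be illustrated with a small simulation. The sketch below assumes the conventional joint-significance (max-P) test, a common way to test a mediation effect that the abstract does not name explicitly; the function name and the uniform P-value draws are illustrative assumptions and are not part of MACtest. When neither the exposure-mediator nor the mediator-outcome association exists, both P-values are uniform, so the max-P test rejects at level 0.05 with probability of roughly 0.05 × 0.05 = 0.0025, far below the nominal rate.

```python
import random


def maxp_rejection_rate(n_tests=200_000, alpha=0.05, seed=0):
    """Empirical size of the max-P test when both path effects are null.

    Hypothetical helper for illustration only: each P-value is drawn
    uniformly on [0, 1), which is its distribution under a true null.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_tests):
        p_exposure_mediator = rng.random()  # null P-value, exposure -> mediator
        p_mediator_outcome = rng.random()   # null P-value, mediator -> outcome
        # The max-P test rejects only if BOTH paths look significant.
        if max(p_exposure_mediator, p_mediator_outcome) <= alpha:
            rejections += 1
    return rejections / n_tests


rate = maxp_rejection_rate()
print(f"empirical size at alpha=0.05: {rate:.4f}")
```

A test whose actual size is this far below the nominal level is conservative, which is one way to read the statement that conventional methods lose power in the sparse setting.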

Results: Our contribution is to extend the method to multimediator analyses, which are commonly encountered in genomic studies and can accommodate various biological interests. Using the generalized Berk-Jones statistics with the composite test, we propose a multivariate approach that favors dense and diverse mediation effects, a decorrelation approach that favors sparse and consistent effects, and a hybrid approach that captures the strengths of both. Our analysis suite has been implemented as an R package, MACtest. Its utility is demonstrated by analyzing the lung adenocarcinoma datasets from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium. We further investigate the genes and networks whose expression may be regulated by smoking-induced epigenetic aberrations.

Availability and implementation: An R package MACtest is available on https://github.com/roqe/MACtest.

Citations: 0
Correction to: Online bias-aware disease module mining with ROBUST-Web.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad566
Citations: 0
GraphCpG: imputation of single-cell methylomes based on locus-aware neighboring subgraphs.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad533
Yuzhong Deng, Jianxiong Tang, Jiyang Zhang, Jianxiao Zou, Que Zhu, Shicai Fan

Motivation: Single-cell DNA methylation sequencing can assay DNA methylation at single-cell resolution. However, incomplete coverage compromises related downstream analyses, underscoring the importance of imputation techniques. With a rising number of cell samples in recent large datasets, scalable and efficient imputation models are critical to addressing the sparsity for genome-wide analyses.

Results: We propose a novel graph-based deep learning approach that imputes methylation matrices based on locus-aware neighboring subgraphs, with locus-aware encoding oriented on one cell type. Using only the CpG methylation matrix, GraphCpG outperforms previous methods on datasets containing hundreds of cells or more and achieves competitive performance on smaller datasets, with subgraphs of predicted sites visualized by retrievable bipartite graphs. Besides improving imputation performance as the number of cells increases, it significantly reduces computation time and improves downstream analysis.

Availability and implementation: The source code is freely available at https://github.com/yuzhong-deng/graphcpg.git.

Citations: 0
Automated machine learning for genome wide association studies.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad545
Kleanthi Lakiotaki, Zaharias Papadovasilakis, Vincenzo Lagani, Stefanos Fafalios, Paulos Charonyktakis, Michail Tsagris, Ioannis Tsamardinos

Motivation: Genome-wide association studies (GWAS) present several computational and statistical challenges for their data analysis, including knowledge discovery, interpretability, and translation to clinical practice.

Results: We develop, apply, and comparatively evaluate an automated machine learning (AutoML) approach, customized for genomic data that delivers reliable predictive and diagnostic models, the set of genetic variants that are important for predictions (called a biosignature), and an estimate of the out-of-sample predictive power. This AutoML approach discovers variants with higher predictive performance compared to standard GWAS methods, computes an individual risk prediction score, generalizes to new, unseen data, is shown to better differentiate causal variants from other highly correlated variants, and enhances knowledge discovery and interpretability by reporting multiple equivalent biosignatures.

Availability and implementation: Code for this study is available at: https://github.com/mensxmachina/autoML-GWAS. JADBio offers a free version at: https://jadbio.com/sign-up/. SNP data can be downloaded from the EGA repository (https://ega-archive.org/). PRS data are found at: https://www.aicrowd.com/challenges/opensnp-height-prediction. Simulation data to study population structure can be found at: https://easygwas.ethz.ch/data/public/dataset/view/1/.

Citations: 0
Accessibility of covariance information creates vulnerability in Federated Learning frameworks.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad531
Manuel Huth, Jonas Arruda, Roy Gusinow, Lorenzo Contento, Evelina Tacconelli, Jan Hasenauer

Motivation: Federated Learning (FL) is gaining traction in various fields as it enables integrative data analysis without sharing sensitive data, such as in healthcare. However, the risk of data leakage caused by malicious attacks must be considered. In this study, we introduce a novel attack algorithm that relies on being able to compute sample means, sample covariances, and construct known linearly independent vectors on the data owner side.

Results: We show that these basic functionalities, which are available in several established FL frameworks, are sufficient to reconstruct privacy-protected data. Additionally, the attack algorithm is robust to defense strategies that involve adding random noise. We demonstrate the limitations of existing frameworks and propose potential defense strategies analyzing the implications of using differential privacy. The novel insights presented in this study will aid in the improvement of FL frameworks.
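
To make the idea concrete, here is a minimal sketch of why sample means and sample covariances with attacker-chosen vectors can leak raw values. It is a toy special case written for this listing, not the authors' algorithm: it assumes the attacker may use indicator vectors (which are linearly independent) and receives exact, noise-free answers. All function names are hypothetical. For an indicator vector e_j, the sample covariance with the private variable x equals (x_j - mean(x)) / (n - 1), so each value follows from x_j = mean(x) + (n - 1) * cov(x, e_j).

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def reconstruct(n, query_mean, query_cov):
    """Recover n private values using only mean and covariance queries.

    query_mean() returns the sample mean of the private variable;
    query_cov(v) returns its sample covariance with a known vector v.
    Both callables are assumptions standing in for a framework's API.
    """
    x_bar = query_mean()
    recovered = []
    for j in range(n):
        # Known indicator vector: 1 at position j, 0 elsewhere.
        indicator = [1.0 if i == j else 0.0 for i in range(n)]
        recovered.append(x_bar + (n - 1) * query_cov(indicator))
    return recovered


# The "data owner" side: the attacker never reads `secret` directly.
secret = [4.2, 7.1, 5.5, 9.8, 6.0]
recovered = reconstruct(
    len(secret),
    query_mean=lambda: mean(secret),
    query_cov=lambda v: sample_cov(secret, v),
)
print(recovered)
```

The sketch only covers the noise-free case; it does not model the noise-based defenses or the differential privacy analysis mentioned in the abstract.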

Availability and implementation: The code examples are provided at GitHub (https://github.com/manuhuth/Data-Leakage-From-Covariances.git). The CNSIM1 dataset, which we used in the manuscript, is available within the DSData R package (https://github.com/datashield/DSData/tree/main/data).

Citations: 0
Correction to: Optimal adjustment sets for causal query estimation in partially observed biomolecular networks.
IF 5.8 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad559
This is a correction to: Sara Mohammad-Taheri and others, Optimal adjustment sets for causal query estimation in partially observed biomolecular networks, Bioinformatics, Volume 39, Issue Supplement_1, June 2023, Pages i494–i503, https://doi.org/10.1093/bioinformatics/btad270. In the originally published version of this manuscript, the sixth author’s name was incorrectly spelled as Charles Taply Hoyt. It should be Charles Tapley Hoyt.
Citations: 0
DrForna: visualization of cotranscriptional folding.
IF 4.4 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad555
Anda Ramona Tănasie, Peter Kerpedjiev, Stefan Hammer, Stefan Badelt

Motivation: Understanding RNA folding at the level of secondary structures can give important insights into the function of a molecule. We are interested in learning how secondary structures change dynamically during transcription, and whether particular secondary structures form during transcription or only afterwards. While different approaches exist to simulate cotranscriptional folding, current visualization strategies are lagging behind. New, more suitable approaches are needed to help explore the data generated by cotranscriptional folding simulations.

Results: We present DrForna, an interactive visualization app for viewing the time course of a cotranscriptional RNA folding simulation. Specifically, users can scroll along the time axis and see the population of structures that are present at any particular time point.

Availability and implementation: DrForna is a JavaScript project available on GitHub at https://github.com/ViennaRNA/drforna and deployed at https://viennarna.github.io/drforna.

Citations: 0
Phylogenetic inference using generative adversarial networks.
IF 4.4 Zone 3 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2023-09-02 DOI: 10.1093/bioinformatics/btad543
Megan L Smith, Matthew W Hahn

Motivation: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited to inferring relationships among unrooted quartets of taxa, where there are only three possible topologies. Here, we explore the potential of generative adversarial networks (GANs) to address this limitation. GANs consist of a generator and a discriminator: at each step, the generator aims to create data that is similar to real data, while the discriminator attempts to distinguish generated and real data. By using an evolutionary model as the generator, we use GANs to make evolutionary inferences. Since a new model can be considered at each iteration, heuristic searches of complex model spaces are possible. Thus, GANs offer a potential solution to the challenges of applying machine learning in phylogenetics.
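
The size of the model space mentioned above can be made concrete for tree topologies alone. The sketch below uses the standard count of unrooted, fully bifurcating, labeled trees, (2n - 5)!! for n taxa; the function name is an illustrative assumption, and the count ignores branch lengths and substitution-model parameters, which enlarge the space further.

```python
def count_unrooted_topologies(n_taxa):
    """Number of unrooted binary tree topologies for n labeled taxa.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5),
    valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


for n in (4, 6, 15):
    print(n, count_unrooted_topologies(n))
```

For four taxa this gives the three quartet topologies cited in the abstract; for 15 taxa, the largest concatenation case explored with phyloGAN, it gives 7,905,853,580,625 topologies.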

Results: We developed phyloGAN, a GAN that infers phylogenetic relationships among species. phyloGAN takes as input a concatenated alignment, or a set of gene alignments, and infers a phylogenetic tree either considering or ignoring gene tree heterogeneity. We explored the performance of phyloGAN for up to 15 taxa in the concatenation case and 6 taxa when considering gene tree heterogeneity. Error rates are relatively low in these simple cases. However, run times are slow and performance metrics suggest issues during training. Future work should explore novel architectures that may result in more stable and efficient GANs for phylogenetics.

Availability and implementation: phyloGAN is available on GitHub: https://github.com/meganlsmith/phyloGAN/.
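
To give a sense of the "vast model space" the abstract refers to, the short sketch below counts unrooted, fully bifurcating tree topologies using the standard double-factorial formula (2n - 5)!!. This is an illustration only and is not part of phyloGAN; the function name `num_unrooted_topologies` is hypothetical.

```python
def num_unrooted_topologies(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating topologies on n_taxa labelled tips.

    Uses the standard result (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    Hypothetical helper for illustration; not taken from phyloGAN.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # Taxon counts mentioned in the abstract: quartets, 6 taxa, 15 taxa.
    for n in (4, 6, 15):
        print(n, num_unrooted_topologies(n))
```

For quartets this gives the three topologies mentioned in the abstract; for 6 taxa it gives 105, and for 15 taxa roughly 7.9 × 10^12, which is why exhaustively sampling training data across the space quickly becomes impractical.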

Citations: 0