AlphaFold-based protein analysis pipeline
Author: Octavian-Florin Maghiar
Venue: 2022 24th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)
Publication date: 2022-09-01
DOI: 10.1109/SYNASC57785.2022.00061 (https://doi.org/10.1109/SYNASC57785.2022.00061)
Abstract
At the 14th Critical Assessment of protein Structure Prediction (CASP14) competition, the winning model, DeepMind's AlphaFold2, made great progress towards solving the protein structure prediction problem. AlphaFold2's significant leap in accuracy has opened new possibilities in protein structure analysis and design. This paper presents a new protein analysis pipeline that builds upon the predictions of AlphaFold2. The core functionality of the pipeline is to determine and present different properties based on the protein sequence and the predicted three-dimensional structure. Available features include computing physicochemical properties, performing an evolutionary analysis by aligning the sequence against databases such as Pfam and Swiss-Prot/UniRef90, detecting binding pockets using P2Rank, and docking ligands using AutoDock Vina. The results produced by the pipeline can be visualized as a MultiQC HTML report. The performance of the pipeline has been analyzed using a small dataset of protein structures, and the developed workflow has then been used to compare the accuracy of AlphaFold2's predictions against experimental structures. The pipeline has been developed using Nextflow, a popular workflow manager for bioinformatics analyses, and has been made freely available at https://github.com/OtimusOne/AFPAP.
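As an illustration of the kind of sequence-based physicochemical property the abstract mentions, the following minimal Python sketch computes residue composition and the grand average of hydropathy (GRAVY) on the Kyte-Doolittle scale. The abstract does not say which properties the pipeline reports or how it computes them, so the choice of properties, the function names `composition` and `gravy`, and the example sequence are assumptions made for illustration only and are not taken from the AFPAP code.

```python
from collections import Counter

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue type present in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy over all residues."""
    seq = sequence.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if not seq or unknown:
        # Reject empty input and non-standard residue codes (e.g. X, B, Z).
        raise ValueError(f"empty or non-standard sequence: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # hypothetical sequence, not from the paper's dataset
    print("composition:", composition(example))
    print("GRAVY:", round(gravy(example), 3))
```

A positive GRAVY value indicates a predominantly hydrophobic sequence and a negative value a predominantly hydrophilic one.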