Scalable Mining, Analysis, and Visualization of Protein-Protein Interaction Networks
Bikesh Pandey, S. Arifuzzaman
International Journal of Big Data Intelligence
DOI: 10.1504/IJBDI.2019.10019036 (https://doi.org/10.1504/IJBDI.2019.10019036)
Citation count: 6
Abstract
Proteins are linear-chain biomolecules that form the basis of functional networks in all organisms. Protein-protein interaction (PPI) networks are networks of protein complexes formed by biochemical events and electrostatic forces. PPI networks can be used to study diseases and discover drugs, because the causes of diseases are evident at the protein-interaction level. For instance, elevated interaction edge weights of oncogenes are manifested in cancers. The availability of large datasets and the need for efficient analysis necessitate the design of scalable methods that leverage modern high-performance computing (HPC) platforms. In this paper, we design a lightweight framework on a distributed-memory parallel system to study PPI networks. Our framework supports automated analytics based on methods for extracting signed motifs, computing centrality, and finding functional units. We design message passing interface (MPI)-based parallel methods and a workflow that scale to large networks. To the best of our knowledge, these capabilities collectively make our tool novel.
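The abstract does not describe how the centrality computation is parallelized, so the following is only a minimal sketch of one common pattern, not the authors' method: the edge list is split into chunks, each chunk yields partial degree counts, and the partial counts are summed. The chunks here are processed in a plain Python loop that stands in for MPI ranks, and the summing step stands in for a sum-reduction across ranks. The protein names, the function names, and the choice of degree centrality are all assumptions made for illustration.

```python
from collections import Counter


def partition_edges(edges, num_ranks):
    """Split the edge list round-robin into one chunk per simulated rank."""
    return [edges[r::num_ranks] for r in range(num_ranks)]


def local_degree_counts(edge_chunk):
    """Count how many edges in this chunk touch each protein."""
    counts = Counter()
    for u, v in edge_chunk:
        counts[u] += 1
        counts[v] += 1
    return counts


def degree_centrality(edges, num_ranks=4):
    """Normalized degree centrality, degree / (n - 1), for an undirected
    simple graph given as a list of (u, v) pairs without duplicates."""
    partials = [local_degree_counts(chunk)
                for chunk in partition_edges(edges, num_ranks)]

    # Merging the partial counts stands in for a sum-reduction across ranks.
    total = Counter()
    for partial in partials:
        total.update(partial)

    n = len(total)  # only proteins that appear in at least one edge
    if n < 2:
        return {node: 0.0 for node in total}
    return {node: degree / (n - 1) for node, degree in total.items()}


# Hypothetical toy network; P1..P5 are placeholder protein identifiers.
edges = [("P1", "P2"), ("P1", "P3"), ("P3", "P4"), ("P2", "P5")]
centrality = degree_centrality(edges, num_ranks=2)
```

Because each chunk is counted independently and the merge is a plain sum, the result does not depend on how many chunks the edge list is split into, which is the property that makes this kind of computation easy to distribute.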