Discover true association rates in multi-protein complex proteomics data sets.
Changyu Shen, Lang Li, Jake Yue Chen
Proceedings. IEEE Computational Systems Bioinformatics Conference, pp. 167-74, 2005-01-01 (Journal Article)
DOI: 10.1109/csb.2005.29 (https://doi.org/10.1109/csb.2005.29)
Citations: 2
Abstract
Experimental processes to collect and process proteomics data are increasingly complex, while the computational methods to assess the quality and significance of these data remain unsophisticated. These challenges have led to many biological oversights and computational misconceptions. We developed a complete empirical Bayes model to analyze multi-protein complex (MPC) proteomics data derived from peptide mass spectrometry detections of purified protein complex pull-down experiments. Our model considers not only bait-prey associations, but also prey-prey associations missed in previous work. Using our model and a yeast MPC proteomics data set, we estimated that there should be an average of 28 true associations per MPC, almost ten times as high as was previously estimated. For data sets generated to mimic a real proteome, our model achieved on average 80% sensitivity in detecting true associations, as compared with the 3% sensitivity in previous work, while maintaining a comparable false discovery rate of 0.3%.
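The sensitivity and false discovery rate quoted in the abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' empirical Bayes model: it assumes associations are unordered protein pairs and that a known set of true associations is available, as in simulated data. The function name `sensitivity_and_fdr` and the toy protein labels are hypothetical.

```python
def sensitivity_and_fdr(true_pairs, predicted_pairs):
    """Return (sensitivity, FDR) of predicted associations against a known truth set."""
    # Assumption: an association is an unordered pair, so (A, B) equals (B, A).
    truth = {frozenset(p) for p in true_pairs}
    predicted = {frozenset(p) for p in predicted_pairs}

    tp = len(truth & predicted)   # true associations that were detected
    fp = len(predicted - truth)   # detected associations that are not true
    fn = len(truth - predicted)   # true associations that were missed

    # Sensitivity: fraction of true associations that were detected.
    sensitivity = tp / (tp + fn) if truth else 0.0
    # False discovery rate: fraction of detected associations that are false.
    fdr = fp / (tp + fp) if predicted else 0.0
    return sensitivity, fdr


# Toy data (hypothetical): five true pairs, five predicted pairs, four in common.
true_pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("C", "D")]
predicted_pairs = [("B", "A"), ("A", "C"), ("C", "B"), ("D", "A"), ("A", "E")]

sens, fdr = sensitivity_and_fdr(true_pairs, predicted_pairs)
print(f"sensitivity={sens:.2f}, FDR={fdr:.2f}")
```

In this toy case four of the five true pairs are recovered and one of the five predictions is false. A method can hold the FDR low while still missing most true associations, which is the contrast the abstract draws between 80% and 3% sensitivity at a comparable FDR.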