Evidence for exon shuffling is sensitive to model choice.
Xiaoyue Cui, Maureen Stolzer, Dannie Durand
Journal of Bioinformatics and Computational Biology, vol. 19, no. 6, article 2140013. Published 2021-12-01 (Epub 2021-11-19). Journal Article.
DOI: https://doi.org/10.1142/S0219720021400138
Abstract
The exon shuffling theory posits that intronic recombination creates new domain combinations, facilitating the evolution of novel protein function. This theory predicts that introns will be preferentially situated near domain boundaries. Many studies have sought evidence for exon shuffling by testing the correspondence between introns and domain boundaries against chance intron positioning. Here, we present an empirical investigation of how the choice of null model influences significance. Although genome-wide studies have exclusively used a uniform null model, more realistic null models have been proposed for single-gene studies. We extended these models for genome-wide analyses and applied them to 21 metazoan and fungal genomes. Our results show that, compared with the other two models, the uniform model does not recapitulate genuine exon lengths, dramatically underestimates the probability of chance agreement, and overestimates the significance of intron-domain correspondence by as much as 100 orders of magnitude. Model choice had a much greater impact on the assessment of exon shuffling in fungal genomes than in metazoa, leading to different evolutionary conclusions in seven of the 16 fungal genomes tested. Genome-wide studies that use the overly permissive uniform null model may exaggerate the importance of exon shuffling as a general mechanism of multidomain evolution.
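To make the role of the null model concrete, the following is a minimal single-gene sketch of a Monte Carlo test of intron-domain correspondence. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify the two alternative models, so `shuffled_exon_null` (which permutes the observed exon lengths) is a hypothetical stand-in for "a null that preserves exon lengths". The function names, the `window` tolerance, and the toy coordinates are all invented for this example.

```python
import random


def count_near(introns, boundaries, window):
    """Count introns lying within `window` positions of any domain boundary."""
    return sum(
        any(abs(i - b) <= window for b in boundaries) for i in introns
    )


def uniform_null(gene_length, n_introns, rng):
    """Uniform null: introns at distinct interior positions chosen uniformly."""
    return sorted(rng.sample(range(1, gene_length), n_introns))


def shuffled_exon_null(introns, gene_length, rng):
    """Hypothetical alternative null: permute the observed exon lengths.

    Every sample has exactly the same set of exon lengths as the real gene.
    """
    edges = [0] + sorted(introns) + [gene_length]
    exon_lengths = [b - a for a, b in zip(edges, edges[1:])]
    rng.shuffle(exon_lengths)
    positions, pos = [], 0
    for length in exon_lengths[:-1]:  # the last exon ends at gene_length
        pos += length
        positions.append(pos)
    return positions


def empirical_p(introns, boundaries, window, sampler, trials, seed=0):
    """Fraction of null samples with at least the observed agreement."""
    rng = random.Random(seed)
    observed = count_near(introns, boundaries, window)
    hits = sum(
        count_near(sampler(rng), boundaries, window) >= observed
        for _ in range(trials)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Toy gene (made-up coordinates): three introns, three domain boundaries.
gene_length = 400
introns = [100, 205, 300]
boundaries = [100, 200, 300]

p_uniform = empirical_p(
    introns, boundaries, 5,
    lambda rng: uniform_null(gene_length, len(introns), rng), 2000,
)
p_shuffled = empirical_p(
    introns, boundaries, 5,
    lambda rng: shuffled_exon_null(introns, gene_length, rng), 2000,
)
print("uniform null p-value:", p_uniform)
print("shuffled-exon null p-value:", p_shuffled)
```

In this toy gene the exons are of similar length and the domain boundaries are evenly spaced, so permuting exon lengths tends to keep introns near boundaries and chance agreement is far more likely than under uniform placement. That is the qualitative effect the abstract describes, although the magnitudes here say nothing about real genomes.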
About the journal:
The Journal of Bioinformatics and Computational Biology aims to publish high-quality, original research articles, expository tutorial papers, and review papers, as well as short, critical comments on technical issues associated with the analysis of cellular information.
The research papers will be technical presentations of new assertions, discoveries, and tools, intended for a narrower specialist community. The tutorials, reviews, and critical commentary will be targeted at a broader readership: biologists who are interested in using computers but are not knowledgeable about scientific computing and, equally, computer scientists who have an interest in biology but are familiar with neither its current thrusts nor its language. Such carefully chosen tutorials and articles should greatly accelerate the rate of entry of these new creative scientists into the field.