Short Read Lengths Recover Ecological Patterns in 16S rRNA Gene Amplicon Data
Author: Stephanie D Jurburg
Journal: Molecular Ecology Resources, e14102
Publication date: 2025-03-13
Publication type: Journal Article
DOI: 10.1111/1755-0998.14102 (https://doi.org/10.1111/1755-0998.14102)
Citation count: 0
Abstract
16S rRNA gene metabarcoding, the study of amplicon sequences of the 16S rRNA gene from mixed environmental samples, is an increasingly popular and accessible method for assessing bacterial communities across a wide range of environments. As metabarcoding sequence data archives continue to grow, data reuse will likely become an important source of novel insights into the ecology of microbes. While recent work has demonstrated the benefits of longer read lengths for the study of microbial communities from 16S rRNA gene segments, no studies have explored the use of shorter (< 200 bp) read lengths in the context of data reuse. Nevertheless, this information is essential to improve the reuse and comparability of metabarcoding data across existing datasets. This study reanalyzed nine 16S rRNA datasets targeting aquatic, animal-associated and soil microbiomes, and evaluated how processing the sequence data across a range of read lengths affected the resulting taxonomic assignments, biodiversity metrics and differential (i.e., before-after treatment) analyses. Short read lengths successfully recovered ecological patterns and allowed for the use of more sequences. Limited increases in resolution were observed beyond 150 bp reads across environments. Furthermore, abundance-weighted diversity metrics (e.g., Inverse Simpson index, Morisita-Horn dissimilarities or weighted UniFrac distances) were more robust to variation in read lengths. Read lengths alone contributed to consistent increases in the total number of ASVs detected, highlighting the need to consider metabarcoding-derived diversity estimates within the context of the bioinformatics parameters selected. This study provides evidence-based guidelines for the processing of short reads.
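To make the abundance-weighted metrics named in the abstract concrete, the following is a minimal sketch of the Inverse Simpson index and the Morisita-Horn dissimilarity using their standard textbook definitions. It is not the study's pipeline: the function names and the toy count vectors are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def inverse_simpson(counts: Sequence[float]) -> float:
    """Inverse Simpson index: 1 / sum(p_i^2), with p_i the relative abundance of ASV i."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return 1.0 / sum((c / total) ** 2 for c in counts)


def morisita_horn(x: Sequence[float], y: Sequence[float]) -> float:
    """Morisita-Horn dissimilarity between two samples over the same ordered set of ASVs."""
    if len(x) != len(y):
        raise ValueError("both samples must list the same ASVs in the same order")
    total_x, total_y = sum(x), sum(y)
    if total_x <= 0 or total_y <= 0:
        raise ValueError("each sample must contain at least one read")
    # Simpson-type dominance terms for each sample.
    dom_x = sum(a * a for a in x) / total_x ** 2
    dom_y = sum(b * b for b in y) / total_y ** 2
    shared = sum(a * b for a, b in zip(x, y))
    return 1.0 - (2.0 * shared) / ((dom_x + dom_y) * total_x * total_y)


# Hypothetical toy counts (not data from the study): the same sample in which
# one rare ASV is additionally resolved in the second processing run.
run_a = [50, 30, 20]
run_b = [50, 30, 19, 1]

richness_a = sum(1 for c in run_a if c > 0)
richness_b = sum(1 for c in run_b if c > 0)
print(richness_a, richness_b)
print(inverse_simpson(run_a), inverse_simpson(run_b))
```

In this toy case the ASV count rises from 3 to 4 while the Inverse Simpson index changes only slightly, because rare ASVs contribute little to the sum of squared relative abundances. This is one intuition for why abundance-weighted metrics could be less sensitive to processing choices; the abstract itself reports the robustness as an empirical result and does not state this mechanism.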
About the journal:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.