Advances in environmental DNA (eDNA) methods have enabled rapid and non-invasive species detection in aquatic environments. While most studies focus on species detection, recent work has explored using eDNA concentration to quantify species abundance. However, individuals contribute DNA to eDNA samples at different rates, which can easily obscure the relationship between eDNA concentration and species abundance. We propose using the number of segregating sites as a proxy for estimating species abundance. Segregating sites reflect the genetic diversity of the population, which is less sensitive to differential individual DNA contribution than eDNA concentration. We examined the relationship between the number of segregating sites and species abundance in in silico, in vitro, and in situ experiments, using two brackish goby species, Acanthogobius hasta and Tridentiger bifasciatus. Analyses of the simulated and in vitro data, with DNA mixed from a known number of individuals, showed a strong correlation between the number of segregating sites and species abundance (R² > 0.9; p < 0.01). In the in situ experiments, we analysed eDNA samples collected from mesocosms. The results further validated that the correlation (R² = 0.70, p < 0.01) was not affected by biotic factors, including body size and feeding behaviour (p > 0.05). Cross-validation also showed that the number of segregating sites predicted species abundance with less bias and variability than eDNA concentration. Overall, the number of segregating sites is less affected by differential DNA contribution among individuals than eDNA concentration is. This advance can significantly improve the estimation of species abundance using eDNA.
Estimation of Species Abundance Based on the Number of Segregating Sites Using Environmental DNA (eDNA). Qiaoyun Ai, Hao Yuan, Ying Wang, Chenhong Li. Molecular Ecology Resources, e14076, published 2025-02-06. https://doi.org/10.1111/1755-0998.14076
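The proxy proposed in the abstract above, the number of segregating sites, is the count of alignment positions at which the sampled sequences differ. The following is a minimal sketch of that count, assuming the eDNA reads have already been reduced to aligned haplotype sequences of equal length; the function name and the example haplotypes are hypothetical and are not taken from the study.

```python
def count_segregating_sites(sequences):
    """Count alignment columns in which more than one nucleotide is observed."""
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for column in zip(*sequences):
        # Ignore gaps and ambiguity codes; keep unambiguous bases only.
        bases = {base.upper() for base in column} & set("ACGT")
        if len(bases) > 1:
            count += 1
    return count


# Hypothetical aligned haplotypes recovered from one eDNA sample.
haplotypes = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
print(count_segregating_sites(haplotypes))      # 2 variable columns
print(count_segregating_sites(haplotypes[:2]))  # 1 variable column
```

Because the count depends only on which variants are present and not on how many copies of each were sequenced, an individual that sheds more DNA does not inflate it, which is the intuition behind the reduced sensitivity to differential DNA contribution.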
Alexandre Gilardet, Edana Lord, Gonzalo Oteo García, Georgios Xenikoudakis, Katerina Douka, Matthew J Wooller, Timothy Rowe, Michael D Martin, Mathilde Le Moullec, Michail Anisimov, Peter D Heintzman, Love Dalén
Large-scale DNA screening of palaeontological and archaeological collections remains a limiting and costly factor for ancient DNA studies. Several DNA extraction protocols are routinely used in ancient DNA laboratories and have even been automated on robotic platforms. Robots offer a solution for high-throughput screening, but the costs, as well as the need for trained technicians and engineers, can be prohibitive for some laboratories. Here, we present a high-throughput alternative to robot-based ancient DNA extraction using a 96-column plate. When compared to routine single MinElute columns, we retrieved highly similar endogenous DNA contents, an important metric in ancient DNA screening. Mitogenomes with a coverage depth greater than 0.1× could be generated and allowed for taxonomic assignment. Average fragment lengths, DNA damage and library complexities differed significantly between methods, but these differences became nonsignificant after modification of our library purification protocol. Our high-throughput extraction method allows generation of 96 extracts within approximately 4 hours of laboratory work while bringing the cost down by ~39% compared to using single columns. Additionally, we formally demonstrate that the addition of Tween-20 during the elution step results in higher complexity libraries, thereby enabling higher genome coverage for the same sequencing effort.
A High-Throughput Ancient DNA Extraction Method for Large-Scale Sample Screening. Molecular Ecology Resources, e14077, published 2025-02-06. https://doi.org/10.1111/1755-0998.14077
Matthew I M Pinder, Björn Andersson, Hannah Blossom, Marie Svensson, Karin Rengefors, Mats Töpel
Evolutionary changes in populations of microbes, such as microalgae, cannot be traced using conventional metabarcoding loci as they lack intraspecific resolution. Consequently, selection and competition processes among strains of the same species cannot be resolved without elaborate isolation, culturing, and genotyping efforts. Bamboozle, a new bioinformatic tool introduced here, scans the entire genome of a species and identifies allele-rich barcodes that enable direct identification of different genetic strains from a population using amplicon sequencing of a single DNA sample. We demonstrate its usefulness by identifying hypervariable barcoding loci (< 500 bp) from genomic data in two microalgal species, the diploid diatom Skeletonema marinoi and the haploid chlorophyte Chlamydomonas reinhardtii. Across the two genomes, four and twenty-two loci, respectively, were identified that could in silico resolve all analysed genotypes. All of the identified loci are within protein-coding genes with various metabolic functions. Single nucleotide polymorphisms (SNPs) provided the most reliable genetic markers, and among 54 strains of S. marinoi, three 500 bp loci contained, on average, 46 SNPs and 103 strain-specific alleles and displayed 100% heterozygosity. This high level of heterozygosity was identified as a novel opportunity to improve strain quantification and detect false positive artefacts during denoising of amplicon sequences. Finally, we illustrate how metabarcoding of a single genetic locus can be used to track abundances of S. marinoi strains in an artificial selection experiment. As more genomic datasets become available and DNA sequencing technologies develop, Bamboozle's flexible user settings will allow optimal barcodes to be designed for other species and applications.
Bamboozle: A Bioinformatic Tool for Identification and Quantification of Intraspecific Barcodes. Molecular Ecology Resources, e14067, published 2025-02-04. https://doi.org/10.1111/1755-0998.14067
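The idea of scanning a genome for allele-rich barcode windows, as described in the abstract above, can be illustrated with a short sketch. This is not Bamboozle's implementation or interface: the function name, the window and step parameters, and the toy strains are all hypothetical, and the sketch assumes one aligned haploid consensus sequence per strain, which ignores the heterozygosity that matters for a diploid species such as S. marinoi.

```python
def rank_windows_by_alleles(strain_seqs, window=500, step=100):
    """Return (start, n_alleles) for each window, most allele-rich first."""
    if not strain_seqs:
        return []
    names = sorted(strain_seqs)
    length = len(strain_seqs[names[0]])
    if any(len(strain_seqs[name]) != length for name in names):
        raise ValueError("strain sequences must be aligned to the same length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        # Each distinct window sequence counts as one allele.
        alleles = {strain_seqs[name][start:start + window] for name in names}
        results.append((start, len(alleles)))
    # Sort by allele count (descending), then by position (ascending).
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Toy example: three hypothetical strains, 8 bp "genome", 4 bp windows.
strains = {"s1": "AAAAAAAA", "s2": "AAAAAAAT", "s3": "AAAAAACT"}
ranked = rank_windows_by_alleles(strains, window=4, step=4)
print(ranked)  # [(4, 3), (0, 1)]

# A window resolves every strain when its allele count equals the strain count.
resolving = [start for start, n in ranked if n == len(strains)]
print(resolving)  # [4]
```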
Emily C Giles, Vanessa L González, Paulina Carimán, Carlos Leiva, Ana Victoria Suescún, Sarah Lemer, Marie Laure Guillemin, Daniel Ortiz-Barrientos, Pablo Saenz-Agudelo
Comparative genomic studies of closely related taxa are important for our understanding of the causes of divergence on a changing Earth. However, the genomic resources available for marine intertidal molluscs are limited, and there are currently few publicly available high-quality annotated genomes for intertidal species and for molluscs in general. Here we report transcriptome assemblies for six species of Patellogastropoda and genome assemblies and annotations for three of these species (Scurria scurra, Scurria viridula and Scurria zebrina). Comparative analysis using these genomic resources suggests that recently diverging lineages (10-20 Mya) have experienced similar amounts of contractions and expansions but across different gene families. Furthermore, differences among recently diverged species are reflected in variation in the amount of coding and noncoding material in genomes, such as the amount of repetitive elements and the lengths of transcripts, introns and exons. Additionally, functional ontologies of species-specific and duplicated genes, together with demographic inference, support the finding that recent divergence among members of the genus Scurria aligns with their unique ecological characteristics. Overall, the resources presented here will be valuable for future studies of adaptation in molluscs and in intertidal habitats as a whole.
Comparative Genomics Points to Ecological Drivers of Genomic Divergence Among Intertidal Limpets. Molecular Ecology Resources, e14075, published 2025-01-31. https://doi.org/10.1111/1755-0998.14075
Katherine A Solari, Shakeel Ahmad, Ellie E Armstrong, Michael G Campana, Hussain Ali, Shoaib Hameed, Jami Ullah, Barkat Ullah Khan, Muhammad A Nawaz, Dmitri A Petrov
In recent years, numerous single nucleotide polymorphism (SNP) panel methods to genotype non-invasive faecal samples have been developed. However, none of these existing methods meets all of the criteria necessary to make a SNP panel broadly usable for conservation projects in any country: cost-effectiveness, a streamlined lab protocol and user-friendly open-source bioinformatics protocols for panel design and analysis. Here, we present such a method and display its utility by developing a multiplex PCR SNP panel for conducting individual ID of snow leopards, Panthera uncia, from faecal samples. The SNP panel we present consists of 144 SNPs and utilises next-generation sequencing technology. We validate our SNP panel with paired tissue and faecal samples from zoo individuals, showing a minimum of 96.7% accuracy in allele calls per run. We then generate SNP data from 235 field-collected faecal samples from across Pakistan to show that the panel can reliably identify individuals from low-quality faecal samples of unknown age and is robust to contamination. We also show that our SNP panel has the capability to identify first-order relatives among sampled zoo individuals and provides insights into the geographic origin of samples. This SNP panel will empower the snow leopard research community in their efforts to assess local and global snow leopard population sizes. More broadly, we present a SNP panel development method that can be used for any species of interest for which adequate genomic reference data are available.
Next-Generation Snow Leopard Population Assessment Tool: Multiplex-PCR SNP Panel for Individual Identification From Faeces. Molecular Ecology Resources, e14074, published 2025-01-31. https://doi.org/10.1111/1755-0998.14074
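Individual identification from a SNP panel, as described in the abstract above, comes down to deciding whether two multilocus genotypes are the same animal while tolerating missing calls and a small genotyping error rate. The following is a minimal sketch of that comparison, not the authors' pipeline: the function names, the thresholds `min_shared` and `max_mismatch`, and the five-locus genotypes are hypothetical (the published panel has 144 SNPs).

```python
def compare_genotypes(a, b):
    """Return (shared_loci, mismatched_loci) for two multilocus genotypes.

    Each genotype is a list with one entry per SNP: a two-letter string
    such as "AG", or None when the locus failed to genotype.
    """
    if len(a) != len(b):
        raise ValueError("genotypes must cover the same loci")
    shared = mismatched = 0
    for call_a, call_b in zip(a, b):
        if call_a is None or call_b is None:
            continue  # skip loci missing in either sample
        shared += 1
        # Allele order is arbitrary, so "AG" and "GA" are the same call.
        if sorted(call_a) != sorted(call_b):
            mismatched += 1
    return shared, mismatched


def same_individual(a, b, min_shared=3, max_mismatch=1):
    """Call a match when enough loci overlap and few of them disagree."""
    shared, mismatched = compare_genotypes(a, b)
    return shared >= min_shared and mismatched <= max_mismatch


# Hypothetical faecal-sample genotypes at five loci.
scat_1 = ["AA", "AG", None, "CC", "TT"]
scat_2 = ["AA", "GA", "CT", "CC", "TT"]
scat_3 = ["AG", "AG", "CT", "CC", "TC"]

print(compare_genotypes(scat_1, scat_2))  # (4, 0)
print(same_individual(scat_1, scat_2))    # True
print(compare_genotypes(scat_1, scat_3))  # (4, 2)
print(same_individual(scat_1, scat_3))    # False
```

In practice the thresholds would be chosen from the panel's measured error rate and the number of loci needed to separate close relatives; the values used here are placeholders.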
Nauras Daraghmeh, Katrina Exter, Justine Pagnier, Piotr Balazy, Ibon Cancio, Giorgos Chatzigeorgiou, Eva Chatzinikolaou, Maciej Chelchowski, Nathan Alexis Mitchell Chrismas, Thierry Comtet, Thanos Dailianis, Klaas Deneudt, Oihane Diaz de Cerio, Markos Digenis, Vasilis Gerovasileiou, José González, Laura Kauppi, Jon Bent Kristoffersen, Piotr Kukliński, Rafał Lasota, Liraz Levy, Magdalena Małachowicz, Borut Mavrič, Jonas Mortelmans, Estefania Paredes, Anita Poćwierz-Kotus, Henning Reiss, Ioulia Santi, Georgia Sarafidou, Grigorios Skouradakis, Jostein Solbakken, Peter A U Staehr, Javier Tajadura, Jakob Thyrring, Jesus S Troncoso, Emmanouela Vernadou, Frederique Viard, Haris Zafeiropoulos, Małgorzata Zbawicka, Christina Pavloudi, Matthias Obst
Molecular methods such as DNA/eDNA metabarcoding have emerged as useful tools to document the biodiversity of complex communities over large spatio-temporal scales. We established an international Marine Biodiversity Observation Network (ARMS-MBON) combining standardised sampling using autonomous reef monitoring structures (ARMS) with metabarcoding for genetic monitoring of marine hard-bottom benthic communities. Here, we present the data of our first sampling campaign comprising 56 ARMS units deployed in 2018-2019 and retrieved in 2018-2020 across 15 observatories along the coasts of Europe and adjacent regions. We describe the open-access data set (image, genetic and metadata) and explore the genetic data to show its potential for marine biodiversity monitoring and ecological research. Our analysis shows that ARMS recovered more than 60 eukaryotic phyla capturing diversity of up to ~5500 amplicon sequence variants and ~1800 operational taxonomic units, and up to ~250 and ~50 species per observatory using the cytochrome c oxidase subunit I (COI) and 18S rRNA marker genes, respectively. Further, ARMS detected threatened, vulnerable and non-indigenous species often targeted in biological monitoring. We show that while deployment duration does not drive diversity estimates, sampling effort and sequencing depth across observatories do. We recommend that ARMS should be deployed for at least 3-6 months during the main growth season to use resources as efficiently as possible and that post-sequencing curation is applied to enable statistical comparison of spatio-temporal entities. We suggest that ARMS should be used in biological monitoring programs and long-term ecological research and encourage the adoption of our ARMS-MBON protocols.
A Long-Term Ecological Research Data Set From the Marine Genetic Monitoring Program ARMS-MBON 2018-2020. Molecular Ecology Resources, e14073, published 2025-01-31. https://doi.org/10.1111/1755-0998.14073
Collagen is the most ubiquitous protein in the animal kingdom and one of the most abundant proteins on Earth. Although the amino acid sequence motif that enables its triple-helical structure is relatively repetitive, type 1 collagen, which dominates skin and bone, shows enough variation to be increasingly used for the biomolecular species identification of animal tissues processed or degraded beyond the amenability of DNA-based analyses. In recent years, this has been most commonly achieved through the technique of collagen peptide mass fingerprinting (PMF) known as ZooMS (Zooarchaeology by Mass Spectrometry), applied to the analysis of tens of thousands of samples across over one hundred studies in the past decade alone. However, a robust means to quantify variation between these fingerprints remains elusive, despite being increasingly required due to the shift towards a wider range of wild fauna, including taxa more distantly related to those with currently known sequences. This is particularly problematic in fish due to their greater sequence variation. Here we evaluate the quantification of the relative closeness of collagen fingerprints between families using ANOSIM and a modified SIMPER analysis, incorporating relative peak intensity. Our results show a clear correlation between sequence differentiation and statistical distance of PMFs, indicating that the additional complexity of type 1 collagen in fish could directly affect the efficacy of biomolecular techniques such as ZooMS. Furthermore, this multivariate statistical analysis demonstrates that PMFs in fish are substantively more distinct than those of mammalian or amphibian taxa.
Quantifying Bone Collagen Fingerprint Variation Between Species. Andrew Baker, Michael Buckley. Molecular Ecology Resources, e14072, published 2025-01-29. https://doi.org/10.1111/1755-0998.14072
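ANOSIM, one of the two methods named in the abstract above, summarises group separation with the statistic R = (r_B - r_W) / (M / 2), where r_B and r_W are the mean ranks of between-group and within-group dissimilarities and M = n(n - 1) / 2 is the number of sample pairs. The following is a minimal sketch of that statistic only, assuming a precomputed symmetric dissimilarity matrix between fingerprints; the function name and the four-sample matrix are hypothetical, and the permutation test for significance and the authors' modified SIMPER step are not shown.

```python
from itertools import combinations


def anosim_r(dist, groups):
    """Compute the ANOSIM R statistic from a square dissimilarity matrix."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    values = [dist[i][j] for i, j in pairs]

    # Rank all pairwise dissimilarities (1-based); ties share their mean rank.
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1

    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within-group and between-group pairs")

    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Hypothetical dissimilarities between four peptide mass fingerprints.
groups = ["fish", "fish", "mammal", "mammal"]
dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.8, 0.9],
    [0.6, 0.8, 0.0, 0.2],
    [0.7, 0.9, 0.2, 0.0],
]
print(anosim_r(dist, groups))  # 1.0 (complete separation of the two groups)
```

R close to 1 means that between-group pairs are consistently more dissimilar than within-group pairs, while R near 0 means the grouping explains little of the variation.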