Scott C Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R Baroi, Alex C Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y Reshitnyk, Katleen Robert, Chris M Roelfsema, Jessica A Sameoto, Alexandre C G Schimel, Jordan A Thomson, Brittany R Wilson, Melisa C Wong, Craig J Brown, Thomas Trappenberg
{"title":"BenthicNet:用于深度学习应用的全球海底图像汇编。","authors":"Scott C Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R Baroi, Alex C Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y Reshitnyk, Katleen Robert, Chris M Roelfsema, Jessica A Sameoto, Alexandre C G Schimel, Jordan A Thomson, Brittany R Wilson, Melisa C Wong, Craig J Brown, Thomas Trappenberg","doi":"10.1038/s41597-025-04491-1","DOIUrl":null,"url":null,"abstract":"<p><p>Advances in underwater imaging enable collection of extensive seafloor image datasets necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering mobilization of this crucial environmental information. Machine learning approaches provide opportunities to increase the efficiency with which seafloor imagery is analyzed, yet large and consistent datasets to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 3.1 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks. The compilation and model are made openly available for reuse.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"230"},"PeriodicalIF":6.9000,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806053/pdf/","citationCount":"0","resultStr":"{\"title\":\"BenthicNet: A global compilation of seafloor images for deep learning applications.\",\"authors\":\"Scott C Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R Baroi, Alex C Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y Reshitnyk, Katleen Robert, Chris M Roelfsema, Jessica A Sameoto, Alexandre C G Schimel, Jordan A Thomson, Brittany R Wilson, Melisa C Wong, Craig J Brown, Thomas Trappenberg\",\"doi\":\"10.1038/s41597-025-04491-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Advances in underwater imaging enable collection of extensive seafloor image datasets necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering mobilization of this crucial environmental information. Machine learning approaches provide opportunities to increase the efficiency with which seafloor imagery is analyzed, yet large and consistent datasets to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 3.1 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks. The compilation and model are made openly available for reuse.</p>\",\"PeriodicalId\":21597,\"journal\":{\"name\":\"Scientific Data\",\"volume\":\"12 1\",\"pages\":\"230\"},\"PeriodicalIF\":6.9000,\"publicationDate\":\"2025-02-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806053/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific Data\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1038/s41597-025-04491-1\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Data","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41597-025-04491-1","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
BenthicNet: A global compilation of seafloor images for deep learning applications.
Advances in underwater imaging enable collection of extensive seafloor image datasets necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering mobilization of this crucial environmental information. Machine learning approaches provide opportunities to increase the efficiency with which seafloor imagery is analyzed, yet large and consistent datasets to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 3.1 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks. The compilation and model are made openly available for reuse.
期刊介绍:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.