Alterations in intestinal microbiota have been identified as a key risk factor in rheumatoid arthritis (RA). This study presents a multidimensional gut microbiota profile from a large cohort of RA patients, stratified by disease stage and treatment regimens, and compared to healthy controls. Our dataset comprises gut microbiota profiles from 2,238 individuals, including 1,034 RA patients (Asia Pacific RA cohort, APRAC) and 1,204 healthy controls. This dataset is enriched with detailed clinical metadata, including patient profiles, treatment histories, and environmental factors, providing a comprehensive "disease exposome" for RA. By integrating 16S rRNA gene sequencing with demographic, clinical, and environmental data, we offer a valuable resource to explore the complex relationships between gut microbiota and RA progression. This large-scale dataset is expected to be a foundation for collaborative research, advancing our understanding of the microbiome's systemic effects in RA and other autoimmune diseases and potentially guiding new therapeutic approaches.
Title: A Comprehensive Dataset on Microbiome Dynamics in Rheumatoid Arthritis from a Large-Scale Cohort Study. Authors: Jing Li, Jun Xu, Jiayang Jin, Congmin Xu, Yuzhou Gan, Yifan Wang, Ruiling Feng, Wenqiang Fan, Yingni Li, Xiaozhen Zhao, Yucui Li, Shushi Gong, Linchong Su, Yueming Cai, Lianjie Shi, Xiaolin Sun, Yang Xiang, Qingwen Wang, Ru Li, Jinxia Zhao, Yulan Liu, Junjie Qin, Zhanguo Li, Jing He. Journal: Scientific Data, 12(1), 232. Pub Date: 2025-02-07. DOI: 10.1038/s41597-025-04422-0. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806058/pdf/
Pub Date: 2025-02-07, DOI: 10.1038/s41597-025-04547-2
Nicolás Vidal-Vázquez, Ismael Hernández-Núñez, Pablo Carballo-Pacoret, Sarah Salisbury, Paula R Villamayor, Francisca Hervas-Sotomayor, Xuefei Yuan, Francesco Lamanna, Céline Schneider, Julia Schmidt, Sylvie Mazan, Henrik Kaessmann, Fátima Adrio, Diego Robledo, Antón Barreiro-Iglesias, Eva Candal
The retina, whose basic cellular structure is highly conserved across vertebrates, constitutes an accessible system for studying the central nervous system. In recent years, single-cell RNA sequencing studies have uncovered cellular diversity in the retina of a variety of species, providing new insights on retinal evolution and development. However, similar data in cartilaginous fishes, the sister group to all other extant jawed vertebrates, are still lacking. Here, we present a single-nucleus RNA sequencing atlas of the postnatal retina of the catshark Scyliorhinus canicula, consisting of the expression profiles for 17,438 individual cells from three female, juvenile catshark specimens. Unsupervised clustering revealed 22 distinct cell types comprising all major retinal cell classes, as well as retinal progenitor cells (whose presence reflects the persistence of proliferative activity in postnatal stages in sharks) and oligodendrocytes. Thus, our dataset serves as a foundation for further studies on the development and function of the catshark retina. Moreover, integration of our atlas with data from other species will allow for a better understanding of vertebrate retinal evolution.
Title: A single-nucleus RNA sequencing atlas of the postnatal retina of the shark Scyliorhinus canicula. Journal: Scientific Data, 12(1), 228. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806052/pdf/
Pub Date: 2025-02-07, DOI: 10.1038/s41597-025-04572-1
Nan Lin, Mengxuan Zheng, Lian Li, Peng Hu, Weifang Gao, Heyang Sun, Chang Xu, Gonglin Yuan, Zi Liang, Yisu Dong, Haibo He, Liying Cui, Qiang Lu
Interictal epileptiform discharge (IED) and its spatial distribution are critical for the diagnosis, classification, and treatment of epilepsy. Existing publicly available datasets suffer from limitations such as insufficient data volume and a lack of spatial distribution information. In this paper, we present a comprehensive EEG dataset containing annotated interictal epileptic data from 84 patients, each contributing 20 minutes of continuous raw EEG recordings, totaling 28 hours. IEDs and states of consciousness (wake/sleep) were meticulously annotated by at least three EEG experts. The IEDs were categorized into five types based on occurrence regions: generalized, frontal, temporal, occipital, and centro-parietal. The dataset includes 2,516 IED epochs and 22,933 non-IED epochs, each 4 seconds long. We developed and validated a VGG-based model for IED detection using this dataset, achieving improved performance with the inclusion of consciousness and/or spatial distribution information. Additionally, our dataset serves as a reliable test set for evaluating and comparing existing IED detection models.
Title: An EEG dataset for interictal epileptiform discharge with spatial distribution information. Journal: Scientific Data, 12(1), 229. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11805897/pdf/
Pub Date: 2025-02-07, DOI: 10.1038/s41597-025-04491-1
Scott C Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R Baroi, Alex C Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y Reshitnyk, Katleen Robert, Chris M Roelfsema, Jessica A Sameoto, Alexandre C G Schimel, Jordan A Thomson, Brittany R Wilson, Melisa C Wong, Craig J Brown, Thomas Trappenberg
Advances in underwater imaging enable collection of extensive seafloor image datasets necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering mobilization of this crucial environmental information. Machine learning approaches provide opportunities to increase the efficiency with which seafloor imagery is analyzed, yet large and consistent datasets to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 3.1 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation and preliminary results suggest it has utility for automating large and small-scale image analysis tasks. The compilation and model are made openly available for reuse.
Title: BenthicNet: A global compilation of seafloor images for deep learning applications. Journal: Scientific Data, 12(1), 230. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806053/pdf/
Pub Date: 2025-02-06, DOI: 10.1038/s41597-025-04502-1
Mark J Lara, Roger Michaelides, Duncan Anderson, Wenqu Chen, Emma C Hall, Caroline Ludden, Aiden I G Schore, Umakant Mishra, Sarah N Scott
Peatlands are prevalent across northern regions, including bogs, fens, marshes, meadows, and select tundra wetlands that all vary in size (e.g., 0.01s to 10s of km²) and shape (e.g., circular to elongated). However, our best remotely sensed products describing the regional-scale distribution of peatland extents are constrained to 1 km² pixels, often representing notable sub-pixel heterogeneity and local-scale uncertainties. Here we develop a new 20 m spatial resolution wall-to-wall ~1.5 million km² peatland map of Alaska, using peat cores, ground observations, and sub-meter resolution image interpretation. Ground data were used to train machine learning classifiers to detect peatlands using a fusion of Sentinel-1 (Dual-polarized Synthetic Aperture Radar), Sentinel-2 (Multi-Spectral Imager), and derivatives from the Arctic Digital Elevation Model (ArcticDEM), which were spatially constrained by a peatland suitability model. Statewide peatland mapping (overall agreement: 85%) identified peatlands to cover 4.6, 10.4, and 5.3% of polar, boreal, and maritime ecoregions, respectively, and 7.3% of the total terrestrial land area. This new dataset will improve the representation of peatland carbon, nutrient, and fire dynamics across Alaska.
Title: A 20 m spatial resolution peatland extent map of Alaska. Journal: Scientific Data, 12(1), 226. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802868/pdf/
Pub Date: 2025-02-06, DOI: 10.1038/s41597-024-04300-1
G Roncoroni, P Koyan, E Forte, J Tronicke, M Pipan
We present a 2D multi-offset, multi-frequency synthetic GPR data set specifically designed to evaluate and test processing, analysis and inversion techniques. The data set replicates realistic subsurface conditions at four sections separated by 2 m. We modeled four multi-offset GPR profiles at 50, 100 and 200 MHz frequencies using realistic wavelets. The data set provides a robust framework for validating advanced GPR algorithms and techniques such as pre-stack depth migration, amplitude versus offset analysis and full waveform inversion. Extensive technical validation ensures data reproducibility and affordability. The standardized, realistic synthetic data set can be used as a reliable benchmark for developing and testing new algorithms and methods, thereby advancing the understanding of subsurface imaging and real-world data interpretation.
Title: A realistic 2D multi-offset, multi-frequency synthetic GPR data set as a benchmark for testing new algorithms. Journal: Scientific Data, 12(1), 221. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802766/pdf/
Pub Date: 2025-02-06, DOI: 10.1038/s41597-024-04154-7
Carolyn B McNabb, Ian D Driver, Vanessa Hyde, Garin Hughes, Hannah L Chandler, Hannah Thomas, Christopher Allen, Eirini Messaritaki, Carl J Hodgetts, Craig Hedge, Maria Engel, Sophie F Standen, Emma L Morgan, Elena Stylianopoulou, Svetla Manolova, Lucie Reed, Matthew Ploszajski, Mark Drakesmith, Michael Germuska, Alexander D Shaw, Lars Mueller, Holly Rossiter, Christopher W Davies-Jenkins, Tom Lancaster, C John Evans, David Owen, Gavin Perry, Slawomir Kusmia, Emily Lambe, Adam M Partridge, Allison Cooper, Peter Hobden, Hanzhang Lu, Kim S Graham, Andrew D Lawrence, Richard G Wise, James T R Walters, Petroc Sumner, Krish D Singh, Derek K Jones
This paper introduces the Welsh Advanced Neuroimaging Database (WAND), a multi-scale, multi-modal imaging dataset comprising in vivo brain data from 170 healthy volunteers (aged 18-63 years), including 3 Tesla (3 T) magnetic resonance imaging (MRI) with ultra-strong (300 mT/m) magnetic field gradients, structural and functional MRI and nuclear magnetic resonance spectroscopy at 3 T and 7 T, magnetoencephalography (MEG), and transcranial magnetic stimulation (TMS), together with trait questionnaire and cognitive data. Data are organised using the Brain Imaging Data Structure (BIDS). In addition to raw data, we provide brain-extracted T1-weighted images, and quality reports for diffusion, T1- and T2-weighted structural data, and blood-oxygen level dependent functional tasks. Reasons for participant exclusion are also included. Data are available for download through our GIN repository, a data access management system designed to reduce storage requirements. Users can interact with and retrieve data as needed, without downloading the complete dataset. Given the depth of neuroimaging phenotyping, leveraging ultra-high-gradient, high-field MRI, MEG and TMS, this dataset will facilitate multi-scale and multi-modal investigations of the healthy human brain.
Title: WAND: A multi-modal dataset integrating advanced MRI, MEG, and TMS for multi-scale brain analysis. Journal: Scientific Data, 12(1), 220. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11803114/pdf/
Pub Date: 2025-02-06, DOI: 10.1038/s41597-025-04541-8
Luigi Cavaleri, Angela Pomaro, Luciana Bertotti, Andrea Bonometto, Luca De Nat, Alvise Papa, Valerio Volpe
The dataset comprises a 45-year-long directional wave time series recorded at the Acqua Alta Oceanographic research Tower (AAOT) since 1979. The AAOT is located in the Northern Adriatic Sea and is managed by the Institute of Marine Sciences of the National Research Council of Italy (CNR-ISMAR). The extent of the time series enables the description of the wave climate in the North Adriatic region and the identification of trends and links with large-scale climate patterns from a single and permanent observational source. Different wave gauges have been used since the start of the measurements, progressively upgraded and repositioned during maintenance operations. The recent addition of more precise instruments and the availability of the related raw dataset allowed for a re-evaluation of the previously published data, while extending the time series. This has enabled the creation of a substantially improved, yet homogeneous, measured dataset, thereby enhancing the reliability of the related long-term scenario analysis.
Title: 45 years of directional wave recorded data at the Acqua Alta oceanographic tower. Journal: Scientific Data, 12(1), 224. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802720/pdf/
Pub Date: 2025-02-06, DOI: 10.1038/s41597-025-04492-0
Rongen Yan, Ping An, Xianghao Meng, Yakun Li, Dongmei Li, Fu Xu, Depeng Dang
A standardized representation and sharing of crop disease and pest data is crucial for enhancing crop yields, especially in China, which features vast cultivation areas and complex agricultural ecosystems. A knowledge graph for crop diseases and pests, acting as a repository of entities and relationships, is crucial conceptually for achieving unified data management. However, there is currently a lack of knowledge graphs specifically designed for this field. In this paper, we propose CropDP-KG, a knowledge graph for crop diseases and pests in China, which leverages natural language processing techniques to analyze data from the Chinese crop diseases and pests image-text database. CropDP-KG covers relevant information on crop diseases and pests in China, featuring 8 primary entities such as diseases, symptoms, and crops, and is organized into 7 relationships such as primary occurrence locations, affected parts and suitable temperature. In total, it includes 13,840 entities and 21,961 relationships. In the case studies presented in this research, we also show a versatile application of CropDP, namely a knowledge service system, and have released its codebase under an open-source license. The content of this paper provides a guide for users to build their own knowledge graphs, aiming to help them effectively reuse and extend the knowledge graphs they create.
A knowledge graph for crop diseases and pests in China. Rongen Yan, Ping An, Xianghao Meng, Yakun Li, Dongmei Li, Fu Xu, Depeng Dang. Scientific Data 12(1), 222. Pub Date: 2025-02-06. DOI: 10.1038/s41597-025-04492-0. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802884/pdf/
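The entity-and-relationship structure that the CropDP-KG abstract describes can be pictured as a set of (head, relation, tail) triples. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class names, the relation identifiers (`affected_part`, `suitable_temperature`, `primary_occurrence_location`), and the placeholder entities (`disease_A`, `region_X`, and so on) are all hypothetical and chosen only to mirror the relationship kinds named in the abstract.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One edge of the graph: head entity, relation name, tail entity."""
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Tiny in-memory triple store (illustrative only)."""

    def __init__(self) -> None:
        self._by_head = defaultdict(list)  # head entity -> list of Triple
        self.entities = set()              # every entity seen as head or tail

    def add(self, head: str, relation: str, tail: str) -> None:
        """Record one relationship and register both entities."""
        self._by_head[head].append(Triple(head, relation, tail))
        self.entities.update((head, tail))

    def query(self, head: str, relation: Optional[str] = None) -> list:
        """Return tail entities linked from `head`, optionally filtered by relation."""
        return [
            t.tail
            for t in self._by_head.get(head, [])
            if relation is None or t.relation == relation
        ]


# Hypothetical placeholder data; none of these values come from CropDP-KG.
kg = KnowledgeGraph()
kg.add("disease_A", "affected_part", "leaf")
kg.add("disease_A", "suitable_temperature", "temperature_range_T")
kg.add("disease_A", "primary_occurrence_location", "region_X")

affected_parts = kg.query("disease_A", "affected_part")
all_neighbours = kg.query("disease_A")
```

Indexing triples by head entity keeps lookups for a single disease cheap; a production graph of the size reported in the abstract would more likely live in a dedicated graph database, which this sketch does not attempt to model.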
Pub Date: 2025-02-06 DOI: 10.1038/s41597-025-04471-5
Emmanuel O Akindele, Abiodun M Adedapo, Oluwaseun T Akinpelu, Esther D Kowobari, Oluwatosin C Folorunso, Ibrahim R Fagbohun, Tolulope A Oladeji, Olanrewaju O Aliu, Oluwatobiloba S Adenola, Babasola W Adu, Francis O Arimoro, Sylvester S Ogbogu, Sami Domisch
The Guineo-Congolian region, extending from Guinea in West Africa to the central part of Africa, is considered an important biodiversity hotspot in the Afrotropics. Beyond the underreporting and underestimation of freshwater ecosystems, incorrect coordinates and taxonomic inaccuracies in freshwater species occurrence data pose another major hurdle that may hinder freshwater conservation efforts in the hotspot. Hence, for any biogeographic analysis, species distribution modelling or conservation initiative, it is crucial to use datasets that are, to the largest possible extent, free of spatial and taxonomic errors. We present the final output of 8,809 occurrences spanning 4 phyla, 8 classes, 32 orders, and 1,104 species. We also added the Hydrography90m stream network attributes to the macroinvertebrate occurrence records, such that the data span 2,890 sub-catchments and Strahler stream orders 1-12. These records are considered valid and can be used for biogeographic analysis of freshwater macroinvertebrates in this important yet understudied freshwater biodiversity hotspot.
A spatial inventory of freshwater macroinvertebrate occurrences in the Guineo-Congolian biodiversity hotspot. Emmanuel O Akindele, Abiodun M Adedapo, Oluwaseun T Akinpelu, Esther D Kowobari, Oluwatosin C Folorunso, Ibrahim R Fagbohun, Tolulope A Oladeji, Olanrewaju O Aliu, Oluwatobiloba S Adenola, Babasola W Adu, Francis O Arimoro, Sylvester S Ogbogu, Sami Domisch. Scientific Data 12(1), 227. Pub Date: 2025-02-06. DOI: 10.1038/s41597-025-04471-5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11802732/pdf/
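The abstract above stresses that occurrence records should be free of spatial errors before they are used. As a rough illustration of what a first-pass coordinate check might look like, the sketch below flags records whose latitude or longitude fall outside valid ranges or sit exactly at (0, 0). This is an assumption about a generic cleaning step, not the procedure the authors used; the `Occurrence` type, the function names, and the placeholder records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """A single species occurrence record (hypothetical minimal schema)."""
    species: str
    latitude: float
    longitude: float


def has_plausible_coordinates(rec: Occurrence) -> bool:
    """Return True if the coordinates pass two basic sanity checks."""
    # Check 1: values must lie within the valid geographic ranges.
    in_range = -90.0 <= rec.latitude <= 90.0 and -180.0 <= rec.longitude <= 180.0
    if not in_range:
        return False
    # Check 2: an exact (0, 0) pair is a common sign of missing coordinates.
    if rec.latitude == 0.0 and rec.longitude == 0.0:
        return False
    return True


def split_records(records):
    """Partition records into (kept, flagged) lists, preserving input order."""
    kept = [r for r in records if has_plausible_coordinates(r)]
    flagged = [r for r in records if not has_plausible_coordinates(r)]
    return kept, flagged


# Placeholder records for illustration; these are not from the dataset.
records = [
    Occurrence("species_a", 7.5, 4.5),     # plausible
    Occurrence("species_b", 0.0, 0.0),     # flagged: null-island pair
    Occurrence("species_c", 95.0, 10.0),   # flagged: latitude out of range
]
kept, flagged = split_records(records)
```

Checks like these catch only gross errors. Deciding whether a point falls in the right country, catchment, or stream segment requires spatial reference layers, which this sketch leaves out.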