Pub Date: 2026-04-01; Epub Date: 2026-01-22; DOI: 10.1016/j.dib.2026.112501
Marcello Abbondio, Alessandro Tanca, Rosangela Sau, Giovanna Pira, Alessandra Errigo, Roberto Manetti, Giovanni Mario Pes, Stefano Bibbò, Maria Pina Dore, Sergio Uzzau
This dataset provides the fecal metaproteome profiles of 28 celiac disease patients on a gluten-free diet, distinguished by the presence or absence of co-occurring autoimmune conditions. The resource includes raw liquid chromatography-tandem mass spectrometry (LC-MS/MS) files, database search results, protein/peptide identification outputs, and taxonomic/functional annotation outputs, along with comprehensive anthropometric, clinical, and dietary metadata for each patient. The identified proteins originate from microbial, human, and plant sources, consistent with the multi-database search strategy used. This collection is designed for reuse in meta-analyses and integrative studies exploring functional changes in the gut microbiome related to autoimmune status and dietary variables. The complete dataset is available via the ProteomeXchange Consortium with the identifier PXD069517.
Title: A human fecal metaproteomic dataset from celiac disease patients on gluten-free diet with or without poly-autoimmunity. Journal: Data in Brief, volume 65, Article 112501.
Pub Date: 2026-04-01; Epub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112448
Simeon Okechukwu Ajakwe, Vivian Ukamaka Ihekoronye, Golam Mohtasin, Rubina Akter, Jae Min Lee, Dong Seong Kim
The rapid proliferation of unmanned aerial vehicles (UAVs) for logistics, surveillance, and civilian applications continues to pose significant challenges to airspace security, particularly through unauthorized or malicious deployments. Existing UAV datasets are limited in scope, often focusing on single-drone scenarios, synthetic imagery, or restricted environmental conditions, which constrains the development of robust counter-UAV systems. To bridge these gaps, we present VisioDECT, a comprehensive and scenario-rich vision-based dataset for multi-drone detection, identification, and neutralization. The dataset comprises 20,924 annotated images and labels from six UAV models (Anafi-Extended, DJI FPV, DJI Phantom, EFT-E410S, Mavic Air 2, and Mavic 2 Enterprise), captured across three distinct scenarios (sunny, cloudy, and evening) at varying altitudes (30–100 m) and distances. All UAVs included in this dataset are rotary-wing (multirotor) platforms, which dominate low-altitude airspace and are the most commonly encountered in real-world surveillance and counter-UAV scenarios. Data were collected over 20 months from more than 12 locations in South Korea, ensuring diversity in illumination, weather, and background complexity. The annotations for each sample are provided in three standard formats (.txt, .xml, .csv), with detailed metadata and quality-verified labels for detection and classification tasks. Illustrative benchmark evaluations using state-of-the-art detection models (e.g., DRONET, YOLO variants) are included solely to validate the quality and practical usability of the dataset for real-time drone defense research. VisioDECT provides a standardized, reproducible, and scalable resource that enables benchmarking, model training, and evaluation for airspace surveillance, UAV traffic management, and national security applications.
Title: VisioDECT: A robust dataset for aerial and scenario based multi-drone detection, identification, and neutralization. Journal: Data in Brief, volume 65, Article 112448.
Pub Date: 2026-04-01; Epub Date: 2026-01-18; DOI: 10.1016/j.dib.2026.112476
Brandon Beltz, Jim Doty, Yvonne Fonken, Nikolos Gurney, Brett Israelsen, Nathan Lau, Stacy Marsella, Rachelle Thomas, Stoney Trent, Peggy Wu, Ya-Ting Yang, Quanyan Zhu
We present datasets from three large-scale human-subject experiments involving red-team hacking in a cyber range, conducted within the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while multi-modal data were collected: self-reports (background, demographics, psychometrics), operational notes, terminal histories, key logs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined data release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE DataPort entries for Experiments 1-3.
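As a rough illustration of how the NIDS alerts might be summarized, the sketch below counts alerts per signature. It assumes the alerts are stored in Suricata's line-delimited EVE JSON format, which the abstract does not confirm; the sample records and signature names are invented for the example and are not taken from the GAMBiT release.

```python
import json
from collections import Counter


def count_alert_signatures(lines):
    """Count Suricata alerts per signature from EVE JSON lines.

    Assumption: one JSON object per line, with alerts marked by
    "event_type": "alert" and described under the "alert" key.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if not isinstance(event, dict) or event.get("event_type") != "alert":
            continue  # ignore flow, dns, http and other event types
        signature = event.get("alert", {}).get("signature", "<unknown>")
        counts[signature] += 1
    return counts


# Invented sample records (hypothetical signatures and addresses).
sample_lines = [
    json.dumps({"event_type": "alert", "src_ip": "10.0.0.5",
                "alert": {"signature": "EXAMPLE port scan", "severity": 2}}),
    json.dumps({"event_type": "flow", "src_ip": "10.0.0.5"}),
    json.dumps({"event_type": "alert", "src_ip": "10.0.0.5",
                "alert": {"signature": "EXAMPLE port scan", "severity": 2}}),
    json.dumps({"event_type": "alert", "src_ip": "10.0.0.7",
                "alert": {"signature": "EXAMPLE data exfiltration", "severity": 1}}),
    "not valid json",
]

signature_counts = count_alert_signatures(sample_lines)
```

In practice the same function could be fed an open file handle, since iterating over a text file yields one line at a time.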
Title: Guarding against malicious biased threats (GAMBiT) datasets: Revealing cognitive bias in human-subjects red-team cyber range operations. Journal: Data in Brief, volume 65, Article 112476.
Pub Date: 2026-04-01; Epub Date: 2026-01-21; DOI: 10.1016/j.dib.2026.112481
Paulo Moreno-Meynard
This dataset documents the spatially explicit quantification of multiple ecosystem functions across 12 mountain headwater catchments in the Aysén Region of Chilean Patagonia. Designed to capture landscape variability, the observational framework employs a paired-catchment approach, comparing basins with different degrees of anthropogenic disturbance across two forest types: deciduous and evergreen. Each catchment is treated as an integrated landscape unit, with cluster-based field measurements capturing fine-scale variation in vegetation structure, biomass, soil conditions, and species richness.
The field inventory integrates and adapts methodologies from several national and international forest monitoring frameworks. Its core structure is based on Chile’s Continuous National Forest Inventory, but also incorporates sampling concepts and measurement protocols inspired by the Swiss National Forest Inventory (LFI), the U.S. Forest Inventory and Analysis (FIA) program, and long-term ecological monitoring plots used in New Zealand. This hybrid design ensures multidimensional assessment of ecosystem functions while enhancing cross-regional comparability.
The sampling design addresses ecosystem functions across four service categories: provisioning (sawlog and firewood volume), regulating (carbon stocks in trees, shrubs, and deadwood, and decadal sequestration rates), supporting (soil formation and erosion proxies, plus nutrient concentrations), and biodiversity maintenance (vascular plants and epiphytes).
This dataset supports ecological synthesis, spatial modeling, and integration into broader assessments of ecosystem services and land-use impacts under changing environmental conditions.
Title: Monitoring ecosystem functions in mountain catchments of Chilean Patagonia: A cluster-based dataset. Journal: Data in Brief, volume 65, Article 112481.
Pub Date: 2026-04-01; Epub Date: 2026-02-06; DOI: 10.1016/j.dib.2026.112549
Luiz Antonio Falaguasta Barbosa, Hernani Mazier Junior, Ivan Rizzo Guilherme, Daniel Carlos Guimarães Pedronette
Brazil is the world’s largest producer of sugarcane (Saccharum officinarum), accounting for approximately 40% of global production, with the state of São Paulo responsible for more than half of the national output due to its high level of mechanization. Despite its economic importance, publicly available datasets integrating information on sugarcane yield and production environment remain scarce.
This is the first freely available dataset that combines crop yield, meteorological, and production environment data, with a large number of observations derived from multiple plots, harvest cycles, and time steps, and that identifies the exact locations of 12 commercial fields in the northeast of São Paulo State, Brazil. It is complemented by Sentinel-2 satellite images, downloaded on the basis of plot shapefiles, and by additional meteorological data for the same locations and the same periods of sugarcane cultivation.
Crop yield and production environment data were shared by a sugar and alcohol plant operating in the region. They were collected at six farms in the northeast of São Paulo State, Brazil, with measurements taken at the plot level on two plots per farm. The number of harvests covered varies by plot. For the period between the planting and harvest dates, complementary data were generated by downloading the Sentinel-2 RGB bands as single-band images and combining them into a single image. The same process is applied to a meteorological dataset: the closest meteorological station is selected to obtain data for the same days between the planting and harvest dates.
Given the unavailability of integrated sugarcane datasets, this resource provides a valuable foundation for studies on crop yield prediction, analysis of production environments, and the development and evaluation of data-driven models in precision agriculture.
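The station-matching step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes plots and stations are given as latitude/longitude pairs, uses great-circle (haversine) distance to pick the closest station, and keeps only daily records between the planting and harvest dates. All coordinates, station names, and field names are hypothetical.

```python
import math
from datetime import date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(plot_lat, plot_lon, stations):
    """Return the station dict closest to the plot centroid."""
    return min(
        stations,
        key=lambda s: haversine_km(plot_lat, plot_lon, s["lat"], s["lon"]),
    )


def records_in_season(records, planting, harvest):
    """Keep daily records dated between planting and harvest, inclusive."""
    return [r for r in records if planting <= r["date"] <= harvest]


# Hypothetical stations and plot location (illustrative values only).
stations = [
    {"name": "station_a", "lat": -21.0, "lon": -47.8},
    {"name": "station_b", "lat": -22.5, "lon": -49.0},
]
closest = nearest_station(-21.1, -47.9, stations)

# Hypothetical daily records for the chosen station.
daily = [
    {"date": date(2020, 3, 1), "rain_mm": 4.2},
    {"date": date(2020, 6, 15), "rain_mm": 0.0},
    {"date": date(2021, 8, 1), "rain_mm": 1.1},
]
season = records_in_season(daily, date(2020, 4, 1), date(2021, 5, 30))
```

Haversine distance is adequate at the scale of a single state; a projected coordinate system would be an equally reasonable choice if the shapefiles already use one.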
Title: A dataset of sugarcane crop yield, production environment, meteorological records, and satellite images of commercial fields in the northeast of São Paulo State, Brazil. Journal: Data in Brief, volume 65, Article 112549.
Pub Date: 2026-04-01; Epub Date: 2026-02-09; DOI: 10.1016/j.dib.2026.112559
Tuomas Sormunen, Ella Mahlamäki, Satu-Marja Mäkelä, Mikko Mäkelä
This dataset presents the first open-access collection of near-infrared hyperspectral imaging (NIR-HSI) data for the optical identification of textiles, with a focus on supporting research in sensor-based textile sorting and recycling. The dataset comprises hyperspectral images, RGB photographs, and detailed metadata, including fibre composition and colour, for 71 post-industrial textile samples collected in Finland. The hyperspectral images contain over 11 million spectra, of which more than 6 million are annotated, providing a robust foundation for machine learning and data analysis. In addition, we provide a single representative NIR spectrum and RGB value for each sample to accommodate classic spectroscopic analysis.
Used garments were sourced from a partner company specializing in end-of-life textile management, with ground truth information on fibre composition obtained from suppliers. Small pieces of each garment were measured using a Specim SWIR 3 hyperspectral camera and photographed with a high-resolution mobile phone camera (Samsung Galaxy A52). The dataset is organized into folders containing raw and processed data, including ENVI-format hyperspectral images, RGB images, and CSV files with mean spectra, mean RGB values, and sample metadata. An example Python script is provided to facilitate data access and processing.
Potential reuse scenarios include classification of textiles by material or colour, prediction of natural fibre content, image segmentation, algorithm development for spectral classification, and use as a reference spectral library. The dataset’s comprehensive structure and open availability address the limitations of previous research, which often relied on small or non-public datasets, and are intended to accelerate advances in optical identification technologies for textile recycling.
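To make the relationship between the annotated hyperspectral pixels and the per-sample mean spectra concrete, here is a small sketch that averages the spectra of annotated pixels in a cube. It is not the example script shipped with the dataset; the nested-list cube layout, the boolean mask, and the function name are assumptions chosen so the example runs without any third-party library.

```python
def mean_spectrum(cube, mask):
    """Average the spectra of all annotated pixels.

    cube[row][col] is a list of band values for one pixel;
    mask[row][col] is True where the pixel is annotated as textile.
    Both layouts are assumptions made for this illustration.
    """
    totals = None
    count = 0
    for cube_row, mask_row in zip(cube, mask):
        for spectrum, annotated in zip(cube_row, mask_row):
            if not annotated:
                continue  # background or unlabelled pixel
            if totals is None:
                totals = [0.0] * len(spectrum)
            for band, value in enumerate(spectrum):
                totals[band] += value
            count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [total / count for total in totals]


# Tiny invented 2 x 2 cube with three bands per pixel.
cube = [
    [[1.0, 2.0, 3.0], [9.0, 9.0, 9.0]],
    [[3.0, 4.0, 5.0], [0.0, 0.0, 0.0]],
]
mask = [
    [True, False],
    [True, False],
]
representative = mean_spectrum(cube, mask)
```

With real ENVI files the cube would normally be loaded as an array and the same averaging done with vectorized operations; the loop form is used here only to keep the logic explicit.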
Title: Introducing OpenTextile-NIR: Near-infrared hyperspectral imaging and photography dataset for optical identification of textiles. Journal: Data in Brief, volume 65, Article 112559.
Pub Date: 2026-04-01; Epub Date: 2026-01-27; DOI: 10.1016/j.dib.2026.112510
Yunfan Huang, Ying Qiao, Ruifang Chen, Lanfang Dong, Haijuan Liu, Theerakamol Pengsakul, Xiaowan Ma
The “living fossil” Tachypleus tridentatus holds significant medical and economic value but is currently experiencing a severe decline in germplasm resources. The extended incubation period of T. tridentatus eggs makes them susceptible to invasion by pathogenic microorganisms, with fungal infections posing a major threat to embryonic development. However, the molecular mechanisms underlying embryonic immunity in T. tridentatus remain poorly understood. We collected T. tridentatus embryos at stages 18–20 that were naturally infected with Aspergillus candidus under aquaculture conditions, and conducted RNA sequencing to analyze the transcriptomic response to the fungal infection.
Title: RNA-seq data of healthy and fungal infected Tachypleus tridentatus embryos. Journal: Data in Brief, volume 65, Article 112510.
There is a sustained demand for biofertilizers to enhance crop productivity. Endophytic bacteria associated with disease-tolerant rice varieties offer significant potential as biofertilizers; however, the bacteriome diversity within these plants remains underexplored. This dataset presents full-length 16S metagenomic sequences of endophytic bacteria isolated from the roots of blast-infected and uninfected rice plants. Root samples were processed and subjected to surface sterilisation. Following total genomic DNA extraction, sequencing was performed using 16S ribosomal RNA primers via the high-throughput Oxford Nanopore Technologies platform. The raw sequence data were filtered for quality control using NanoFilt. Subsequently, the sequences were aligned against the National Center for Biotechnology Information (NCBI) 16S RefSeq database to identify the species of the endophytic root bacteria. The data associated with this project have been registered in the NCBI BioProject database under accession number PRJNA992961. The dataset comprises two distinct sample groups, each analysed in duplicate, with sequencing yields ranging from 17.7 to 20.3 Mb. Consequently, this dataset provides valuable insights regarding the comparative composition of endophytic bacteria inhabiting healthy roots versus those found in blast-infected rice. Characterizing this diversity, particularly within healthy rice plants, is essential for foundational research underpinning the future development of biofertilizers.
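The quality-control step can be illustrated with a small read filter. This sketch is not NanoFilt's code and does not reproduce the authors' settings, which the abstract does not state: the thresholds, record layout, and function names are hypothetical. It averages read quality through error probabilities rather than raw Phred scores, which is the usual way to obtain a meaningful mean quality for long reads.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality, averaged over per-base error probabilities."""
    if not quality_string:
        raise ValueError("empty quality string")
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def filter_reads(records, min_quality, min_length, max_length):
    """Yield (header, sequence, quality) tuples that pass both filters.

    All thresholds are supplied by the caller; none comes from the dataset.
    """
    for header, sequence, quality in records:
        if not sequence or len(sequence) != len(quality):
            continue  # malformed record
        if not (min_length <= len(sequence) <= max_length):
            continue
        if mean_phred(quality) < min_quality:
            continue
        yield header, sequence, quality


# Invented toy reads; real full-length 16S reads are far longer.
reads = [
    ("read_1", "ACGT" * 3, "I" * 12),   # Q40 throughout, length 12
    ("read_2", "ACGT", "I" * 4),        # too short
    ("read_3", "ACGT" * 3, "#" * 12),   # Q2 throughout, low quality
]
kept = list(filter_reads(reads, min_quality=10, min_length=8, max_length=20))
```

For real data the length window would be set around the expected amplicon size, since the reads are described as full-length 16S sequences.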
Title: Metabarcoding data: Full-length 16S rRNA sequence of endophytic bacteria in the root of asymptomatic and blast-symptomatic rice plants (Oryza sativa, L.). Authors: Yasir Sidiq, Triastuti Rahayu, Peni Indrayudha, Erma Musbita Tyastuti, Azmi Zaki Waliudin Althaf, Banuwati Kartika Sari. DOI: 10.1016/j.dib.2026.112522. Journal: Data in Brief, volume 65, Article 112522. Pub Date: 2026-04-01.
Pub Date: 2026-04-01; Epub Date: 2026-01-07; DOI: 10.1016/j.dib.2025.112444
Utshob Sutradhar, Priyankar Biswas, Sumon Hossain, A.T.M. Saiful Islam, Shuvo Dev
This paper presents an electrical power dataset collected from a university campus in Bangladesh, intended to support research on energy forecasting in university settings. The dataset contains hourly measurements of system voltage, three-phase currents (R, Y, B), and power factor (pf), recorded at the campus substation. Data were collected under different operational conditions, including academic periods and vacations, which provides insight into load behaviour, power factor variation, and phase imbalance patterns in an educational setting. The dataset supports the development and assessment of models for load forecasting, anomaly detection, and power efficiency improvement. It was also combined with weather data (temperature, humidity, precipitation, wind speed, and solar radiation) to support weather-aware load forecasting. All weather values are matched to the energy values and were gathered at hourly and daily resolution. The dataset is especially useful for researchers studying applications of artificial intelligence and machine learning in electrical energy management. It also includes contextual notes, such as reduced load during national holidays, which make it more useful for event-focused forecasting studies. Released as open access, the dataset helps fill the gap in publicly available electrical load data from educational institutions in developing countries, supporting reproducible research and sustainable campus energy management.
{"title":"UniEload: Electrical load dataset for energy forecasting applications at public universities in Bangladesh","authors":"Utshob Sutradhar, Priyankar Biswas, Sumon Hossain, A.T.M. Saiful Islam, Shuvo Dev","doi":"10.1016/j.dib.2025.112444","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65","pages":"Article 112444"},"PeriodicalIF":1.4,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145976668"}
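The UniEload entry above lists voltage, three-phase currents (R, Y, B) and power factor as the recorded quantities. As a hedged illustration of how load and phase imbalance could be derived from such readings, the sketch below uses standard three-phase relations. It rests on assumptions that the abstract does not confirm: the voltage is taken to be a line-to-line value in volts, the currents are in amperes, a single power factor applies to all phases, and the function and parameter names are hypothetical. The unbalance measure is the maximum deviation from the mean current divided by the mean, one commonly used definition rather than necessarily the authors' own.

```python
import math


def apparent_power_kva(v_line, i_r, i_y, i_b):
    """Total apparent power in kVA from line-to-line voltage and phase currents.

    Assumes v_line is line-to-line (V) and currents are per-phase (A).
    """
    v_phase = v_line / math.sqrt(3)  # phase (line-to-neutral) voltage
    return v_phase * (i_r + i_y + i_b) / 1000.0


def real_power_kw(v_line, i_r, i_y, i_b, pf):
    """Real power in kW, assuming one power factor shared by all phases."""
    return apparent_power_kva(v_line, i_r, i_y, i_b) * pf


def current_unbalance_pct(i_r, i_y, i_b):
    """Maximum deviation from the mean phase current, as a percentage of the mean."""
    currents = (i_r, i_y, i_b)
    mean = sum(currents) / 3.0
    if mean == 0:
        return 0.0
    return max(abs(i - mean) for i in currents) / mean * 100.0
```

For example, under these assumptions an hourly record with phase currents of 90 A, 100 A and 110 A has a current unbalance of 10 %, since the largest deviation from the 100 A mean is 10 A.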
Pub Date: 2026-04-01; Epub Date: 2026-01-07; DOI: 10.1016/j.dib.2025.112434
Camille Marchal, Damien Ballan, Sarra Azib, Morgane Innocent, Bertrand Urien, Annick Tamaro, Marine Le Gall-Ely, Emmanuel Coton, Adeline Picot, Jérôme Mounier, Louis Coroller, Patrick Gabriel
Fresh fruits and vegetables (FFV) represent the largest share of food waste at the consumer level. This waste results directly from FFV physiological and microbiological spoilage, which is itself closely linked to behavioural factors: consumer practices (purchase, storage and hygiene) and consumers’ perceptions of spoilage. Using a dual approach combining microbiological and behavioural sciences, we examined the link between three elements for 49 volunteer French households: the FFV waste they produced, measured using connected bins; the microbial ecology of their storage compartments, assessed using culture-dependent and -independent approaches; and their consumer behaviour, cleaning and storage practices, documented through in-depth interviews and a dedicated survey. An exploratory qualitative survey of 17 individuals, followed by two quantitative data collections from 1048 and 815 representative French consumers, enabled us to identify anti-FFV waste practices and to cluster consumers according to their anti-FFV waste behaviours. The spoilage dynamics of commonly consumed FFV were also studied under controlled conditions as a function of storage temperature, microbial contamination level and the presence or absence of surface wounds. This citizen-science-based dataset covers a wide array of microbiological and behavioural factors related to domestic FFV waste, as well as direct measurements of waste volumes obtained through the innovative use of connected bins. Altogether, these data could inform more effective and accessible guidelines for FFV waste reduction at the consumer level, and thus contribute to a potential reduction of global food waste and its related costs.
{"title":"Participatory and multi-disciplinary science dataset and surveys for the assessment of the microbiological and behavioural factors influencing fresh fruits and vegetables' waste at home","authors":"Camille Marchal, Damien Ballan, Sarra Azib, Morgane Innocent, Bertrand Urien, Annick Tamaro, Marine Le Gall-Ely, Emmanuel Coton, Adeline Picot, Jérôme Mounier, Louis Coroller, Patrick Gabriel","doi":"10.1016/j.dib.2025.112434","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65","pages":"Article 112434"},"PeriodicalIF":1.4,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145976666"}