Pub Date: 2026-01-10; DOI: 10.1016/j.dib.2026.112462
Takaki Nishio, Yuki Kawae
The conservation of marine resources and the mitigation of marine pollution require strengthened knowledge of marine biodiversity, particularly in the deep sea. Videos and images are valuable for documenting the distribution of deep-sea organisms, but manual processing is labor-intensive and yields variable results, underscoring the need for automated methods. To address this, the J-EDI Organism Detection Dataset (JODD) is introduced. The dataset comprises 8,151 images and 15,621 bounding boxes annotated in the Common Objects in Context (COCO) format. The images were captured during deep-sea surveys conducted by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) between 1984 and 2021, using remotely operated vehicles (ROVs) and human-occupied vehicles (HOVs). All images were derived from publicly available videos in JAMSTEC’s E-library of Deep-sea Images (J-EDI). The dataset includes 20 object categories (19 biological groups and one machine category), providing a reusable resource for developing and benchmarking machine learning models for the automatic detection of deep-sea organisms.
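For readers unfamiliar with the COCO layout, the following is a minimal sketch of counting bounding boxes per category in a COCO-style annotation structure. The small in-memory example, the file name, and the category names are illustrative assumptions and are not taken from JODD. Only the top-level keys `images`, `annotations`, and `categories` and the `[x, y, width, height]` box convention come from the COCO format itself.

```python
import json
from collections import Counter


def boxes_per_category(coco: dict) -> dict:
    """Count annotations (bounding boxes) per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])
    return dict(counts)


# Illustrative stand-in for an annotation file; all names here are hypothetical.
example = json.loads("""
{
  "images": [{"id": 1, "file_name": "frame_0001.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 1, "name": "fish"}, {"id": 2, "name": "machine"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [12, 30, 100, 60]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [200, 90, 80, 40]},
    {"id": 12, "image_id": 1, "category_id": 2, "bbox": [300, 300, 50, 50]}
  ]
}
""")

print(boxes_per_category(example))
```

With a real annotation file, the same function could be applied to the result of `json.load` on that file. Per-category counts of this kind are a common first check for class imbalance before training a detector.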
Title: Deep-sea image dataset for organism detection. Journal: Data in Brief, vol. 65, Article 112462.
Type-2 diabetes is a major public health concern in Bangladesh, and this dataset provides 1065 curated patient records with demographic, anthropometric, and clinical variables relevant to its assessment. The data were collected during routine clinical visits and recorded by trained staff, with checks to ensure accuracy and completeness. It includes basic details like age, pregnancy count, body mass index, and skin-fold thickness; vital signs such as blood pressure; lab results related to blood sugar (fasting glucose and insulin); the Diabetes Pedigree Function; and a simple yes/no label for Type-2 diabetes. A few values are missing for diastolic blood pressure and skin-fold thickness, so users should handle these carefully. Since the data are cross-sectional and come from patients seeking care, there are more diabetic cases (840) than non-diabetic cases (225). The dataset is intended for reuse in method development (for example, machine-learning classifier training, feature-selection benchmarking, and oversampling/imputation research), for context-specific epidemiologic description and model validation in South Asian clinical settings, and as a teaching resource for reproducible biomedical-data workflows.
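Because the two classes are unevenly represented (840 diabetic versus 225 non-diabetic records), a classifier trained on these data may benefit from class weighting. The sketch below computes inverse-frequency weights from the counts stated in the abstract. The weighting formula is one common heuristic chosen here for illustration, not a method described by the dataset authors, and the label names are hypothetical.

```python
def balanced_class_weights(counts: dict) -> dict:
    """Inverse-frequency weights: n_total / (n_classes * n_class)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Class counts as reported in the abstract; label names are illustrative.
class_counts = {"diabetic": 840, "non_diabetic": 225}
weights = balanced_class_weights(class_counts)

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

The minority class receives the larger weight, so each non-diabetic record contributes more to a weighted loss than each diabetic record.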
Title: A clinical dataset on type-2 diabetes including demographic, anthropometric, and biochemical parameters from Bangladesh. Authors: Md. Younus Bhuiyan, Shahriar Siddique Ayon, Md. Ebrahim Hossain, Md. Saef Ullah Miah, Afjal H. Sarower, Fateha khanam Bappee. Journal: Data in Brief, vol. 65, Article 112457. Pub Date: 2026-01-10. DOI: 10.1016/j.dib.2026.112457
Industrial hemp cultivation is expanding and requires reliable monitoring for legal compliance and agricultural management. This paper presents a standardized UAV-based multisensor framework designed for Cannabis sativa L. It integrates RGB, multispectral, and thermal imaging as core modules, with hyperspectral and LiDAR as optional extensions. The framework sets protocols for sensor integration, flight planning, field measurements, and annotation, ensuring datasets that meet EU altitude limits (≤120 m AGL). Multi-altitude and multi-time-of-day acquisitions are proposed to capture spatial and diurnal variability. These data improve model robustness for phenotyping, stress detection, and THC compliance verification. Potential applications include precision agriculture, breeding, regulatory monitoring, environmental assessment, and illicit crop detection. Open-access datasets generated through this framework will support reproducibility, machine learning development, and collaboration among researchers, farmers, and regulators.
Title: A UAV-based multisensor framework for legal industrial Cannabis monitoring and open-access dataset development. Authors: Genta Rexha, Ina Papadhopulli, Aleksandër Biberaj, Elson Agastra, Enida Sheme, Elinda Meçe. Journal: Data in Brief, vol. 65, Article 112463. Pub Date: 2026-01-09. DOI: 10.1016/j.dib.2026.112463
This dataset provides a comprehensive genomic and pathogenicity profiling of Staphylococcus aureus strain IHS3A, a methicillin-resistant (MRSA) clinical isolate obtained from a healthcare worker in a teaching hospital in Jordan, Middle East. Whole genome sequencing was performed using the Illumina NextSeq 2000 platform, followed by high-quality de novo assembly using SPAdes. The genome spans 2,821,373 bp across 90 contigs, with a GC content of 32.78%, and demonstrates high-quality metrics, including 99.67% completeness and minimal contamination (0.08%). The genome analysis identified 2611 predicted protein-coding sequences. Multilocus sequence typing (MLST) assigned the isolate to ST10647, SCCmec typing revealed type IVc (2B), and spa typing identified t131. The dataset includes comprehensive annotations of key antimicrobial resistance genes, such as mecA (methicillin resistance), blaZ (penicillin resistance), and lmrS (macrolide efflux), as well as virulence factors related to adherence (e.g., atl, clfA), immune evasion (e.g., scn, adsA), secretion systems (e.g., esaA, esaB), and toxins (e.g., hla, lukF-PV, tsst). Secondary metabolite biosynthetic gene clusters, such as staphyloferrin B and staphylopine, were identified. The genome also encodes a diverse carbohydrate-active enzyme (CAZyme) profile. These genomic data are valuable for further research on MRSA evolution, resistance mechanisms, and virulence factors in Jordan and the Middle East. The genome data have been deposited in the NCBI database under the accession number JBPPGA000000000, with a direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/JBPPGA000000000.1. Bioproject: PRJNA1283614, Biosample: SAMN49700843.
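Summary figures such as contig count, total length, and GC content can be recomputed from an assembly in FASTA format. The following is a minimal sketch using only the Python standard library. The toy sequences are invented for illustration and are unrelated to the IHS3A assembly, and the function names are hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {contig_name: sequence}, joining wrapped lines."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in contigs.items()}


def assembly_stats(contigs: dict) -> dict:
    """Return contig count, total length in bp, and GC content in percent."""
    total = sum(len(seq) for seq in contigs.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    gc_percent = 100.0 * gc / total if total else 0.0
    return {"contigs": len(contigs), "length_bp": total, "gc_percent": gc_percent}


# Toy input, not real data.
toy_fasta = """>contig_1 toy example
ATGCATGC
AT
>contig_2
GGCCAATT
"""

stats = assembly_stats(parse_fasta(toy_fasta))
print(stats)
```

This sketch counts only G and C against all characters, so ambiguous bases such as N would lower the reported GC percentage. Dedicated assembly-assessment tools handle such cases more carefully.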
Title: Draft genome data analysis and pathogenicity profiling of Staphylococcus aureus strain IHS3A with antibiotic resistance genes isolated from a hospital in Jordan. Authors: Saqr Abushattal, Sulaiman M. Alnaimat, Nidal Odat, Mahmoud Abushattal. Journal: Data in Brief, vol. 64, Article 112453. Pub Date: 2026-01-08. DOI: 10.1016/j.dib.2026.112453
Pub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112455
Chao LI, Chen Zhang, Wenbo Zhang, Chengzhen LV, Yaqiang Li, Yufen Wang
This study employed an HY-6010-S hyperspectral imaging system, covering a spectral range of 400–1000 nm, combined with an RGB industrial camera to acquire multimodal data. The dataset simulates phenotypic analysis scenarios of maize seeds under controlled laboratory conditions, with the ambient temperature maintained at 20–25°C. It covers 12 maize varieties, with approximately 200 seed samples collected per variety, for a total of about 2400 samples, each subjected to hyperspectral and RGB image acquisition. Preprocessing steps included noise reduction, background removal, band selection, and modality alignment. To ensure the accuracy and reliability of the experimental data, HHIT software and Python were used for data processing. The dataset can support seed variety classification, phenotypic analysis, precision agriculture, and machine learning applications.
Title: Corn seed dataset based on hyperspectral and RGB images. Journal: Data in Brief, vol. 65, Article 112455.
Pub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112452
Oguz Akbilgic, Ibrahim Karabayir, Luke Patterson, Stephanie B. Dixon, Daniel A. Mulrooney, Kirsten K. Ness, Melissa M. Hudson
Childhood cancer survivors (CCS), previously exposed to cardiotoxic treatments such as anthracyclines and chest radiation, are at lifelong risk of cardiovascular complications. Current guidelines recommend periodic echocardiographic surveillance, but adherence rates are as low as 41%. This dataset provides paired same-day 12-lead clinical electrocardiograms (ECGs) and single-lead wearable ECG recordings from the Apple Watch, collected from adult CCS participating in the St. Jude Lifetime Cohort Study (SJLIFE). The availability of paired wearable and clinical ECGs enables the development and validation of remote AI-based cardiac screening tools, potentially leading to more precise long-term cardiovascular surveillance in this population. Using this dataset, researchers can assess whether the performance of an AI model developed on clinical ECGs can be replicated when the model is applied to ECGs from an Apple Watch.
Title: Paired clinical 12-lead and Apple Watch electrocardiogram data repository from childhood cancer survivors. Journal: Data in Brief, vol. 65, Article 112452.
Pub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112456
Marie-Liesse Vermeire, Pathé Basse, Samuel Legros, Falilou Diallo, Anne Desnues, Frédéric Feder
Recycling the growing stock of organic waste products (OWP) from cities, factories, and farms is a key challenge for sustainable agriculture. However, it must be done with awareness of both performance and potential long-term environmental and health risks. In this context, the SOERE PRO observatory was established. SOERE PRO stands for "Systèmes d'Observation et d'Expérimentation pour la Recherche en Environnement - Produits Résiduaires Organiques" ("Long-term Observation and Experimentation Systems for Environmental Research - Organic Waste Products"), a label granted by the French National Research Alliance for the Environment (AllEnvi) to recognize high-quality research infrastructures. The observatory includes the trial in Sangalkam, in the Dakar region of Senegal, where these data are collected. Since 2016, four fertilizer types - one mineral (synthetic) and three organic - have been applied annually to three successive vegetable crops (tomato, lettuce, carrot). The dataset currently covers the period 2016–2025, with data collection ongoing and new data to be added in the future. Manual weeding and hoeing are carried out regularly for each crop, and no pesticides are used for crop protection in the trial. A comprehensive, multi-variable dataset is consistently documented, including soil physico-chemical parameters measured annually at three depths, organic waste product characterization, crop yield and quality parameters, and detailed management activities, making it particularly suitable for process-based modelling and long-term impact assessment. The originality of this dataset lies in its long duration, the diversity of organic and mineral fertilization strategies, the inclusion of multiple vegetable crops per year, and its location under Sub-Sahelian conditions, a context for which long-term agronomic datasets remain scarce. All soil, OWP, and vegetable samples are stored in a sample bank in Dakar and are available for additional analyses.
The objective of this dataset is to provide long-term, integrated information on crop productivity, crop quality, and soil responses to repeated organic and mineral fertilization in a Sub-Sahelian market-gardening system. The dataset is publicly available through a Dataverse repository for free (re)use in meta-analyses, process-based modelling, and environmental studies, notably to improve understanding of nutrient cycling, contaminant dynamics, soil biodiversity, and long-term soil functioning in Sub-Sahelian agroecosystems, and to support sustainable land management and food security in Southern countries under future climate change.
Title: Soil and crop data from a long-term organic fertilization trial in Sub-Sahelian market gardening. Journal: Data in Brief, vol. 65, Article 112456.
Pub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112450
Xinchao Song, Mingjun Li, Sean Banerjee, Natasha Kholgade Banerjee
We present the HILO dataset, consisting of high-resolution 3D scanned models of 253 common-use objects and 32,256 multi-viewpoint RGB-D images, typically of low resolution, of 144 tabletop scenes. Each scene is a collection of 10 objects drawn at random from the set of 253 objects. The dataset provides the 6-degree-of-freedom (6DOF) pose of every object found in each of the 32,256 RGB-D images, obtained by performing precise 3D alignment of the 3D models to the RGB-D images. The dataset also contains metadata on object mass, a short text descriptor, binning into everyday-use classes, and aspect ratio and function categories, as well as intrinsic parameter information for the RGB-D sensors used in capture and transformations between camera poses. Object 3D models in the dataset were acquired with a tabletop 3D scanner, then manually inspected, cleaned, repaired, and exported as original ultra-high-resolution meshes at ∼1M vertices and simplified high-resolution meshes at ∼10k vertices. To capture the multi-view RGB-D images, we established an in-house testbed consisting of a turntable and two robotic manipulators that cover azimuth and elevation angles, respectively, and together span a hemisphere. Images were captured using two Microsoft Azure Kinect sensors, one mounted at the wrist of each robot. We captured images at two distances, forming hemispherical shells. We used in-house software written in Python to control the turntable movement, robot motion, and image capture, as well as to perform camera calibration, processing to generate registered images and foreground masks, manual precise alignment of object models to images, and post-capture correction of misalignments in camera transformation parameters.
The dataset enables training and evaluation of algorithms for several tasks in computer vision, artificial intelligence (AI), and robotics, such as object completion, recognition, segmentation, high-resolution structure generation, robotic grasp planning, and recognition of human-preferred grasp locations for human-robot collaboration.
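To illustrate how a 6DOF pose and camera intrinsics relate a 3D model to an image, the sketch below transforms a model vertex into the camera frame and projects it with a pinhole model. Everything in it is an assumption made for illustration: the object-to-camera pose convention, the function names, and the intrinsic values are hypothetical and are not taken from the HILO files or from Azure Kinect calibration data, and lens distortion is ignored.

```python
def transform_point(R, t, p):
    """Rigid transform p_cam = R * p_obj + t (R: 3x3 nested list, t and p: 3-vectors)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def project_pinhole(p_cam, fx, fy, cx, cy):
    """Project a camera-frame point to pixel coordinates with a pinhole model."""
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point is not in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical pose: no rotation, object placed 2 m in front of the camera.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_obj_to_cam = (0.0, 0.0, 2.0)

# Hypothetical intrinsics (focal lengths and principal point, in pixels).
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0

vertex_obj = (0.1, -0.05, 0.0)  # a model vertex in object coordinates, in meters
vertex_cam = transform_point(R_identity, t_obj_to_cam, vertex_obj)
pixel = project_pinhole(vertex_cam, fx, fy, cx, cy)
print(vertex_cam, pixel)
```

Applying the same two steps to every vertex of a mesh gives the model's footprint in the image, which is one way to check visually whether a pose annotation lines up with the RGB-D frame.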
Title: Dataset of RGB-D images of object collections from multiple viewpoints with aligned high-resolution 3D models of objects. Journal: Data in Brief, vol. 65, Article 112450.
Pub Date: 2026-01-07; DOI: 10.1016/j.dib.2025.112432
Nicole Nawrot, Jacek Kluska
Cultivating the energy crop Miscanthus × giganteus (M×g) on marginal soil supports phytoattenuation and provides high-energy biomass for biofuel production. Improving nutrient-poor soil with low-cost recovered organic amendments, such as spent coffee grounds (SCG) and SCG-derived biochar (BC), offers sustainable benefits. This data article presents the findings from a medium-term greenhouse experiment at the Gdansk University of Technology assessing M×g cultivation on marginal soil amended with SCG and BC. In a pot-scale experiment, the medium-term effects on M×g biomass growth, photosynthesis parameters, root tissue development, and final elemental composition were examined. Soil pH and elemental composition were also determined. As global coffee consumption increases, large quantities of SCG are generated and often landfilled. Their beneficial reuse aligns with circular economy principles and Sustainable Development Goals (SDGs 7 and 13), providing both a short-term nutrient source and a means of improving soil quality and resilience. The article compiles five datasets detailing: (1) M×g growth parameters, tissue development, and photosynthetic indices; (2) nutrient and caffeine leaching behaviour; and (3) elemental composition of plants and soils following exposure. These datasets, available in the Bridge of Knowledge Gdansk University of Technology repository, provide a resource for environmental researchers, soil and plant scientists, biochar specialists, and decision-makers working to restore marginal soil usability. This study promotes sustainable land management by demonstrating how organic wastes and biochar can be combined to improve crop performance, sequester carbon, and reduce nutrient losses while minimizing external fertilizer inputs.
{"title":"Effects of raw and thermally processed spent coffee grounds on Miscanthus × giganteus plantation: Data description","authors":"Nicole Nawrot , Jacek Kluska","doi":"10.1016/j.dib.2025.112432","DOIUrl":"10.1016/j.dib.2025.112432","url":null,"abstract":"<div><div>Cultivating <em>Miscanthus × giganteus</em> (<em>M</em> <em>×</em> <em>g</em>) energy crop on marginal soil supports phytoattenuation and provides high-energy biomass for biofuel production. Improving nutrient-poor soil with low-cost recovered organic amendments, such as spent coffee grounds (SCG) and SCG-derived biochar (BC) offers sustainable benefits. This data article presents the findings from a medium-term greenhouse experiment at the Gdansk University of Technology assessing <em>M</em> <em>×</em> <em>g</em> cultivation on marginal soil with SCG and BC amendments into soil. In a pot-scale experiment the medium term-effect on <em>M</em> <em>×</em> <em>g</em> biomass growth, photosynthesis parameters, root tissues development, as well as final elemental composition was examined. Soil pH and elemental composition were also determined. As global coffee consumption increases, large quantities of SCG are generated and often landfilled. Their beneficial reuse aligns with circular economy principles and Sustainable Development Goals (SDGs 7 and 13), providing both a short-term nutrient source and a means of improving soil quality and resilience. The article compiles five datasets detailing: (1) <em>M</em> <em>×</em> <em>g</em> growth parameters, tissue development, and photosynthetic indices, (2) nutrient and caffeine leaching behaviour; and (3) elemental composition of plants and soils following exposure. These datasets, available in the Bridge of Knowledge Gdansk University of Technology repository, provide a resource for environmental researchers, soil and plant scientists, biochar specialists, and decisionmakers working to restore marginal soil usability. This study promotes sustainable land management by demonstrating how organic wastes and biochar can be combined to improve crop performance, sequester carbon, and reduce nutrient losses while minimizing external fertilizer inputs.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"64 ","pages":"Article 112432"},"PeriodicalIF":1.4,"publicationDate":"2026-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145973351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-07 DOI: 10.1016/j.dib.2025.112434
Camille Marchal, Damien Ballan, Sarra Azib, Morgane Innocent, Bertrand Urien, Annick Tamaro, Marine Le Gall-Ely, Emmanuel Coton, Adeline Picot, Jérôme Mounier, Louis Coroller, Patrick Gabriel
Fresh fruits and vegetables (FFV) represent the largest share of food waste at the consumer level. This waste results directly from physiological and microbiological spoilage of FFV, which is itself closely linked to behavioural factors: consumer practices, including purchase, storage and hygiene practices, as well as consumers’ perceptions of spoilage. Using a dual approach combining microbiological and behavioural sciences, we examined the link between three elements: the FFV waste produced by 49 volunteer French households, measured using connected bins; the microbial ecology of their storage compartments, characterized using culture-dependent and culture-independent approaches; and their consumer behaviour, cleaning and storage practices, documented through in-depth interviews and a dedicated survey. An exploratory qualitative survey of 17 individuals, followed by two quantitative data collections among 1048 and 815 representative French consumers, enabled us to identify anti-FFV waste practices and to cluster consumers according to their anti-FFV waste behaviours. Spoilage dynamics of commonly consumed FFV were also assessed under controlled conditions as a function of storage temperature, microbial contamination level and the presence or absence of surface wounds. This citizen-science-based dataset covers a wide array of microbiological and behavioural factors related to domestic FFV waste, as well as real measurements of waste volumes obtained through the innovative use of connected bins. Altogether, these data could inform more effective and accessible guidelines for reducing FFV waste at the consumer level, and thus contribute to a potential reduction of global food waste and its related costs.
{"title":"Participatory and multi-disciplinary science dataset and surveys for the assessment of the microbiological and behavioural factors influencing fresh fruits and vegetables' waste at home","authors":"Camille Marchal , Damien Ballan , Sarra Azib , Morgane Innocent , Bertrand Urien , Annick Tamaro , Marine Le Gall-Ely , Emmanuel Coton , Adeline Picot , Jérôme Mounier , Louis Coroller , Patrick Gabriel","doi":"10.1016/j.dib.2025.112434","DOIUrl":"10.1016/j.dib.2025.112434","url":null,"abstract":"<div><div>Fresh fruits and vegetables (FFV) represent the largest part of food waste at the consumer level. This waste directly results from FFV physiological and microbiological spoilage, itself intricately linked to behavioural factors such as consumer practices, including purchase, storage and hygiene practices, but also consumers’ perceptions towards spoilage. Based on a dual approach combining microbiological and behavioural sciences, we examined the link between FFV waste produced by 49 volunteering French households, measured using connected bins, the microbial ecology of their storage compartments, using culture-dependent and -independent approaches, and their consumer behaviour, cleaning and storage practices, through in-depth interviews and a dedicated survey. An exploratory qualitative survey carried out on 17 individuals followed by two quantitative data collections on 1048 and 815 representative French consumers enabled us to identify anti-FFV waste practices and to cluster consumers according to their anti-FFV waste behaviours. Spoilage dynamics of commonly consumed FFV, according to storage temperature, microbial contamination level and the presence or absence of surface wounds, were also performed in controlled conditions. This citizen-science-based dataset covers a wide array of microbiological and behavioural factors related to domestic FFV waste, as well as real measurements of waste volumes thanks to the innovative use of connected bins. Altogether, this data could provide interesting insights into more effective and accessible guidelines for FFV waste reduction at the consumer level, and thus to a potential reduction of global food waste and its related costs.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112434"},"PeriodicalIF":1.4,"publicationDate":"2026-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145976666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}