Dataset of high-speed camera measurements from impact-tested reinforced concrete beams
Pub Date: 2026-01-21 | DOI: 10.1016/j.dib.2026.112487 | Data in Brief 65, Article 112487
Viktor Peterson
Impact-loaded reinforced concrete beams often fail in shear, which is relevant, for instance, to the design of shelters against ballistic or fragment impact. An experimental campaign was conducted to study the different types of shear failure and their governing parameters. Eighteen reinforced concrete beams were tested with a 70 kg steel striker dropped from a height of 2.4 m. The beams were loaded at different positions relative to the support and contained different amounts of transverse reinforcement. The beams were of reduced scale, with a length of 0.80 m and a square 0.15 m × 0.15 m cross-section. The drop-weight tests were monitored with shock accelerometers on the striker and at the beam centre, load cells under the supports measuring reaction forces, and a high-speed camera (HSC). The high-speed recordings were made orthogonal to the beam surface with the aim of performing high-quality digital image correlation (DIC) analyses, and the beams and striker were painted with a speckle pattern prior to testing. Camera recordings were conducted at a 1024 × 512 px resolution and 6 kHz sampling, resulting in a time resolution of about 0.17 ms. Accelerometer and load cell measurements were sampled at 19.2 kHz. The accelerometer on the striker was used to approximate the impact force, and the beam acceleration can be used to synchronize the camera and DAQ recordings. The data may be used to calibrate finite element models, study the impact response of beams, or develop new mechanical models.
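As a minimal illustration of the impact-force approximation described above, the sketch below applies F(t) = m·a(t) using the 70 kg striker mass. It assumes a hypothetical two-column CSV (time in seconds, striker acceleration in g); the layout and units of the published files may differ.

```python
# Minimal sketch: approximate the impact force from the striker
# accelerometer as F(t) = m * a(t), with the 70 kg striker mass from
# the test campaign. File name, column layout, and units (g) are
# assumptions, not the published data format.
import numpy as np

STRIKER_MASS_KG = 70.0   # striker mass from the experimental setup
G = 9.81                 # m/s^2, converts acceleration from g to SI

# Hypothetical CSV: column 0 = time [s], column 1 = acceleration [g]
t, a_g = np.loadtxt("striker_accel.csv", delimiter=",", unpack=True)

force_n = STRIKER_MASS_KG * a_g * G          # impact force [N]
peak = int(np.argmax(np.abs(force_n)))
print(f"Peak impact force ~ {force_n[peak] / 1e3:.1f} kN "
      f"at t = {t[peak] * 1e3:.2f} ms")
```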
{"title":"Dataset of high-speed camera measurements from impact-tested reinforced concrete beams","authors":"Viktor Peterson","doi":"10.1016/j.dib.2026.112487","DOIUrl":"10.1016/j.dib.2026.112487","url":null,"abstract":"<div><div>Impact-loaded reinforced concrete beams often fail in shear. This becomes relevant for shelter design against ballistics or fragment impact, for instance. An experimental campaign was conducted to study the different types of shear failure and governing parameters. Eighteen reinforced concrete beams were tested by a 70 kg steel striker dropped from a 2.4 m height. The beams were loaded at different positions from the support with different amounts of transverse reinforcement. The beams were of reduced scale with a length of 0.80 m and a square 0.15 m × 0.15 m cross-section. The drop weight tests were monitored with shock accelerometers on the striker and beam centre, load cells under the supports measuring reaction forces, and a high-speed camera (HSC). High-speed camera measurements were recorded orthogonal to the surface with the aim of performing high-quality digital image correlation (DIC) analyses. The beams and striker were painted with a speckled pattern prior to testing for the DIC analyses. Camera recordings were conducted with a 1024 × 512 px resolution and 6 kHz sampling, resulting in a time resolution of about 0.17 ms. Accelerometer and load cell measurements were sampled at 19.2 kHz. The accelerometer on the striker was used to approximate the impact force, and beam acceleration can be used to synchronize the camera and DAQ recordings. The data may be used to calibrate finite element models, study the impact response of beams, or develop new mechanical models.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112487"},"PeriodicalIF":1.4,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
COLLECTiEF dataset: A high-resolution indoor environmental dataset from European buildings across diverse climates supporting thermal, air-quality, and visual-comfort assessments
Pub Date: 2026-01-21 | DOI: 10.1016/j.dib.2026.112486 | Data in Brief 65, Article 112486
Italo Aldo Campodonico-Avendano , Silvia Erba , Panayiotis Papadopoulos , Salvatore Carlucci , Antonio Luparelli , Amedeo Ingrosso , Greta Tresoldi , Muhammad Salman Shahid , Frederic Wurtz , Benoit Delinchant , Per Martin Leinan , Stefano Cera , Peter Riederer , Runar Solli , Amin Moazami , Mohammadreza Aghaei
Indoor Environmental Quality (IEQ) directly affects public health, productivity, and well-being, while also playing a vital role in developing climate-neutral, energy-efficient, and resilient buildings. This paper presents a comprehensive dataset of indoor environmental parameters that affect thermal comfort, indoor air quality, and visual comfort, created under the European Union’s Horizon 2020 project Collective Intelligence for Energy Flexibility (COLLECTiEF). The dataset comprises high-resolution measurements of carbon dioxide, pollutants, volatile organic compounds, air temperature, relative humidity, and horizontal-plane illuminance, collected over a two-year period at 1-minute intervals. Data were gathered from 14 pilot buildings across four European climates (Cyprus, France, Italy, and Norway), covering diverse building types such as schools, medical centres, sports arenas, residential complexes, universities, and elder care facilities, together representing about 40 % of common European building categories. Sensors were installed in specific thermal zones within each building to monitor environmental conditions. All data are organized by building and zone and supplemented with standardized Brick metadata to ensure interoperability. With its broad geographic coverage, variety of building types, long-term high-frequency measurements, and multimodal data, the dataset provides a valuable resource for comparative IEQ research, cross-domain modelling, and integrated assessments of comfort, ventilation, and daylighting across different climates and operational settings. The data are available upon request under a non-disclosure agreement provided by the consortium.
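Because the measurements ship with Brick metadata, the zone structure can be explored programmatically. The sketch below, using the rdflib library, lists air-temperature sensors per zone; the Turtle file name and the exact Brick classes used in the COLLECTiEF models are assumptions.

```python
# Minimal sketch: query accompanying Brick metadata for temperature
# sensors and their zones. File name and exact class usage in the
# COLLECTiEF models are assumptions; the Brick schema itself defines
# brick:Air_Temperature_Sensor and brick:hasLocation.
import rdflib

g = rdflib.Graph()
g.parse("building_metadata.ttl", format="turtle")  # hypothetical file

QUERY = """
PREFIX brick: <https://brickschema.org/schema/Brick#>
SELECT ?sensor ?zone WHERE {
    ?sensor a brick:Air_Temperature_Sensor ;
            brick:hasLocation ?zone .
}
"""
for sensor, zone in g.query(QUERY):
    print(f"{zone} -> {sensor}")
```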
{"title":"COLLECTiEF dataset: A high-resolution indoor environmental dataset from European buildings across diverse climates supporting thermal, air-quality, and visual-comfort assessments","authors":"Italo Aldo Campodonico-Avendano , Silvia Erba , Panayiotis Papadopoulos , Salvatore Carlucci , Antonio Luparelli , Amedeo Ingrosso , Greta Tresoldi , Muhammad Salman Shahid , Frederic Wurtz , Benoit Delinchant , Per Martin Leinan , Stefano Cera , Peter Riederer , Runar Solli , Amin Moazami , Mohammadreza Aghaei","doi":"10.1016/j.dib.2026.112486","DOIUrl":"10.1016/j.dib.2026.112486","url":null,"abstract":"<div><div>Indoor Environmental Quality directly affects public health, productivity, and well-being, while also playing a vital role in developing climate-neutral, energy-efficient, and resilient buildings. This paper presents a comprehensive dataset of indoor environmental parameters that affect thermal comfort, indoor air quality, and visual comfort, which was created under the European Union’s Horizon 2020 Project <em>Collective Intelligence for Energy Flexibility</em>. The dataset comprises high-resolution measurements of carbon dioxide, pollutants, volatile organic compounds, air temperature, relative humidity, and illuminance on a horizontal plane, collected over a two-year period at 1-minute intervals. Data were gathered from 14 pilot buildings across four European climates: Cyprus, France, Italy, and Norway, covering diverse building types such as schools, medical centres, sports arenas, residential complexes, universities, and elder care facilities, representing about 40 % of common European building categories. Sensors were installed in specific thermal zones within each building to monitor environmental conditions. All data is organized by building and zone and supplemented with standardized Brick metadata to ensure interoperability. This comprehensive dataset, with its broad geographic coverage, variety of building types, long-term high-frequency measurements, and multimodal data, provides a valuable resource for comparative IEQ research, cross-domain modelling, and integrated assessments of comfort, ventilation, and daylighting across different climates and operational settings and is available upon request under a non-disclosure agreement provided by the consortium.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112486"},"PeriodicalIF":1.4,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generated cultural heritage question–answer dataset: Durga in multi-dimensional perspectives
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112495 | Data in Brief 65, Article 112495
Tri Lathif Mardi Suryanto , Aji Prasetya Wibawa , Hariyono , Andrew Nafalski , Gulsun Kurubacak Çakır
This dataset presents a valuable compilation of question–answer (QA) pairs derived from cultural texts and sources related to Durga mythology. A total of 21,395 QA pairs were compiled, encompassing textual materials such as scriptures, ritual narratives, temple inscriptions, and traditional storytelling records. Each entry includes the source reference, question, and corresponding answer, provided in a structured format compatible with Excel for seamless integration into downstream natural language processing (NLP) tasks. Data collection involved manual curation and annotation by domain experts, followed by preprocessing steps including text normalization, duplicate removal, and verification of factual and contextual accuracy. The dataset is designed to support generative QA models, culturally aware chatbots, and digital preservation of heritage knowledge. It is particularly valuable for research in AI-driven cultural applications, educational tools, and digital humanities initiatives aiming to bridge traditional knowledge with computational methods. Researchers and practitioners may utilize the dataset for training generative models, creating interactive educational platforms, developing culturally sensitive AI agents, and supporting comparative studies in cross-cultural heritage. This openly accessible resource adheres to ethical standards, with proper attribution to source materials, and provides a foundational asset for both academic research and applied development in culturally informed artificial intelligence.
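Since the QA pairs are distributed in an Excel-compatible format, they load directly into an NLP pipeline. The sketch below turns each row into a (question, answer) training example with pandas; the file name and column headers are assumptions inferred from the described structure.

```python
# Minimal sketch: load the Excel QA pairs into training examples for a
# generative QA model. File name and column headers are assumptions
# based on the described fields (source reference, question, answer).
import pandas as pd

df = pd.read_excel("durga_qa_dataset.xlsx")  # hypothetical file name
df.columns = [c.strip().lower() for c in df.columns]

examples = [
    {"input": f"question: {row['question']}",
     "target": str(row["answer"]),
     "source": str(row.get("source", ""))}
    for _, row in df.iterrows()
]
print(f"{len(examples)} QA pairs ready for fine-tuning")
```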
{"title":"Generated cultural heritage question–answer dataset: Durga in multi-dimensional perspectives","authors":"Tri Lathif Mardi Suryanto , Aji Prasetya Wibawa , Hariyono , Andrew Nafalski , Gulsun Kurubacak Çakır","doi":"10.1016/j.dib.2026.112495","DOIUrl":"10.1016/j.dib.2026.112495","url":null,"abstract":"<div><div>This dataset presents a valuable compilation of question–answer (QA) pairs derived from cultural texts and sources related to Durga mythology. A total of 21,395 QA pairs, encompassing textual materials such as scriptures, ritual narratives, temple inscriptions, and traditional storytelling records. Each entry includes the source reference, question, and corresponding answer, provided in a structured format compatible with Excel for seamless integration into downstream natural language processing (NLP) tasks. Data collection involved manual curation and annotation by domain experts, followed by preprocessing steps including text normalization, duplication removal, and verification of factual and contextual accuracy. The dataset is designed to support generative QA models, culturally aware chatbots, and digital preservation of heritage knowledge. It is particularly valuable for research in AI-driven cultural applications, educational tools, and digital humanities initiatives aiming to bridge traditional knowledge with computational methods. Researchers and practitioners may utilize the dataset for training generative models, creating interactive educational platforms, developing culturally sensitive AI agents, and supporting comparative studies in cross-cultural heritage. This openly accessible resource adheres to ethical standards, with proper attribution to source materials, and provides a foundational asset for both academic research and applied development in culturally informed artificial intelligence.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112495"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Experimental dataset of the reverse water-gas shift reaction in a fixed-bed reactor setup under varying reactor conditions
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112485 | Data in Brief 65, Article 112485
Enzo Komatz , Marion Andritz , Christoph Markowitsch
This data article presents a dataset of miniplant-scale reverse water-gas shift (rWGS) experiments conducted in a heated fixed-bed reactor under systematically varied operating conditions. The dataset contains processed measurements including reactor temperature and molar fractions of CO₂, CO, H₂, and CH₄, as well as derived quantities such as CO₂ conversion and CO selectivity. The experiments cover a wide parameter space, including gas hourly space velocities of 8000, 14,000 and 20,000 h⁻¹, temperatures between 550 and 950 °C (in increments of 50 K), and H₂:CO₂ feed ratios of 2:1, 2.5:1 and 3:1.
The dataset presents the steady-state values and links to the reproducible data-processing steps, based on a prior study, ensuring FAIR treatment of all steps from the initial measurements to the final processed variables. The processing workflow includes calibration of gas analysis signals, smoothing, dry-gas calculation, and uncertainty estimation.
These data provide value for validating mechanistic kinetic models, benchmarking computational fluid dynamics (CFD) reactor simulations, training machine learning models including physics-informed machine learning frameworks, and supporting thermodynamic model assessments. All raw and processed data are made publicly available in a long-term repository, ensuring FAIR access and enabling reuse by the scientific community.
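For reuse, the derived quantities can be recomputed from the dry-gas molar fractions. The sketch below uses common carbon-balance definitions of CO₂ conversion and CO selectivity; the authors' exact definitions, including calibration and uncertainty handling, are given in the linked processing workflow and may differ in detail.

```python
# Minimal sketch: carbon-balance estimates of CO2 conversion and CO
# selectivity from dry outlet molar fractions. These are common
# textbook definitions; the dataset's linked workflow (calibration,
# smoothing, uncertainty estimation) is authoritative.
def co2_conversion(y_co2: float, y_co: float, y_ch4: float) -> float:
    """X_CO2 = converted carbon / total carbon in the dry outlet gas."""
    return (y_co + y_ch4) / (y_co2 + y_co + y_ch4)

def co_selectivity(y_co: float, y_ch4: float) -> float:
    """S_CO = CO formed / (CO + CH4 formed)."""
    return y_co / (y_co + y_ch4)

# Illustrative dry-gas fractions only, not values from the dataset
y_co2, y_co, y_ch4 = 0.18, 0.21, 0.01
print(f"X_CO2 = {co2_conversion(y_co2, y_co, y_ch4):.1%}")
print(f"S_CO  = {co_selectivity(y_co, y_ch4):.1%}")
```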
{"title":"Experimental dataset of the reverse water-gas shift reaction in a fixed-bed reactor setup under varying reactor conditions","authors":"Enzo Komatz , Marion Andritz , Christoph Markowitsch","doi":"10.1016/j.dib.2026.112485","DOIUrl":"10.1016/j.dib.2026.112485","url":null,"abstract":"<div><div>This data article presents a dataset of miniplant-scale reverse water-gas shift (rWGS) experiments conducted in a heated fixed-bed reactor under systematically varied operating conditions. The dataset contains processed measurements including reactor temperature, molar fractions of CO<sub>2</sub>, CO, H<sub>2</sub>, CH<sub>4</sub>, and derived quantities such as CO<sub>2</sub> conversion and CO selectivity. The experiments cover a wide parameter space, including gas hourly space velocities of 8000, 14,000 and 20,000 h<sup>-1</sup> with temperatures between 550 and 950 °C (increment of 50 K), and H<sub>2</sub>:CO<sub>2</sub> feed ratios of 2:1, 2.5:1 and 3:1.</div><div>The dataset presents the steady-state values and links to the reproductible data processing step, based on a prior study, enabling Fairness of all steps from the initial measurements to the final processed variables. The processing workflow includes calibration of gas analysis signals, smoothing, dry-gas calculation, and uncertainty estimation.</div><div>These data provide value for validating mechanistic kinetic models, benchmarking computational fluid dynamics (CFD) reactor simulations, training machine learning models including physics-informed machine learning frameworks, and supporting thermodynamic model assessments. All raw and processed data are made publicly available in a long-term repository, ensuring FAIR access and enabling reuse by the scientific community.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112485"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A 3D point cloud dataset of Jining Qing Goats for segmentation analysis and body size measurement
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112496 | Data in Brief 65, Article 112496
Kai Zhang, Qin Ma, Yichen Liu, Xiaochen Shi
The rapid advancement of intelligent livestock farming and precision breeding has underscored the importance of non-contact body measurement and weight estimation based on 3D reconstruction; these techniques represent a critical pathway for the digital transformation of animal husbandry. However, there are no publicly available 3D point cloud datasets specific to Jining Qing Goats, particularly systematic data covering key developmental stages. To bridge this gap, we present a comprehensive dataset comprising multi-view 3D point clouds and standardized morphometric records of Jining Qing Goats. The dataset spans multiple age groups and emphasizes critical phases such as the juvenile, growing, mature, and reproductive stages, thereby capturing a holistic representation of the breed’s life cycle. During data acquisition, two Microsoft Kinect DK depth cameras were positioned bilaterally to capture RGB and depth images simultaneously under relatively static conditions. Multi-view point clouds were registered using the Iterative Closest Point (ICP) algorithm, with the floor plane serving as a unified reference to align all scans within a global coordinate system. In parallel, manual measurements of six key morphometric traits (body length, withers height, shoulder width, abdominal width, heart girth, and hip width) were collected as validation references. The dataset consists of raw RGB images, depth maps, point cloud files, camera calibration parameters, and manually annotated measurement records, all of which are openly accessible. This resource supports a wide range of computer vision tasks such as livestock 3D reconstruction, automated morphometric measurement, and weight estimation, thereby facilitating digital transformation, intelligent management, and sustainable development in modern livestock farming.
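The abstract names ICP as the registration method, so a pairwise alignment of the two camera views can be sketched with Open3D as below. File names, the correspondence threshold, and the identity initialization are assumptions; the published pipeline additionally anchors all scans to the floor plane as a global reference.

```python
# Minimal sketch: pairwise ICP alignment of the two Kinect views with
# Open3D. File names, threshold, and identity initialization are
# assumptions; the dataset's pipeline also uses the floor plane as a
# unified global reference.
import numpy as np
import open3d as o3d

source = o3d.io.read_point_cloud("goat_left_view.ply")   # hypothetical
target = o3d.io.read_point_cloud("goat_right_view.ply")  # hypothetical

est = o3d.pipelines.registration.TransformationEstimationPointToPoint()
result = o3d.pipelines.registration.registration_icp(
    source, target,
    max_correspondence_distance=0.02,  # 2 cm correspondence threshold
    init=np.eye(4),                    # coarse extrinsics if available
    estimation_method=est,
)
print("ICP fitness:", result.fitness)

merged = source.transform(result.transformation) + target
o3d.io.write_point_cloud("goat_merged.ply", merged)
```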
{"title":"A 3D point cloud dataset of Jining Qing Goats for segmentation analysis and body size measurement","authors":"Kai Zhang, Qin Ma, Yichen Liu, Xiaochen Shi","doi":"10.1016/j.dib.2026.112496","DOIUrl":"10.1016/j.dib.2026.112496","url":null,"abstract":"<div><div>The rapid advancement of intelligent livestock farming and precision breeding has underscored the importance of non-contact body measurement and weight estimation. Based on 3D reconstruction, these techniques represent a critical pathway for the digital transformation of animal husbandry. However, there are no publicly available 3D point cloud datasets specific to Jining Qing Goats, particularly systematic data covering key developmental stages. To bridge this gap, we present a comprehensive dataset comprising multi-view 3D point clouds and standardized morphometric records of Jining Qing Goats. The dataset spans multiple age groups and emphasizes critical phases such as juvenile, growing, mature, and reproductive stages, thereby capturing a holistic representation of the breed’s life cycle. During data acquisition, two Microsoft Kinect DK depth cameras were positioned bilaterally to capture RGB and depth images simultaneously under relatively static conditions. Multi-view point clouds were registered using the Iterative Closest Point (ICP) algorithm, with the floor plane serving as a unified reference to align all scans within a global coordinate system. In parallel, manual measurements of six key morphometric traits, including body length, withers height, shoulder width, abdominal width, heart girth and hip width, were collected as validation references. The dataset consists of raw RGB images, depth maps, point cloud files, camera calibration parameters, and manually annotated measurement records, all of which are openly accessible. This resource supports a wide range of computer vision tasks such as livestock 3D reconstruction, automated morphometric measurement, and weight estimation, thereby facilitating digital transformation, intelligent management, and sustainable development in modern livestock farming.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112496"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146185204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comprehensive image dataset of American Sign Language hand gestures
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112492 | Data in Brief 65, Article 112492
Md. Famidul Islam Pranto, Md. Rifatul Islam, Md. Ali Akbor, Nabonita Ghosh, Md. Rahatun Alam, Sudipto Chaki, Md. Masudul Islam
We present ASL-HG, a comprehensive American Sign Language (ASL) image dataset designed to advance gesture recognition and assistive technologies. The collection contains 36,000 static images across 36 classes, covering the full English alphabet (A–Z) and digits (0–9). Data were captured from 10 volunteers in Mirpur, Dhaka, Bangladesh, with each participant contributing 100 samples per class, ensuring a balanced distribution across subjects, genders, and skin tones. Unlike many existing ASL datasets, ASL-HG explicitly distinguishes between the letter “O” and the digit “0” by including the standard two-handed ASL “zero” sign used in practical alphanumeric communication. The dataset is released in two complementary forms: raw images with natural indoor and outdoor backgrounds, and a MediaPipe-processed version with hand-segmented crops and predefined 80–20 train–test splits. This design supports both custom pre-processing and immediate model training. ASL-HG is intended to serve as a benchmark resource for developing robust and fair ASL recognition systems, reducing communication barriers for deaf and speech-impaired users, and enabling broader research in gesture-based human–computer interaction.
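The processed release was produced with MediaPipe hand segmentation; a sketch of this kind of cropping step is shown below. Paths, the margin, and the output layout are assumptions, and the released dataset already contains the finished crops.

```python
# Minimal sketch of MediaPipe-based hand cropping of the kind used for
# the processed ASL-HG release. Paths and the pixel margin are
# assumptions; the dataset already ships the finished crops.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2)

img = cv2.imread("raw/A/sample_001.jpg")  # hypothetical path
res = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

if res.multi_hand_landmarks:
    h, w = img.shape[:2]
    xs = [lm.x for hl in res.multi_hand_landmarks for lm in hl.landmark]
    ys = [lm.y for hl in res.multi_hand_landmarks for lm in hl.landmark]
    m = 20  # pixel margin around the detected hand(s)
    x0, x1 = max(int(min(xs) * w) - m, 0), min(int(max(xs) * w) + m, w)
    y0, y1 = max(int(min(ys) * h) - m, 0), min(int(max(ys) * h) + m, h)
    cv2.imwrite("processed/A/sample_001.jpg", img[y0:y1, x0:x1])
```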
{"title":"A comprehensive image dataset of American Sign Language hand gestures","authors":"Md. Famidul Islam Pranto, Md. Rifatul Islam, Md. Ali Akbor, Nabonita Ghosh, Md. Rahatun Alam, Sudipto Chaki, Md. Masudul Islam","doi":"10.1016/j.dib.2026.112492","DOIUrl":"10.1016/j.dib.2026.112492","url":null,"abstract":"<div><div>We present ASL-HG, a comprehensive American Sign Language (ASL) image dataset designed to advance gesture recognition and assistive technologies. The collection contains 36,000 static images across 36 classes, covering the full English alphabet (A–Z) and digits (0–9). Data were captured from 10 volunteers in Mirpur, Dhaka, Bangladesh, with each participant contributing 100 samples per class, ensuring a balanced distribution across subjects, genders, and skin tones. Unlike many existing ASL datasets, ASL-HG explicitly distinguishes between the letter “O” and the digit “0″ by including the standard two-handed ASL “zero” sign used in practical alphanumeric communication. The dataset is released in two complementary forms: raw images with natural indoor and outdoor backgrounds, and a MediaPipe-processed version with hand-segmented crops and predefined 80–20 train–test splits. This design supports both custom pre-processing and immediate model training. ASL-HG is intended to serve as a benchmark resource for developing robust and fair ASL recognition systems, reducing communication barriers for deaf and speech-impaired users, and enabling broader research in gesture-based human–computer interaction.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112492"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
4D (space + time) datasets of spruce wood enzymatic hydrolysis
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112489 | Data in Brief 65, Article 112489
Solmaz Hossein Khani, Maxime Corré, Khadidja Ould Amer, Noah Remy, Berangère Lebas, Anouck Habrant, Gabriel Paës, Yassin Refahi
The conversion of lignocellulosic biomass from plant cell walls into bioproducts can contribute to reducing dependence on fossil sources and achieving sustainable development. Biotechnological conversion of lignocellulosic biomass has several advantages over other approaches, such as thermochemical and chemical conversion, including improved efficiency and specificity for desired products, ecological compatibility, and reduced toxicity. Enzymatic transformation is a key step in biotechnological conversion, and achieving a cost-effective conversion requires a comprehensive understanding of cell wall enzymatic hydrolysis. Despite progress, enzymatic hydrolysis at the microscale remains comparatively understudied. Addressing this gap requires the collection of time-lapse image datasets of cell wall enzymatic hydrolysis, which is a technically demanding task. Furthermore, accurately processing the time-lapse images to identify and track individual cell walls is particularly challenging, notably because of the sample drift present in the images. Recently, an efficient image processing pipeline called AIMTrack has been developed that uses an enhanced divide-and-conquer strategy to divide time-lapse images into clusters whose sizes are dynamically adjusted to the extent of deconstruction. Image registrations are then limited to clusters, and the resulting transformations are combined to correct sample drift across the time-lapse images. AIMTrack subsequently provides segmentations of the time-lapse images in which voxels belonging to the same cell wall are labelled with a unique identifier. The datasets presented here consist of time-lapse images of spruce wood cell walls acquired during enzymatic hydrolysis with a cellulolytic enzyme cocktail at two enzyme loadings, 15 and 30 FPU/g biomass. Control time-lapse datasets, acquired under identical conditions but without addition of enzymes, are also included. Both control and hydrolysis datasets were processed with AIMTrack to track the cell walls across the time-lapse images, and the generated segmentations are also provided.
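For readers who want a feel for the drift problem the pipeline addresses, the sketch below estimates a single global shift between two consecutive volumes by phase cross-correlation. This is only a naive baseline, not AIMTrack's cluster-based divide-and-conquer registration, and the file names are hypothetical.

```python
# Naive baseline sketch: estimate a single global drift between two
# consecutive time-lapse volumes by phase cross-correlation. This is
# NOT AIMTrack's cluster-based registration, only an illustration of
# the sample-drift problem; file names are hypothetical.
import numpy as np
from scipy.ndimage import shift as nd_shift
from skimage.registration import phase_cross_correlation

t0 = np.load("spruce_t00.npy")  # reference volume (z, y, x)
t1 = np.load("spruce_t01.npy")  # next time point, drifted

drift, error, _ = phase_cross_correlation(t0, t1)
t1_aligned = nd_shift(t1, drift)  # shift t1 back onto t0's frame
print("estimated drift (z, y, x):", drift)
```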
{"title":"4D (space + time) datasets of spruce wood enzymatic hydrolysis","authors":"Solmaz Hossein Khani, Maxime Corré, Khadidja Ould Amer, Noah Remy, Berangère Lebas, Anouck Habrant, Gabriel Paës, Yassin Refahi","doi":"10.1016/j.dib.2026.112489","DOIUrl":"10.1016/j.dib.2026.112489","url":null,"abstract":"<div><div>The conversion of lignocellulosic biomass from plant cell walls into bioproducts can contribute to reducing dependence on fossil sources and achieving sustainable development. Biotechnological conversion of lignocellulosic biomass has several advantages over other conversion approaches such as thermochemical and chemical conversions. These advantages include improved efficiency and specificity for desired products, ecological compatibility and reduced toxicity. Enzymatic transformation is a key step in biotechnological conversion. To achieve a cost-effective conversion, a comprehensive understanding of cell wall enzymatic hydrolysis is required. Despite progress, the enzymatic hydrolysis at microscale is comparatively understudied and lacks comprehensive investigation. Addressing this gap requires collection of time-lapse image datasets of cell wall enzymatic hydrolysis which is a technically demanding task. Furthermore, accurate processing of the time-lapse images to identify and track individual cell walls is particularly challenging, notably because of the sample drift present in the images. Recently, an efficient image processing pipeline, called AIMTrack, has been developed which uses an enhanced divide-and-conquer strategy to divide time-lapse images into clusters whose sizes are dynamically adjusted to the deconstruction extent. The image registrations are then limited to clusters and the resulting transformations are combined to correct sample drift across time-lapse images. Subsequently AIMTrack provides segmentation of time-lapse images where voxels belonging to the same cell walls are labelled with a unique identifier. The time-lapse image datasets presented here consist of time-lapse images of spruce wood cell walls acquired during enzymatic hydrolysis using a cellulolytic enzyme cocktail at two enzyme loadings of 15 and 30 FPU/g biomass. Control time-lapse datasets which are acquired under the identical conditions, but without addition of enzymes, are also included. Both control and hydrolysis datasets are processed using AIMTrack to track the cell walls from time-lapse images. The generated segmentations are also provided.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112489"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ConcreteCARB: A comprehensive image dataset of concrete carbonation for computer vision tasks
Pub Date: 2026-01-20 | DOI: 10.1016/j.dib.2026.112493 | Data in Brief 65, Article 112493
José A. Guzmán-Torres, Sandra del C. Arguello-Hernández, Francisco J. Domínguez-Mota, Gerardo Tinoco-Guerrero, Elia M. Alonso-Guzmán
The ConcreteCARB dataset provides a comprehensive repository of 903 high-resolution images of concrete surfaces evaluated using the phenolphthalein test for carbonation detection. The data were collected under controlled laboratory conditions and aim to support artificial intelligence applications in civil engineering, especially structural health monitoring tasks. The images are systematically organized into two distinct classes, “Carbonated Samples” and “No Carbonation Presence,” enabling binary classification approaches. All samples were manually tested, split, and visually labelled by expert engineers in accordance with standardized procedures to ensure reliable ground-truth classification. The dataset includes images of concrete prism elements fabricated with varying mix designs, incorporating different water-cement ratios and additives such as industrial silica waste and natural admixtures derived from Opuntia ficus-indica. The specimens were subjected to natural atmospheric carbonation for 180 days, and their carbonation fronts were revealed by phenolphthalein staining. The samples were then split manually with a chisel and hammer, and photographic documentation was performed with a Samsung SM-S901U1 smartphone using predefined settings to ensure consistency and quality across the dataset. ConcreteCARB is intended for researchers, engineers, and data scientists working on machine learning, deep learning, and computer vision solutions for concrete diagnostics. It provides valuable training and benchmarking data for the development of automated detection, classification, and segmentation models for carbonation damage assessment, and can serve as a foundational tool for cross-comparative studies on the efficacy of AI techniques in materials degradation analysis. The openly accessible nature of the dataset through a public repository supports reproducibility and encourages the extension of AI applications in concrete durability and sustainability studies.
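As a starting point for the binary task, the sketch below trains a small transfer-learning baseline on the two class folders with Keras. The folder layout and all hyperparameters are assumptions; the dataset authors prescribe no particular model.

```python
# Minimal sketch: transfer-learning baseline for the two-class task
# ("Carbonated Samples" vs "No Carbonation Presence"). Folder layout
# and hyperparameters are assumptions, not part of the dataset.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "ConcreteCARB", image_size=(224, 224), batch_size=16,
    validation_split=0.2, subset="training", seed=42)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "ConcreteCARB", image_size=(224, 224), batch_size=16,
    validation_split=0.2, subset="validation", seed=42)

base = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg", input_shape=(224, 224, 3))
base.trainable = False                      # freeze backbone features
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # [-1, 1] range
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),     # binary output
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
```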
{"title":"ConcreteCARB: A comprehensive image dataset of concrete carbonation for computer vision tasks","authors":"José A. Guzmán-Torres, Sandra del C. Arguello-Hernández, Francisco J. Domínguez-Mota, Gerardo Tinoco-Guerrero, Elia M. Alonso-Guzmán","doi":"10.1016/j.dib.2026.112493","DOIUrl":"10.1016/j.dib.2026.112493","url":null,"abstract":"<div><div>The ConcreteCARB dataset provides a comprehensive repository of 903 high-resolution images of concrete surfaces evaluated using the phenolphthalein test for carbonation detection. This data was collected under controlled laboratory conditions and aims to support artificial intelligence applications in civil engineering, especially in structural health monitoring tasks. The images are systematically organized into two distinct classes: “Carbonated Samples” and “No Carbonation Presence,” enabling binary classification approaches. All samples were manually tested, split, and visually labelled by expert engineers to ensure reliable ground-truth classification, in accordance with standardized procedures. The dataset includes images of concrete prism elements fabricated with varying mix designs, incorporating different water-cement ratios and additives, such as industrial silica waste and natural admixtures derived from Opuntia ficus-indica. The specimens were subjected to natural atmospheric carbonation conditions for 180 days, and their carbonation fronts were revealed by phenolphthalein staining. The samples were then split manually with a chisel and hammer, and photographic documentation was performed with a Samsung SM-S901U1 smartphone using predefined settings to ensure consistency and quality across the dataset. ConcreteCARB is intended for researchers, engineers, and data scientists working on machine learning, deep learning, and computer vision solutions for concrete diagnostics. It provides valuable training and benchmarking data for the development of automated detection, classification, and segmentation models for carbonation damage assessment. Furthermore, the dataset can serve as a foundational tool for cross-comparative studies on the efficacy of AI techniques in materials degradation analysis. The openly accessible nature of the dataset through a public repository supports reproducibility and encourages the extension of AI applications in concrete durability and sustainability studies.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112493"},"PeriodicalIF":1.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dataset of in-situ meteorological measurements for urban wind energy assessment in the southern region of the Dominican Republic
Pub Date: 2026-01-19 | DOI: 10.1016/j.dib.2026.112482 | Data in Brief 65, Article 112482
Alexander Vallejo-Díaz , Idalberto Herrera-Moya , Edwin Garabitos-Lara , Héctor David Morbán-Ramírez , Adermim Jhoswar Severino-Simeón , Anders Malmquist
This dataset provides in-situ wind measurements collected between March 2024 and June 2025 from four urban locations in the southern region of the Dominican Republic: San Cristóbal, Azua, Barahona, and San Juan de la Maguana. Measurements were performed using Davis Instruments Vantage PRO2 meteorological stations, strategically installed to characterize the urban wind resource for potential microgeneration applications. The dataset includes wind speed, wind direction, air temperature, relative humidity, and atmospheric pressure. Data processing involved quality control procedures, gap filling through interpolation techniques, and subsequent analysis for wind characterization. The analysis was carried out using WRPLOT View – Wind Rose Plotting Software Version 9.2.0 for wind rose generation, HOMER Pro – Microgrid Analysis Tool, Version 3.18.4 for renewable resource assessment, and Microsoft Excel for parameterization of the Weibull distribution function. In addition, derived metrics such as theoretical wind potential and the theoretically available wind potential were calculated for each location.
This dataset can serve as a valuable resource for preliminary renewable energy feasibility studies, particularly for screening and comparative assessments of small-scale or distributed generation in urban environments. It also supports urban energy planning and the design of hybrid wind–solar systems, and it can inform computational fluid dynamics (CFD) studies of urban wind flows, for example through boundary condition definition, model calibration, or comparative analysis, rather than serving as direct validation data for highly turbulent urban wind fields. Beyond energy applications, the data may be applied to urban climate studies, including the assessment of diurnal and seasonal variability and heat island effects, and to modeling the dispersion of air pollution in complex urban settings.
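The Weibull parameterization and the theoretical wind potential can be reproduced from the wind-speed series. The sketch below fits the Weibull shape and scale with SciPy and evaluates the mean wind power density, 0.5·ρ·c³·Γ(1 + 3/k); the input file is hypothetical, and the dataset's own fits were performed in Excel.

```python
# Minimal sketch: Weibull parameterization of a wind-speed series and
# the theoretical wind power density it implies. The input file is
# hypothetical; the dataset's own fits were done in Microsoft Excel.
import numpy as np
from scipy.stats import weibull_min
from scipy.special import gamma

v = np.loadtxt("san_cristobal_wind_speed.csv")  # m/s, hypothetical
v = v[v > 0]                                    # drop calms for the fit

k, _, c = weibull_min.fit(v, floc=0)            # shape k, scale c
rho = 1.225                                     # air density [kg/m^3]
# Mean of v^3 under Weibull(k, c) is c^3 * Gamma(1 + 3/k)
power_density = 0.5 * rho * c**3 * gamma(1 + 3 / k)   # [W/m^2]
print(f"k = {k:.2f}, c = {c:.2f} m/s, P = {power_density:.1f} W/m^2")
```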
{"title":"Dataset of in-situ meteorological measurements for urban wind energy assessment in the southern region of the Dominican Republic","authors":"Alexander Vallejo-Díaz , Idalberto Herrera-Moya , Edwin Garabitos-Lara , Héctor David Morbán-Ramírez , Adermim Jhoswar Severino-Simeón , Anders Malmquist","doi":"10.1016/j.dib.2026.112482","DOIUrl":"10.1016/j.dib.2026.112482","url":null,"abstract":"<div><div>This dataset provides in-situ wind measurements collected between March 2024 and June 2025 from four urban locations in the southern region of the Dominican Republic: San Cristóbal, Azua, Barahona, and San Juan de la Maguana. Measurements were performed using Davis Instruments Vantage PRO2 meteorological stations, strategically installed to characterize the urban wind resource for potential microgeneration applications. The dataset includes wind speed, wind direction, air temperature, relative humidity, and atmospheric pressure. Data processing involved quality control procedures, gap filling through interpolation techniques, and subsequent analysis for wind characterization. The analysis was carried out using WRPLOT View – Wind Rose Plotting Software Version 9.2.0 for wind rose generation, HOMER Pro – Microgrid Analysis Tool, Version 3.18.4 for renewable resource assessment, and Microsoft Excel for parameterization of the Weibull distribution function. In addition, derived metrics such as theoretical wind potential and the theoretically available wind potential were calculated for each location.</div><div>This dataset can serve as a valuable resource for preliminary renewable energy feasibility studies, particularly for screening and comparative assessments of small-scale or distributed generation in urban environments. It also supports urban energy planning and can be used to inform computational fluid dynamics (CFD) studies of urban wind flows, for example through boundary condition definition, model calibration, or comparative analysis, rather than direct validation of highly turbulent urban wind fields, and the design of hybrid wind–solar systems. Beyond energy applications, the data may be applied to urban climate studies, including the assessment of diurnal and seasonal variability and heat island effects, and to modeling the dispersion of air pollution in complex urban settings.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112482"},"PeriodicalIF":1.4,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards sustainable management of Xylella fastidiosa vectors: An annotated image dataset for automated in-field detection of Aphrophoridae foam
Pub Date: 2026-01-19 | DOI: 10.1016/j.dib.2026.112477 | Data in Brief 65, Article 112477
Michele Elia, Angelo Cardellicchio, Michele Paradiso, Giuseppe Veronico, Arianna Rana, Antonio Petitti, Vito Renò, Simone Pascuzzi, Annalisa Milella
Insects feeding on xylem sap, such as adult Aphrophoridae spittlebugs, are vectors of the plant-pathogenic, xylem-limited bacterium Xylella fastidiosa (Xf), a causal agent of a number of severe diseases, including Olive Quick Decline Syndrome (OQDS), which has decimated olive trees in the Mediterranean region. The Aphrophoridae life cycle features a vulnerable juvenile stage, in which the insects live solitarily on stems covered in a self-produced foamy fluid (froth) that protects them from dehydration and temperature stress. Juvenile vectors are therefore ideal targets for control interventions aimed at reducing transmission by adults. This paper presents, to the best of our knowledge, the first image dataset framing spittlebug froth samples in the field for the purpose of automated Aphrophoridae nymph identification. Images were captured using several devices, including a consumer-grade RGB-D sensor, a digital reflex camera, and a smartphone camera. The dataset comprises 365 colour images focusing on spittlebug foam. Of these, 211 were captured in April 2024 during a two-day campaign and manually annotated with semantic labels, yielding PNG binary masks that precisely distinguish spittlebug foam pixels from the background. To further enhance usability, labels are also provided in YOLO (You Only Look Once) format as text files, for both segmentation and object detection. The remaining 154 images were collected during a separate two-day campaign in 2025; they are unannotated and intended for further testing purposes. Overall, the dataset enables the development of both semantic segmentation models and object detectors for automated froth detection in natural images, thus facilitating the early identification of potentially harmful insects in sustainable pest management and control systems.
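The YOLO segmentation labels pair naturally with the PNG masks; the sketch below shows how a binary mask converts to YOLO polygon lines (class id followed by normalized coordinates). The file names and class id 0 for foam are assumptions, and the dataset already ships ready-made labels.

```python
# Minimal sketch: convert a binary foam mask (PNG) into YOLO
# segmentation label lines. The dataset already provides YOLO labels;
# this only illustrates the format. File names and class id 0 for
# "foam" are assumptions.
import cv2

mask = cv2.imread("masks/sample_001.png", cv2.IMREAD_GRAYSCALE)
h, w = mask.shape
contours, _ = cv2.findContours(
    mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

with open("labels/sample_001.txt", "w") as f:
    for cnt in contours:
        if len(cnt) < 3:                 # need a valid polygon
            continue
        coords = (cnt.reshape(-1, 2) / [w, h]).flatten()
        f.write("0 " + " ".join(f"{c:.6f}" for c in coords) + "\n")
```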
{"title":"Towards sustainable management of Xylella fastidiosa vectors: An annotated image dataset for automated in-field detection of Aphrophoridae foam","authors":"Michele Elia , Angelo Cardellicchio , Michele Paradiso , Giuseppe Veronico , Arianna Rana , Antonio Petitti , Vito Renò , Simone Pascuzzi , Annalisa Milella","doi":"10.1016/j.dib.2026.112477","DOIUrl":"10.1016/j.dib.2026.112477","url":null,"abstract":"<div><div>Insects feeding on xylem sap, such as adult <em>Aphrophoridae</em> spittlebugs, are vectors of the plant pathogenic xylem-limited bacterium <em>Xylella fastidiosa (Xf)</em>, a causal agent of a number of severe diseases, including the Olive Quick Decline Syndrome (OQDS), which has decimated olive trees in the Mediterranean region. The <em>Aphrophoridae</em> life cycle and behaviour feature a weak stage, known as the juvenile stage, in which the insects live solitary on stems covered in a self-produced foamy fluid (froth) that protects them from dehydration and temperature stress. Juvenile vectors are ideal targets for a control intervention aimed at reducing transmission by adults. This paper presents the first, to the best of our knowledge, image dataset framing spittlebug froth samples in the field for the purpose of automated <em>Aphrophoridae</em> nymph identification. Images were captured using different devices including a consumer-grade RGB-D sensor, a digital reflex camera, and a smartphone camera. The dataset comprises 365 colour images, focusing on spittlebug foam. 211 of these images were captured in April 2024 during a two-day campaign. For these 211 images, a manual semantic annotation was performed, generating PNG binary masks that precisely distinguish spittlebug foam pixels from the background. To further enhance usability, labels are also provided in YOLO (You Only Look Once) format as text files, both for segmentation and object detection. The remaining 154 images were collected during a separate two-day campaign in 2025. These images are unannotated and are intended for further testing purposes. Overall, the dataset enables the development of both semantic segmentation models and object detectors for automated froth detection in natural images, thus facilitating the early identification of potentially harmful insects in sustainable pest management and control systems.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65 ","pages":"Article 112477"},"PeriodicalIF":1.4,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146075160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}