
Data in Brief: Latest Articles

Dataset of microscale atmospheric flow and pollutant concentration large-eddy simulations for varying mesoscale meteorological forcing in an idealized urban environment
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2025.111285
Eliott Lumet, Thomas Jaravel, Mélanie C. Rochoux
By 2050, two-thirds of the world's population will live in urban areas under a changing climate, exacerbating the environmental and public health risks associated with poor air quality and urban heat island effects. Assessing these risks requires microscale meteorological models that quickly and accurately predict wind velocity and pollutant concentration at high resolution, because the heterogeneity of urban environments leads to complex wind patterns and strong pollutant concentration gradients. Computational Fluid Dynamics (CFD) has emerged as a powerful tool to address this challenge by providing obstacle-resolved flow and dispersion predictions. However, CFD models are very expensive and require intensive computing resources, which can hinder their systematic use in practical engineering applications. They are also subject to significant uncertainties, particularly those arising from the mesoscale meteorological forcing and the internal variability of the atmospheric boundary layer, some of which are aleatory and therefore irreducible. Given these issues, constructing CFD datasets that account for uncertainty would be a valuable avenue of research for microscale atmospheric science.
In this context, we present the PPMLES (Perturbed-Parameter ensemble of MUST Large-Eddy Simulations) dataset, which consists of 200 large-eddy simulations (LES) characterizing the complex interactions between the turbulent airflow, the tracer dispersion, and an idealized urban environment. These simulations reproduce the canonical MUST dispersion field campaign while perturbing the model's mesoscale meteorological forcing parameters. PPMLES includes time series at human height within the built environment to track wind velocity and pollutant release and dispersion over time. PPMLES also includes complete 3-D fields of first- and second-order temporal statistics of the wind velocity and pollutant concentration, with sub-meter resolution. The uncertainty of the fields induced by the internal variability of the atmospheric boundary layer is also provided. Computing PPMLES required significant resources: 6 million CPU core hours, equivalent to the emission of approximately 10 tCO2eq of greenhouse gases. This computational effort and the associated carbon footprint motivate the sharing of the generated data.
The added value of the PPMLES dataset is twofold. First, the perturbed-parameter ensemble of LES makes it possible to quantify and understand the effects of the mesoscale meteorological forcing and the internal variability of the atmospheric boundary layer, which have been identified as a major challenge in predicting atmospheric flow and pollutant dispersion in urban environments. Second, PPMLES reference data can be used to benchmark models of different levels of complexity, and to extract key information about the physical processes involved to inform more operational modeling approaches,
Data in Brief, vol. 58, Article 111285. Published 2025-02-01.
Citations: 0
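As a rough illustration of what "first- and second-order temporal statistics" means for the kind of wind velocity and concentration time series described in the PPMLES abstract, the sketch below computes time means, variances, and the covariance of two equally sampled series. The function name, variable names, and sample values are made up for illustration; they are not taken from the dataset or its tooling.

```python
from statistics import fmean


def temporal_stats(u, c):
    """Return time means, variances and covariance of two equally sampled series."""
    if len(u) != len(c) or not u:
        raise ValueError("series must be non-empty and of equal length")
    u_mean, c_mean = fmean(u), fmean(c)
    # Fluctuations around the time mean.
    u_fluct = [x - u_mean for x in u]
    c_fluct = [x - c_mean for x in c]
    return {
        "u_mean": u_mean,                                              # first order
        "c_mean": c_mean,                                              # first order
        "u_var": fmean([f * f for f in u_fluct]),                      # second order
        "c_var": fmean([f * f for f in c_fluct]),                      # second order
        "uc_cov": fmean([a * b for a, b in zip(u_fluct, c_fluct)]),    # second order
    }


# Hypothetical samples: a velocity component and a tracer concentration.
stats = temporal_stats([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

The variances here are population variances (divided by the number of samples); whether the dataset uses that normalization is not stated in the abstract.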
Annotated image dataset with different stages of European pear rust for UAV-based automated symptom detection in orchards
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2025.111271
Virginia Maß, Pendar Alirezazadeh, Johannes Seidl-Schulz, Matthias Leipnitz, Eric Fritzsche, Rasheed Ali Adam Ibraheem, Martin Geyer, Michael Pflanz, Stefanie Reim
Evaluating fruit genetic resources for resistance to pathogens is an essential basis for subsequent selection in fruit breeding. Both genetic analysis and phenotyping of defined traits are important tools that provide decision data in the evaluation process. However, plant phenotyping is often carried out by hand and remains the bottleneck in fruit breeding and fruit growing. A digital, UAV (unmanned aerial vehicle)-based phenotyping method for assessing genotype-specific susceptibility or resistance to diseases in orchards would significantly increase the efficiency of plant breeding. Within this framework, a workflow for drone-based monitoring of pathogens in orchards was developed using European pear rust (Gymnosporangium sabinae) as a model pathogen. Pear rust is widespread in orchards and causes conspicuous, clearly visible, yellow to orange disease symptoms.
In this paper, we provide a dataset of expert-annotated high-resolution RGB images showing pear rust symptoms. For data collection, ten UAV flight campaigns were carried out between 2021 and 2023 under various weather conditions and with different flight parameters in the experimental orchard of the Julius Kühn-Institute for Breeding Research on Fruit Crops in Dresden-Pillnitz (Germany). A total of 1394 images were captured of different pear genotypes, including varieties, wild species, and progeny from breeding. The dataset contains manually labelled images of 768 × 768 pixels showing leaves infected with pear rust at different stages of development, labelled as class GYMNSA, as well as background images without symptoms. Each leaf with pear rust symptoms was annotated with a bounding box drawn by two points using the Computer Vision Annotation Tool (CVAT, v1.1.0) [1] and exported in YOLO 1.1 file format (.txt files). The GYMNSA dataset includes a total of 584 annotated images and 162 background images, organized into a training set and a validation set. It can be used as a resource by researchers and developers working on drone-based plant disease monitoring systems.
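For readers unfamiliar with YOLO-style label files, the sketch below shows one way such annotations are commonly read: each line holds a class index followed by the box center, width, and height, all normalized to the image size. The function name and the sample label line are hypothetical, and the assumption that background images simply have no label lines is ours, not something the abstract states.

```python
def yolo_to_pixel_boxes(label_text, img_w=768, img_h=768):
    """Convert YOLO label lines to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    boxes = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        x_center, y_center, width, height = (float(v) for v in parts[1:])
        # Normalized center/size -> pixel corner coordinates.
        x_min = (x_center - width / 2) * img_w
        y_min = (y_center - height / 2) * img_h
        x_max = (x_center + width / 2) * img_w
        y_max = (y_center + height / 2) * img_h
        boxes.append((class_id, x_min, y_min, x_max, y_max))
    return boxes


# Hypothetical label line: class 0, box centered in a 768 x 768 image.
boxes = yolo_to_pixel_boxes("0 0.5 0.5 0.25 0.5\n")
```

An empty label string yields an empty list, which is how this sketch would treat a symptom-free background image.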
Data in Brief, vol. 58, Article 111271. Published 2025-02-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783052/pdf/
Citations: 0
Sediment flow connectivity index data for the Apulia region (Italy): An open-source geodatabase and the innovative CONNECTOSED WebGIS platform
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111210
Alok Kushabaha, Domenico Capolongo, Giovanni Scicchitano, Floriana Rizzo, Marina Zingaro
An open-source geodatabase and its associated WebGIS platform (CONNECTOSED) were developed to collect and use data for the Sediment Flow Connectivity Index (SfCI) for the Apulia region of southern Italy. Maps depicting sediment mobility and connectivity across the hydrographic basins of the Apulia region were generated and stored in the geodatabase. The geodatabase is organized into folders containing data in TIFF, shapefile, JPEG, and PDF formats, including input variables (digital elevation model, land cover map, rainfall map, and soil units dataset for each hydrographic basin), classification graphs (ranking of variable values), dimensionless index maps (slope, ruggedness, rainfall, land cover, and soil stability), and key products (maps of sediment mobility, SfCI, and applied SfCI). The geodatabase preserves the mapping methodology underlying the SfCI algorithm by integrating various Earth datasets from multiple sources through ArcMap™, QGIS®, and Matlab® software. This approach aligns surface characteristics with driving forces to describe the spatial variability of sediment pathways and identify hotspot areas. The availability of both input and processed data enables the computation and continuous updating of this applied geomorphological indicator, which is useful for assessing susceptibility to rapid Earth surface changes related to multi-hazard exposure. The geodatabase and the CONNECTOSED platform are essential tools for researchers, policymakers, and stakeholders involved in land monitoring and environmental management. These tools provide open access to extensive datasets and detailed descriptions of surface dynamics, establishing connections between the causes and effects of extreme phenomena such as floods, landslides, fires, and soil pollution. This integration allows users to combine various forms of environmental data, a capability that is vital for enhancing scientific knowledge, supporting the development of insights, and fostering more informed, evidence-based decision-making in land use planning, conservation efforts, and sustainability initiatives.
Data in Brief, vol. 58, Article 111210. Published 2025-02-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11730577/pdf/
Citations: 0
Using microtremor data to obtain dynamic properties of soils in the Veracruz-Boca del Rio metropolitan area
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111196
José E. Barradas-Hernández, Sergio Márquez-Domínguez, Franco Antonio Carpio-Santamaria, Alejandro Vargas-Colorado, Abigail Zamora-Hernández, Roberto Rivera-Baizabal
The data presented here result from microtremor measurements at 44 points across three soil types, classified by their fundamental vibration frequencies, in the metropolitan area of Veracruz-Boca del Río, Mexico. The raw data were obtained with a GÜRALP 6TD broadband orthogonal triaxial seismometer with an integrated 24-bit digitizer, using a minimum recording time of 30 min and a recording rate of 100 samples per second (sps). The microtremor records were used to construct H/V spectral ratios using the method of Nakamura. These H/V spectral ratios are a good approximation of the transfer function between the vibration waves in the sediment and those in the rigid stratum. They can therefore be used to construct seismic microzonation maps, seismic intensity maps, and spectra for designing earthquake-resistant structures. One-dimensional stratigraphic soil models were obtained by processing the H/V spectral ratios. The relevant data from these models are layer thickness, primary wave velocity (Vp), secondary wave velocity (Vs), and density. These models are a mathematical approximation of the soil structure that can be used to classify the soil dynamically according to Mexican technical codes.
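To make the H/V idea concrete, the sketch below computes a bare-bones horizontal-to-vertical spectral ratio from three-component records. It is only an illustration under stated assumptions: the two horizontal spectra are combined by their quadratic mean (one common convention, not necessarily the one the authors used), no windowing or spectral smoothing is applied, and a naive DFT is used so the example needs nothing beyond the standard library. The function names and the synthetic signal are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Naive DFT amplitudes for bins 0..n/2 (O(n^2); fine for a short demo only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def hv_ratio(north, east, vertical, sample_rate=100.0):
    """Return (frequency_hz, H/V) pairs, skipping the zero-frequency bin."""
    n = len(vertical)
    a_n, a_e, a_z = (amplitude_spectrum(s) for s in (north, east, vertical))
    ratios = []
    for k in range(1, n // 2 + 1):
        if a_z[k] == 0:
            continue  # avoid division by zero
        # Assumption: quadratic mean of the two horizontal components.
        horizontal = math.sqrt((a_n[k] ** 2 + a_e[k] ** 2) / 2)
        ratios.append((k * sample_rate / n, horizontal / a_z[k]))
    return ratios


# Synthetic 0.5 s record at 100 sps: a 10 Hz tone, three times stronger horizontally.
z = [math.sin(2 * math.pi * 10 * t / 100) for t in range(50)]
h = [3 * v for v in z]
ratios = dict(hv_ratio(h, h, z))
```

For this pure tone only the 10 Hz bin is meaningful; the other bins divide near-zero numerical noise by near-zero noise. Real microtremor records are broadband, and practical workflows typically average many smoothed windows before reading off the peak frequency.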
Data in Brief, vol. 58, Article 111196. Published 2025-02-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11698973/pdf/
Citations: 0
BidCorpus: A multifaceted learning dataset for public procurement
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111202
Weslley Lima, Victor Silva, Jasson Silva, Ricardo Lira, Anselmo Paiva
Digital transformation has significantly impacted public procurement, improving operational efficiency, transparency, and competition. It has enabled the automation of data analysis and oversight in public administration. Public procurement involves various stages and generates a multitude of documents. However, experts still analyze these unstructured textual documents manually, which is time-consuming and inefficient. To address this issue, we introduce BidCorpus, a novel and comprehensive dataset consisting of thousands of documents related to public procurement, specifically bidding notices from Brazilian public websites. The dataset was labeled using weak supervision techniques, manual labeling, and BERT-based language models. Models trained on these annotated data showed promising results, with metrics above 80% in various experiments. The models also tolerated intentional changes made to bidding notices to evade fraud detection. All the resources from this work are publicly available, including the documents, the pre-processing scripts, and the training and evaluation of the models. We expect the dataset and its labels to be of great value to researchers working on public procurement problems.
Data in Brief, vol. 58, Article 111202. Published 2025-02-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11715116/pdf/
Citations: 0
Dataset of video game-based assessments in digital culture courses at Indoamerica University
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111217
Miguel Cobos
This dataset contains evaluation results from video game-based assessments administered to first-level university students across six different academic programs at Universidad Indoamérica from October 2022 to August 2024. The data were collected using an adapted version of Pacman through the ClassTools.net platform, where traditional quiz questions were integrated into gameplay mechanics. The dataset comprises 1418 assessment attempts from students in Law, Medicine, Psychology, Clinical Psychology, Architecture, and Nursing programs, documenting their performance in digital culture and computing courses. Each record includes attempt number, timestamp, student identifier, gender, academic period, section, career program, and score achieved. The dataset enables analysis of student performance patterns, learning progression through multiple attempts, and comparative studies across different academic programs and periods. This information can support research in educational gamification, assessment design, and digital learning strategies in higher education.
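The kind of learning-progression analysis mentioned above could start from something like the sketch below, which groups attempts by student and compares the first score with the best one. The column names (`attempt`, `student_id`, `score`) and the sample rows are hypothetical; the actual file layout and headers of the dataset may differ.

```python
import csv
import io
from collections import defaultdict


def score_progression(csv_text):
    """Map each student to (first_score, best_score, number_of_attempts)."""
    attempts = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        attempts[row["student_id"]].append((int(row["attempt"]), float(row["score"])))
    summary = {}
    for student, rows in attempts.items():
        rows.sort()  # order by attempt number
        scores = [score for _, score in rows]
        summary[student] = (scores[0], max(scores), len(scores))
    return summary


# Hypothetical rows; real records also carry timestamp, gender, period, section, program.
sample = "attempt,student_id,score\n2,s1,80\n1,s1,60\n1,s2,90\n"
progress = score_progression(sample)
```

The same grouping could be extended by academic program or period to support the comparative studies the abstract describes.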
Data in Brief, vol. 58, Article 111217. Published 2025-02-01. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11719281/pdf/
Citations: 0
MEVDT: Multi-modal event-based vehicle detection and tracking dataset
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111205
Zaid A. El Shair, Samir A. Rawashdeh
In this data article, we introduce the Multi-Modal Event-based Vehicle Detection and Tracking (MEVDT) dataset. This dataset provides a synchronized stream of event data and grayscale images of traffic scenes, captured using the Dynamic and Active-Pixel Vision Sensor (DAVIS) 240c hybrid event-based camera. MEVDT comprises 63 multi-modal sequences with approximately 13k images, 5M events, 10k object labels, and 85 unique object tracking trajectories. Additionally, MEVDT includes manually annotated ground truth labels (object classifications, pixel-precise bounding boxes, and unique object IDs), provided at a labeling frequency of 24 Hz. Designed to advance research in event-based vision, MEVDT addresses the critical need for high-quality, real-world annotated datasets that enable the development and evaluation of object detection and tracking algorithms in automotive environments.
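One simple way to relate an asynchronous event stream to labels provided at 24 Hz is to bin events into consecutive windows of 1/24 s, as sketched below. This is an illustration only: it assumes event timestamps in microseconds measured on a shared clock, and the function name and sample timestamps are hypothetical rather than part of the MEVDT tooling.

```python
def events_per_label_window(event_times_us, label_rate_hz=24.0):
    """Count events in consecutive windows of 1/label_rate_hz seconds."""
    counts = {}
    if not event_times_us:
        return counts
    window_us = 1_000_000 / label_rate_hz  # about 41,667 microseconds at 24 Hz
    t0 = min(event_times_us)  # windows start at the earliest event
    for t in event_times_us:
        index = int((t - t0) // window_us)
        counts[index] = counts.get(index, 0) + 1
    return counts


# Hypothetical event timestamps in microseconds.
counts = events_per_label_window([0, 10_000, 41_000, 42_000, 90_000])
```

Each window index can then be paired with the label set for the corresponding 24 Hz annotation step, for example to build per-label event frames.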
Citations: 0
A comprehensive image dataset for the identification of lemon leaf diseases and computer vision applications
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111244
A K M Fazlul Kobir Siam, Prayma Bishshash, Md. Asraful Sharker Nirob, Sajib Bin Mamun, Md Assaduzzaman, Sheak Rashed Haider Noori
A comprehensive dataset on lemon leaf diseases can support agricultural research and improve disease management strategies. This dataset was developed from 1354 raw images taken under the guidance of professional agricultural specialists from July to September 2024 in Charpolisha, Jamalpur, and was further expanded with augmentation techniques, adding 9000 images. The augmentation process applies a set of techniques (flipping, rotation, zooming, shifting, adding noise, shearing, and brightening) to increase the variety of lemon leaf condition representations. Each image was standardized to a resolution of 800 × 800 pixels to maintain consistency across the dataset. All images were labelled with one of nine predefined categories: anthracnose, bacterial blight, citrus canker, curl virus, deficiency leaf, dry leaf, healthy leaf, sooty mould, and spider mites. In the present study, a DenseNet-121 architecture was used, with 20 % of the dataset kept for validation and the remaining 80 % for training. The model was trained for 30 epochs with a batch size of 32, achieving an accuracy of 98.56 % with augmentation and 96.19 % without it. The dataset can serve as a benchmark for developing accurate machine learning models for early disease detection, and can also contribute to sustainable lemon cultivation practices by facilitating timely and effective disease management interventions.
Citations: 0
AzSLD: Azerbaijani sign language dataset for fingerspelling, word, and sentence translation with baseline software
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111230
Nigar Alishzade, Jamaladdin Hasanov
Advancements in sign language processing technology hinge on the availability of extensive, reliable datasets, comprehensive instructions, and adherence to ethical guidelines. To facilitate progress in gesture recognition and translation systems and to support the Azerbaijani sign language community, we present the Azerbaijani Sign Language Dataset (AzSLD). This comprehensive dataset was collected from a diverse group of sign language users, encompassing a range of linguistic parameters. Developed within the framework of a vision-based Azerbaijani Sign Language translation project, AzSLD includes recordings of the fingerspelling alphabet, individual words, and sentences. The data acquisition process involved recording signers across various age groups, genders, and proficiency levels to ensure broad representation. Sign language sentences were captured using two cameras from different angles, providing comprehensive visual coverage of each gesture. This approach enables robust training and evaluation of gesture recognition algorithms. The dataset comprises 30,000 meticulously annotated videos, each labeled with precise gesture identifiers and corresponding linguistic translations. To facilitate efficient use of the dataset, we provide technical instructions and source code for a data loader. Researchers and developers working on sign language recognition, translation, and synthesis systems will find AzSLD invaluable, as it offers a rich repository of labeled data for training and evaluation purposes.
Citations: 0
Soil data from the Barbastro-Balaguer gypsum belt, NE Spain
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111236
Juan Herrero, María Tierra, Carmen Castañeda
The dataset [1] hosts pedological information and images of the lands, locally known as chesas, of the outcropping gypsiferous core of the Barbastro-Balaguer anticline (Fig. 1). It stands out in the landscape for its linear reliefs, due to outcrops of dipping strata with differential resistance to erosion, and also for its whitish color (Fig. 2) and gypsophilous vegetation. This gypsum outcrop was named a gypseous belt in the 19th century [2], and has been further studied by other geologists [3,4] and by civil engineers, e.g. Hué and Llamas [5]. Traditionally, chesas were rangeland, with sparse almond and olive trees and rainfed winter cereals confined to the flat, often terraced, valley bottoms, known as vales in NE Spain. The chesas have attracted the attention of botanists [6], [7], [8], foresters [9,10], and researchers of soil hydrophysical properties [11]. Moreover, public interest is increasing as the administrations establish rules for nature protection in the gypseous lands; e.g., a demarcation of 137 km² set within the chesas was declared a Special Conservation Area, “ES2410074 Yesos de Barbastro”, and then protected by the Habitats Directive of the European Union. Also, plant physiologists are focusing on the adaptations of plants to gypsum, as reviewed by Escudero et al. [12]. No soil map is available, but according to [13,14] the Gypsic Haploxerepts [15] are dominant. In the absence of a soil map, our dataset can inform decisions by the authorities, such as water allocation to irrigated estates both in operation and planned, or authorizations for the spreading of pig slurry.
The soil data presented here were collected with the classical techniques of pedological prospection. The dataset [1] contains scans in .TIFF format of 150 whole thin sections of the soils, under both plane polarized light (PPL) and cross polarized light (XPL). Moreover, the dataset directs to a freely downloadable book [16] with the corresponding pedological descriptions, chemical and physical analyses, hydrophysical data, and scanning electron microscope images of the soils, plus micrographs of relevant pedofeatures of thin sections seen under a petrographic microscope. The dataset [1] also presents a .xlsx file with an English translation of all figure captions of [16], including those of micrographs, and two more .xlsx files with the analytical data. All the data can be directly reused by naturalists, engineers, technicians, and civil servants in charge of making and enforcing environmental law, as well as by people involved in citizen science activities. The thin sections are kept at EEAD and can be inspected at our premises on request.
Citations: 0