Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2025.111274
Giulia Forestieri , Francisco Tomatis , Daniel Jato-Espino , Monica Pena Acosta
This dataset contains air and surface temperature measurements taken twice daily, at 11:00 and 23:00 GMT+2, from 24th June to 5th July 2024 (a total of 10 days) in the historical centre of Málaga, Spain. It includes detailed thermal readings from various street materials and the facades of historical buildings, offering insights into the thermal properties and responses of these elements at different times of day. The dataset documents localized temperature variations within the historical centre, as influenced by different materials and architectural styles. It can be used to model microclimate variations, evaluate the thermal behavior of both historical and contemporary materials, and inform urban planning and heritage conservation efforts.
Title: Thermal data from Málaga's historical centre: Surface and air temperature measurements captured via mobile station and thermal imaging. Journal: Data in Brief, vol. 58, Article 111274. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783051/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2025.111277
Adrien Deschamps , Lucas Potin
Public procurement can be defined as the process by which public contracting authorities purchase goods, services, and works from private suppliers. To ensure transparency and prevent favoritism and corruption, public contracts adhere to strict procedures based on calls for tenders and award notices, which are publicly accessible online. This paper introduces a tabular dataset derived from the processing and consolidation of online public procurement notices. It provides a detailed overview of public contracts awarded in France between 2015 and 2023. The dataset encompasses approximately one million contractual relationships between public authorities and companies, in over 300,000 contracts, spanning all sectors and public institutions. It includes 113 variables covering contract characteristics (procedure, subject matter, award criteria, clauses, etc.), award outcomes (award price, number of bids, etc.), as well as information on contracting authorities (type, location, main activity, etc.) and awarded firms (size, legal status, main activity, age, location, etc.). This dataset, unprecedented in both accuracy and scope, provides reliable and detailed information on every advertised contract in France for nearly a decade, making it valuable for empirical research in diverse domains such as economics, geography, law, and political science.
Title: Processing and consolidation of open data on public procurement in France (2015–2023). Journal: Data in Brief, vol. 58, Article 111277.
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111204
Amr Seifelnasr , Chen Sun , Peng Ding , Xiuhua April Si , Jinxiang Xi
This dataset comprises a comprehensive collection of videos and images illustrating the fluid dynamics of swallowing and aspiration in a patient-specific pharyngolaryngeal model with varying epiglottis angles. The data also includes the physical properties of the fluids used, comprising dynamic viscosity, surface tension, and contact angle. Videos under varying swallowing conditions were collected to investigate the mechanisms underlying aspiration. The study utilized a biomechanical swallowing model developed using transparent casts of an anatomically accurate pharyngolaryngeal structure. Fluorescent dye was used to visualize the liquid flow dynamics from both side and back views. The dataset includes videos for two types of liquids, water and a 1% w/v methylcellulose aqueous solution, evaluated under two dispensing speeds (fast and slow) and two dispensing locations (anterior and posterior) across four epiglottis angles (30° up-tilt, 0° horizontal, 45° down-tilt, and 80° down-tilt). Additionally, the dataset includes photos of the pharyngolaryngeal model setup, photos of the epiglottis models used, and STL files for both the pharyngolaryngeal model and the epiglottis 3D models.
The videos document the distinct flow patterns and frequent aspiration sites identified during the experiments, including the interarytenoid notch, the cuneiform tubercular recess, and the vallecula. These data are valuable for researchers aiming to understand the etiology of dysphagia and can be reused to validate computational models, guide future experimental designs, and inform clinical diagnostics and treatment strategies. The dataset is organized into folders based on the epiglottis angles, dispensing speeds, and locations, as well as liquid types. This organization facilitates easy access and analysis for researchers in biomedical engineering, clinical research, and computational biology. The data provide a rich resource for further investigation into swallowing mechanics and the development of etiology-based interventions for dysphagia management.
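The experimental design described above (two liquids, two dispensing speeds, two dispensing locations, four epiglottis angles) can be enumerated as a full factorial grid. The following is a minimal sketch, assuming every combination was recorded, which the abstract does not state explicitly; the condition labels are hypothetical and do not reflect the dataset's actual folder names.

```python
from itertools import product

# Hypothetical labels for the factors named in the abstract.
LIQUIDS = ["water", "methylcellulose_1pct"]
SPEEDS = ["fast", "slow"]
LOCATIONS = ["anterior", "posterior"]
ANGLES = ["30deg_up", "0deg_horizontal", "45deg_down", "80deg_down"]


def enumerate_conditions():
    """Return one dict per combination of angle, speed, location, and liquid."""
    return [
        {"angle": a, "speed": s, "location": loc, "liquid": liq}
        for a, s, loc, liq in product(ANGLES, SPEEDS, LOCATIONS, LIQUIDS)
    ]


conditions = enumerate_conditions()
print(len(conditions))  # 4 angles x 2 speeds x 2 locations x 2 liquids
```

A grid like this is useful as a checklist when validating a computational model against the videos: each simulated case can be matched to one entry.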
Title: Data on hydrodynamic flow and aspiration mechanisms in a patient-specific pharyngolaryngeal model with variable epiglottis angles. Journal: Data in Brief, vol. 58, Article 111204. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11699481/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111214
Muhammad Noman Riaz , Amir Hamza , Hamid Jabbar , Manzar Abbas , Mohsin Islam Tiwana
The dataset includes vibration signatures from both healthy and faulty main journal bearings of an internal combustion engine, captured with a tri-axial accelerometer mounted on the bearings' housing. The engine was exposed to various climatic and operating conditions, including variations in temperature and humidity, and tested at different engine rotation speeds, following the MIL-STD-810G standard. The data were collected in a climatic and vibration chamber to simulate real-world environmental conditions. The dataset comprises more than 500 files, each recorded under a range of climatic and operational conditions, with root mean square (RMS) values included. This dataset provides a valuable and reliable benchmark for researchers and practitioners to validate their diagnostic algorithms and models against real engine conditions, as no such dataset is publicly available.
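For readers who want to recompute or check the RMS values against raw accelerometer samples, the following is a minimal sketch of a per-axis RMS calculation. The function names and the idea of passing each axis as a plain list of samples are assumptions made for illustration; the file layout of the dataset is not described in the abstract.

```python
import math


def rms(samples):
    """Root mean square of a non-empty sequence of numeric samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def triaxial_rms(x, y, z):
    """Per-axis RMS for one tri-axial recording, returned as a dict."""
    return {"x": rms(x), "y": rms(y), "z": rms(z)}


# Toy samples, not taken from the dataset.
print(triaxial_rms([3, 3], [0, 0], [4, -4]))
```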
Title: Vibration dataset of main journal bearings in internal combustion engines under diverse climatic and varying operating conditions. Journal: Data in Brief, vol. 58, Article 111214. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11714381/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111188
Kiran Fatima , Syed Zeeshan Haider Naqvi , Hazrat Ali , Noor Hassan , Anam Saqib , Farheen Ansari , Sidrah Saleem , Shah Jahan , Mushtaq Ahmad
Acinetobacter baumannii is a well-known opportunistic pathogen responsible for various nosocomial infections. A. baumannii UOL-KIMZ-24 was previously isolated from a clinical specimen, dated 3rd March 2022, collected from Lahore General Hospital (LGH), Lahore, Pakistan. During initial screening for antimicrobial susceptibility, UOL-KIMZ-24 was found to be a multiple drug resistant (MDR) strain. However, detailed genomic insights into the genes responsible for antibiotic resistance, e.g. via efflux pumps, have not yet been reported for A. baumannii strains recovered from LGH. The current research fills this gap through isolation, whole genome sequencing, and subsequent post-sequencing analysis to identify the efflux pump-associated genes responsible for multiple drug resistance in A. baumannii. In a hybrid approach, short reads were generated on the Illumina platform, while long reads were sequenced with the MinION MK1B. The assembled and annotated genome of UOL-KIMZ-24 has a size of 4048631 bp with 179 contigs, 38.9 % GC content, 3628 protein coding sequences, 80 tRNA and 7 rRNA. The analysis of antibiotic-resistance (AMR) genes identified 27 genes, among which those encoding efflux pumps, such as adeABC, adeRS, adeJK, and adeMN, were the most prominent. In addition, sequence typing (ST) showed that the UOL-KIMZ-24 strain belongs to ST2; six prophage sequences and 73 virulence factors were also identified. This comprehensive study uncovered the genetic flexibility of the UOL-KIMZ-24 genome for acquiring MDR against in-practice antibiotics.
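The GC content reported above is the percentage of G and C bases among all called bases in the assembly. The following is a minimal sketch of that calculation for a single sequence string; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption rather than the authors' stated procedure.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence, not taken from the UOL-KIMZ-24 assembly.
print(gc_content("GGCCAATT"))
```

For a multi-contig assembly, the counts would be summed over all contigs before dividing, rather than averaging per-contig percentages.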
Title: Analysis of the genome data of Acinetobacter baumannii UOL-KIMZ-24, exhibiting multiple drug resistance through efflux pumps. Journal: Data in Brief, vol. 58, Article 111188. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11719380/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111215
John Curtis , Niall Farrell
This data article describes the operation of gas and oil fuelled residential heating systems in Ireland. Based on almost 10,000 homes, the data presents information on the operation of domestic heating systems (whether turned on/off by the user), and the firing of the boiler during 2-hour slots across a period of two years ending in September 2021 by geographical region. The electrification of heating is government policy, with the ambition that hundreds of thousands of homes will switch from oil and gas fuelled residential heating to heat pumps. Such an outcome will have implications for electricity generation, transmission and distribution, especially during peak demand periods. This dataset can be used as a starting point to examine how additional residential heating loads will impact on the electricity grid and provide insight on where electricity grid strengthening works should be prioritised.
Title: Irish residential heating demand profile dataset. Journal: Data in Brief, vol. 58, Article 111215. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11772139/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111186
Gabriele Mack , Christian Ritzel , Jeanine Ammann , Katja Heitkämper , Nadja El Benni
We present data from a paper-and-pencil survey of Swiss farmers. The survey was mailed to 2000 randomly selected Swiss farmers from the two largest Swiss language regions (German and French) in February 2019. A reminder was sent in April 2019. The response rate was around 40 % (N = 811). In the main part of the survey, we collected quantitative data on farmers’ workload and perceived burden due to (1) overall farming activities, (2) administrative activities related to the application of direct payments, and (3) other office work related to farm planning, bookkeeping, purchasing, and sales. We also asked farmers to rate their current workload and perceived administrative burden compared to five years earlier. We also collected data on the perceived burden of using e-government services, the administrative workload of various voluntary direct payment schemes, and the workload of inspections and sanctions. We collected personal information about the farmers. Finally, the farmers were asked to rate a series of statements regarding agricultural policy measures, the importance of inspection measures, the obligation to provide proof of eligibility for direct payments, information on current policy measures, and the justification of penalties for non-compliance with environmental or animal welfare standards. The survey results showed that, on average, Swiss farmers spent 3–5 % of their total working time on administrative tasks. The farmers rated the perceived burden of administrative activities as higher than the burden of overall farming activities or other office work. The data also showed that the farmers’ perceived administrative burden had increased compared to five years earlier. Finally, the results showed that 28 % of the Swiss farmers had received a penalty for non-compliance with direct payment regulations.
Title: Data on the administrative workload and perceived administrative burden of farmers in Switzerland. Journal: Data in Brief, vol. 58, Article 111186.
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111148
Salma Kazemi Rashed, Malou Arvidsson, Rafsan Ahmed, Sonja Aits
Many forms of bioimage analysis involve the detection of objects and their outlines. In the context of microscopy-based high-throughput drug and genomic screening and even in smaller scale microscopy experiments, the objects that most often need to be detected are cells. In order to develop and benchmark algorithms and neural networks that can perform this task, high-quality datasets with annotated cell outlines are needed.
We have created a dataset, named Aitslab_bioimaging2, consisting of 60 fluorescence microscopy images with EGFP-Galectin-3 labelled cells and their hand-labelled outlines. Images were acquired on a Thermo Fisher CX7 high-content imaging system at 20x magnification as part of an RNA interference screen with a modified U2OS osteosarcoma cell line. Outlines were labelled by three annotators, who had high inter-annotator agreement with each other and with a biomedical expert. The expert labelled some of the objects for comparison and reviewed a subset of the labels, making minor corrections as needed.
The dataset comprises over 2200 annotated cell objects in total, making it sufficient in size to train high-performing neural networks for instance or semantic segmentation. Labels can also easily be converted to boxes for object detection tasks. The dataset is already pre-divided into training, development, and test sets. Matching nuclear staining and outlines are available for part of the dataset from a previous publication (dataset Aitslab_bioimaging1) [1].
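The conversion from outlines to boxes mentioned above can be sketched as follows. This is a minimal illustration, assuming the labels have already been rasterized into an integer instance mask (0 for background, one positive integer per cell); the dataset's actual label format is not described in the abstract, and the function name is hypothetical.

```python
def masks_to_boxes(label_mask):
    """Map each non-zero label to (min_row, min_col, max_row, max_col).

    label_mask is a 2D list of ints; 0 is treated as background.
    Box coordinates are inclusive pixel indices.
    """
    boxes = {}
    for r, row in enumerate(label_mask):
        for c, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = [r, c, r, c]
            else:
                box = boxes[label]
                box[0] = min(box[0], r)
                box[1] = min(box[1], c)
                box[2] = max(box[2], r)
                box[3] = max(box[3], c)
    return {label: tuple(box) for label, box in boxes.items()}


# Toy mask with two cells, not taken from the dataset.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
]
print(masks_to_boxes(toy_mask))
```

Object detection frameworks differ in whether they expect inclusive or exclusive box corners and in the (x, y) versus (row, col) ordering, so the tuples may need reordering before training.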
Title: An annotated high-content fluorescence microscopy dataset with EGFP-Galectin-3-stained cells and manually labelled outlines. Journal: Data in Brief, vol. 58, Article 111148. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751569/pdf/
Pub Date: 2025-02-01 DOI: 10.1016/j.dib.2024.111242
Fernan A. Villa-Garzon, Maria A. Muñoz-Alarcon, John W. Branch-Bedoya
This dataset examines the interplay between socioeconomic status and educational outcomes among students at the Universidad Nacional de Colombia. Collected from publicly available data in collaboration with the National Directorate of Information, the dataset includes anonymized records of 3361 students from multiple university campuses during the first semester of 2021. It captures a diverse array of socioeconomic and academic variables, such as family income, residence type, tuition fee, and career choice, providing a unique basis for studying educational access in Colombia.
Following authorization from the National Directorate of Information, the dataset underwent a rigorous anonymization and validation process, ensuring data integrity and reproducibility. The data was systematically cleaned and organized to highlight key demographic and socioeconomic variables relevant to student accessibility, persistence, and financial sustainability in higher education.
With its detailed structure, this dataset is a valuable resource for policymakers and researchers focused on reducing educational inequalities. It supports analyses that reveal how socioeconomic conditions impact educational pathways, enabling the design of targeted interventions to enhance equity in university access and retention. This dataset contributes significantly to the understanding of educational challenges in Colombian public universities, providing a comprehensive basis for the investigation of the socioeconomic factors that influence students' access to and success in higher education.
{"title":"ColombiaTuitionSET: Labeled dataset for exploring socioeconomic status, career selection, and tuition fees at a Colombian public university","authors":"Fernan A. Villa-Garzon, Maria A. Muñoz-Alarcon, John W. Branch-Bedoya","doi":"10.1016/j.dib.2024.111242","DOIUrl":"10.1016/j.dib.2024.111242","url":null,"PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"58 ","pages":"Article 111242"},"PeriodicalIF":1.0,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11732587/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-01 DOI: 10.1016/j.dib.2024.111235
Danielle Dana Mitchell, Jo-Marie Vreulink, Alaric Prins, Marilize Le Roes-Hill
Streptomyces griseoincarnatus strain R-35 was isolated from marine sediments collected from the Glencairn Tidal Pool, Table Mountain National Park, Cape Town, South Africa. The genomic DNA was sequenced using the Ion Torrent GeneStudio™ S5 platform, and the de novo assembly was performed using the SPAdes assembler on the Centre for High Performance Computing (CHPC) Lengau Cluster located at the CSIR, Rosebank, South Africa. The draft genome assembly consisted of 722 contigs totaling 7,625,174 base pairs, with a G+C content of 72.2 mol%. Genome completeness and contamination were determined as 99.12% and 0.92%, respectively. Genome annotations performed using Rapid Annotations using Subsystems Technology (RAST) and the Bacterial and Viral Bioinformatics Resource Centre (BV-BRC) identified 7996 coding sequences (CDS), 63 transfer RNAs (tRNAs), and six ribosomal RNAs (rRNAs). A total of 2570 proteins were annotated as hypothetical, and 5246 proteins were assigned a function. The phylogenomic position of S. griseoincarnatus strain R-35 was determined using the Type Strain Genome Server (TYGS); the strain was found to be related to S. griseoincarnatus JCM 4381T, with a digital DNA-DNA hybridisation (dDDH) value of 84.1% and an OrthoANIu value of 98.22%. The CARD RGI algorithm on Proksee predicted 6,107 antimicrobial resistance (AMR) features, antiSMASH predicted 27 biosynthetic gene clusters (BGCs), and dbCAN3 predicted 189 carbohydrate-active enzymes (CAZymes). The raw genome sequencing data have been submitted to the National Center for Biotechnology Information (NCBI) under the BioProject ID PRJNA1129156 (BioSample accession number: SAMN42145163; Sequence Read Archive (SRA) accession: SRR29633055; https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1129156).
{"title":"Draft genome dataset of Streptomyces griseoincarnatus strain R-35 isolated from tidal pool sediments","authors":"Danielle Dana Mitchell, Jo-Marie Vreulink, Alaric Prins, Marilize Le Roes-Hill","doi":"10.1016/j.dib.2024.111235","DOIUrl":"10.1016/j.dib.2024.111235","url":null,"PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"58 ","pages":"Article 111235"},"PeriodicalIF":1.0,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11731734/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}