Pub Date: 2026-04-01; Epub Date: 2026-01-10; DOI: 10.1016/j.dib.2026.112462
Takaki Nishio, Yuki Kawae
The conservation of marine resources and the mitigation of marine pollution require strengthened knowledge of marine biodiversity, particularly in the deep sea. Videos and images are valuable for documenting the distribution of deep-sea organisms, but manual processing is labor-intensive and variable, emphasizing the need for automated methods. To address this, the J-EDI Organism Detection Dataset (JODD) is introduced. This dataset comprises 8,151 images and 15,621 bounding boxes annotated in the Common Objects in Context (COCO) format. The images were captured during deep-sea surveys conducted by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) between 1984 and 2021, using remotely operated vehicles (ROVs) and human-occupied vehicles (HOVs). All images were derived from publicly available videos in JAMSTEC’s E-library of Deep-sea Images (J-EDI). The dataset includes 20 object categories (19 biological groups and one machine category), providing a reusable resource for developing and benchmarking machine learning models for the automatic detection of deep-sea organisms.
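As a rough illustration of what COCO-format annotations look like in practice, the sketch below counts bounding boxes per category and converts a box from COCO's `[x, y, width, height]` layout to corner coordinates. The file names, category names, and box values are hypothetical stand-ins, not taken from JODD; only the top-level keys (`images`, `categories`, `annotations`) follow the COCO convention.

```python
import json
from collections import Counter

# Tiny stand-in for a COCO annotation file.
# File names, category names, and boxes are hypothetical, not from JODD.
coco_text = json.dumps({
    "images": [
        {"id": 1, "file_name": "dive_0001.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "dive_0002.jpg", "width": 640, "height": 480},
    ],
    "categories": [{"id": 1, "name": "fish"}, {"id": 2, "name": "machine"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [12.0, 30.0, 100.0, 50.0]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [200.0, 80.0, 40.0, 40.0]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5.0, 5.0, 20.0, 10.0]},
    ],
})


def boxes_per_category(coco: dict) -> dict:
    """Count bounding boxes per category name in a COCO-style dict."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return dict(Counter(names[a["category_id"]] for a in coco["annotations"]))


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


coco = json.loads(coco_text)
counts = boxes_per_category(coco)
print(counts)
print(xywh_to_xyxy(coco["annotations"][0]["bbox"]))
```

A per-category count of this kind is a common first check for class imbalance before training a detector.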
{"title":"Deep-sea image dataset for organism detection","authors":"Takaki Nishio, Yuki Kawae","doi":"10.1016/j.dib.2026.112462","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112462"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-08; DOI: 10.1016/j.dib.2026.112452
Oguz Akbilgic, Ibrahim Karabayir, Luke Patterson, Stephanie B. Dixon, Daniel A. Mulrooney, Kirsten K. Ness, Melissa M. Hudson
Childhood cancer survivors (CCS) previously exposed to cardiotoxic treatments such as anthracyclines and chest radiation are at lifelong risk of cardiovascular complications. Current guidelines recommend periodic echocardiographic surveillance, but adherence rates are as low as 41%. This dataset provides paired same-day 12-lead clinical electrocardiograms (ECGs) and single-lead wearable ECG recordings from the Apple Watch, collected from adult CCS participating in the St. Jude Lifetime Cohort Study (SJLIFE). The availability of paired wearable and clinical ECGs enables the development and validation of remote AI-based cardiac screening tools, potentially leading to more precise long-term cardiovascular surveillance in this population. Using this dataset, researchers can assess whether the performance of an AI model developed on clinical ECGs can be replicated on ECGs from an Apple Watch.
{"title":"Paired clinical 12 lead and apple watch electrocardiogram data repository from childhood cancer survivors authors","authors":"Oguz Akbilgic, Ibrahim Karabayir, Luke Patterson, Stephanie B. Dixon, Daniel A. Mulrooney, Kirsten K. Ness, Melissa M. Hudson","doi":"10.1016/j.dib.2026.112452","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112452"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-07; DOI: 10.1016/j.dib.2025.112445
Trang Thu Tran, Huyen Minh Thi Ta, Duc Hoang Le, Duong Huy Nguyen, Nam Trung Nguyen
This dataset presents RNA sequencing (RNA-seq) data from RAW264.7 murine macrophages pretreated with 9-methoxycanthin-6-one, a canthin-6-one–type alkaloid isolated from Eurycoma longifolia Jack, and subsequently stimulated with polyinosinic:polycytidylic acid [poly(I:C)], a synthetic double-stranded RNA analog that activates TLR3-mediated antiviral signaling. RAW264.7 cells were pretreated with 9-methoxycanthin-6-one (30 µM) for 30 min and then exposed to poly(I:C) (20 µg/mL) for 6 h. Total RNA was extracted, quality-checked, and sequenced on the Illumina platform to generate paired-end reads. Differential expression analysis and functional annotation were performed to profile genes responsive to 9-methoxycanthin-6-one treatment under poly(I:C) stimulation. The dataset includes normalized expression matrices, lists of upregulated and downregulated genes, and pathway enrichment outputs in standard formats. These data provide a reference resource for understanding the transcriptomic responses of macrophages to natural alkaloid treatment during viral-mimetic immune activation. The dataset can be reused to compare host antiviral transcriptional responses across TLR3-related pathways, evaluate macrophage activation markers, or integrate with other E. longifolia bioactive compounds.
{"title":"Transcriptomic dataset of RAW264.7 murine macrophages pretreated with 9-methoxycanthin-6-one under poly(I:C)-TLR3 stimulation","authors":"Trang Thu Tran, Huyen Minh Thi Ta, Duc Hoang Le, Duong Huy Nguyen, Nam Trung Nguyen","doi":"10.1016/j.dib.2025.112445","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112445"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-13; DOI: 10.1016/j.dib.2026.112461
Tom Marti, Cécile Costa, Emmanuel de Salis, Laura Brambilla, Stefano Carrino
This dataset presents acoustic emission (AE) recordings collected from woodboring insect-infested and non-infested wood samples and cultural heritage objects. Data acquisition was conducted from April to July 2025 across four institutions: Haute École Arc (HE-Arc), Switzerland; Canadian Museum of History (CMH), Canada; National Gallery of Canada (NGC), Canada; and Musée National de l'Automobile (MNA), France.
The recordings were captured using Vallen VS900-M sensors with AEP5 preamplifiers set to 34 dB gain and an AMSY-6 four-channel chassis, employing continuous acoustic emission monitoring at a 2 MHz sampling rate. Each experiment used three sensors positioned on the test objects and one upward-facing reference sensor to record ambient noise conditions. The dataset comprises approximately 440.9 hours of recordings distributed across the four collection sites.
The dataset includes four main components: raw Vallen AE database files (.tradb format), processed statistical data exported as CSV files, contextual images documenting setups and sensor placements, and a Python script for statistical data processing. Each experiment is documented with duration, material specifications, coupling method (Renaissance wax, cyclododecane, or mechanical fastening), environmental conditions, and infestation labels.
The dataset's structure enables multiple research applications. The time-series statistical features and binary classification labels (infested/non-infested) provide a foundation for supervised machine learning model development. The diverse experimental conditions across four geographic locations, varying coupling methods, and different ambient environments offer opportunities to evaluate model generalization and robustness. Reference sensor recordings captured simultaneously with each experiment allow for ambient noise characterization studies and development of noise filtering methodologies. The combination of raw acoustic data and contextual documentation makes this dataset suitable for comparative studies of different signal processing approaches and feature extraction techniques in acoustic emission analysis for heritage conservation applications.
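To put the acquisition settings into concrete numbers, the sketch below converts the 34 dB preamplifier gain into a linear amplitude ratio and counts the samples in a short analysis window at 2 MHz. It assumes the gain is a voltage gain (the 20·log10 convention); the 1 ms window length is an arbitrary illustrative choice, and the function names are hypothetical rather than part of the dataset's own Python script.

```python
# Acquisition settings quoted in the abstract.
GAIN_DB = 34.0                 # preamplifier gain
SAMPLING_RATE_HZ = 2_000_000   # 2 MHz


def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude ratio.

    Assumes a voltage (amplitude) gain, i.e. gain_db = 20 * log10(ratio).
    """
    return 10 ** (gain_db / 20.0)


def samples_in_window(sampling_rate_hz: float, window_s: float) -> int:
    """Number of samples covered by a window of the given duration."""
    return int(round(sampling_rate_hz * window_s))


ratio = db_to_amplitude_ratio(GAIN_DB)
n_samples = samples_in_window(SAMPLING_RATE_HZ, 0.001)  # hypothetical 1 ms window

print(f"34 dB corresponds to an amplitude ratio of about {ratio:.1f}")
print(f"A 1 ms window at 2 MHz holds {n_samples} samples")
```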
{"title":"A dataset of acoustics emissions recordings of woodboring insects in wood and cultural objects, context images and remarks","authors":"Tom Marti, Cécile Costa, Emmanuel de Salis, Laura Brambilla, Stefano Carrino","doi":"10.1016/j.dib.2026.112461","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112461"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
The Indonesian Pharmaceutical Dataset for Self-medication consists of two structured datasets containing key public health information: a drug dataset and a disease dataset. Both were extracted from the websites of Indonesian-registered and regulated telemedicine providers. The drug dataset contains general data on drugs, indications, dosages, side effects, contraindications, and warnings, whereas the disease dataset contains definitions, descriptions, symptoms, and causes of diseases. Both datasets are provided in CSV file format and are available exclusively in Bahasa Indonesia to maintain consistency with the source content and cater to local users’ needs. These datasets are intended to support research, application development, and Indonesian health information systems by providing locally contextualized, accessible health data for the Indonesian population. Potential applications include powering health chatbots, supporting medical search tools, guiding health literacy programs, and facilitating the integration of standardized local information into HealthTech platforms.
{"title":"Indonesian pharmaceutical dataset for self-medication","authors":"Richard Wiputra, Carrie Florista Benjaminsz, Andrian Loria, Rafaell Widjaya, Rudy, Andry Chowanda","doi":"10.1016/j.dib.2026.112460","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112460"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Type-2 diabetes is a major public health concern in Bangladesh, and this dataset provides 1065 curated patient records with demographic, anthropometric, and clinical variables relevant to its assessment. The data were collected during routine clinical visits and recorded by trained staff, with checks to ensure accuracy and completeness. The dataset includes basic details such as age, pregnancy count, body mass index, and skin-fold thickness; vital signs such as blood pressure; lab results related to blood sugar (fasting glucose and insulin); the Diabetes Pedigree Function; and a binary (yes/no) label for Type-2 diabetes. A few values are missing for diastolic blood pressure and skin-fold thickness, so users should handle these carefully. Because the data are cross-sectional and come from patients seeking care, there are more diabetic cases (840) than non-diabetic cases (225). The dataset is intended for reuse in method development (for example, machine-learning classifier training, feature-selection benchmarking, and oversampling/imputation research), for context-specific epidemiologic description and model validation in South Asian clinical settings, and as a teaching resource for reproducible biomedical-data workflows.
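The class imbalance and missing values mentioned above are the two things a user would likely handle first. The following is a minimal sketch, not the authors' workflow: it derives inverse-frequency class weights from the reported counts (the same heuristic scikit-learn applies for `class_weight="balanced"`) and fills missing entries with the median of the observed values. The example readings and function names are hypothetical.

```python
from statistics import median

# Class counts reported in the abstract (840 + 225 = 1065 records).
CLASS_COUNTS = {"diabetic": 840, "non_diabetic": 225}


def balanced_class_weights(counts: dict) -> dict:
    """Inverse-frequency weights: n_total / (n_classes * n_class)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def impute_median(values: list) -> list:
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


weights = balanced_class_weights(CLASS_COUNTS)
print(weights)

# Hypothetical diastolic blood pressure readings with one missing value.
print(impute_median([70, None, 80, 90]))
```

Median imputation is only one option; whether it is appropriate depends on why the values are missing, which the abstract does not state.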
{"title":"A clinical dataset on type-2 diabetes including demographic, anthropometric, and biochemical parameters from Bangladesh","authors":"Md. Younus Bhuiyan, Shahriar Siddique Ayon, Md. Ebrahim Hossain, Md. Saef Ullah Miah, Afjal H. Sarower, Fateha khanam Bappee","doi":"10.1016/j.dib.2026.112457","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112457"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-30; DOI: 10.1016/j.dib.2026.112540
Anshu Raj, Xin Wang, Matthew Luebbe, Haiming Wen, Kun Lu, Shuozhi Xu
We report a curated dataset that brings together composition, processing conditions, microstructural details, and mechanical properties for 396 combinations of alloy composition and processing condition drawn from 100 peer-reviewed research articles on precipitate-containing multi-principal element alloys (MPEAs). The dataset was created by first utilizing a generative large language model for information extraction, followed by expert review to ensure accurate recovery of materials data. Compositional information was taken directly from tables and text, while processing routes — including homogenization, rolling, recrystallization, and aging — were converted into uniform temperature and time metrics. Microstructural descriptors, including precipitate phases and sizes, were consolidated into a consistent labeling scheme to accommodate the wide range of terminology used in published literature. Finally, mechanical property data, such as strength and ductility, were compiled together with the temperatures at which they were measured. These data provide a coherent view of the composition-processing-microstructure-property features explored in existing MPEA research and establish a resource that supports data-driven alloy design as well as future development of automated materials information-extraction methodologies. The complete dataset is available on Zenodo.
{"title":"A dataset of precipitate-containing multi-principal element alloys","authors":"Anshu Raj, Xin Wang, Matthew Luebbe, Haiming Wen, Kun Lu, Shuozhi Xu","doi":"10.1016/j.dib.2026.112540","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112540"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Increasing occurrences of toxic dinoflagellate blooms are a growing concern under climate change. The benthic dinoflagellate Ostreopsis blooms through mechanisms that remain poorly understood and is assumed to produce palytoxin-like compounds such as ovatoxins. Recent studies have highlighted the diversity of bacterial communities associated with Ostreopsis and suggested a possible role for these bacteria in toxin biosynthesis. However, genome information on potential bacterial toxin producers remains limited. Here, we report a dataset of bacterial metagenome-assembled genomes (MAGs) obtained from a culture of the toxic dinoflagellate Ostreopsis cf. ovata (strain NIES-3351). HiFi long reads from the PacBio Revio system were assembled with hifiasm-meta. We identified forty complete bacterial MAGs, each with an estimated completeness of 93–100%. These MAGs span a wide range of genome sizes (1.5 Mb to 6.7 Mb) and GC contents (36% to 67%). The dataset is available at DDBJ/ENA/GenBank under accession number PRJDB37958.
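Genome size and GC content, the two summary statistics quoted for the MAGs, can be recomputed from any FASTA file with a few lines of code. The sketch below is a generic illustration using only the standard library; the contig names and sequences are made-up toy data, and the helper names are hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {record_name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {k: "".join(v) for k, v in records.items()}


def genome_size(records: dict) -> int:
    """Total assembled length in base pairs across all contigs."""
    return sum(len(seq) for seq in records.values())


def gc_content(records: dict) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T)."""
    gc = sum(seq.count("G") + seq.count("C") for seq in records.values())
    at = sum(seq.count("A") + seq.count("T") for seq in records.values())
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return gc / (gc + at)


# Toy two-contig assembly; sequences are invented for illustration.
toy_fasta = ">contig_1\nATGCGC\nGGCC\n>contig_2\nATAT\n"
mag = parse_fasta(toy_fasta)
size_bp = genome_size(mag)
gc = gc_content(mag)
print(f"size = {size_bp} bp, GC = {gc:.1%}")
```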
{"title":"A dataset for forty complete bacterial genome sequences in cultures of the toxic dinoflagellate Ostreopsis cf. ovata","authors":"Yuki Yoshioka, Chika Ando, Hiroshi Yamashita, Mayumi Kawamitsu, Masanobu Kawachi, Yuta Tsunematsu, Eiichi Shoguchi","doi":"10.1016/j.dib.2026.112499","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112499"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-14; DOI: 10.1016/j.dib.2026.112470
Eric Gooden, Nicole Holden
The data report the results of an experiment examining whether short-horizon investors differ from long-horizon investors in their sensitivity to truthful disclosure regarding negative earnings news. We used an experimental methodology to isolate the impact of Investment Horizon and Forthcomingness on investors' long-term management credibility assessments. Specifically, participants assumed the role of either a long-horizon current investor (who already owned shares) or a short-horizon prospective investor (contemplating an investment decision) in a fictional company. Participants were then given identical information regarding the firm and asked to make an initial credibility assessment. Subsequently, participants either received forthcoming disclosure from company management regarding negative earnings news (Forthcomingness) or received no disclosure. Short-horizon investors then made a decision regarding an investment position in the firm, and all participants received negative earnings news regarding the firm. Participants returned two weeks later to make final credibility assessments of company management as part of the post-experimental questionnaire.
{"title":"Long-term management reporting credibility data from an accounting experiment","authors":"Eric Gooden, Nicole Holden","doi":"10.1016/j.dib.2026.112470","journal":{"name":"Data in Brief","volume":"65","pages":"Article 112470"},"publicationDate":"2026-04-01","publicationTypes":"Journal Article"}
Pub Date: 2026-04-01; Epub Date: 2026-01-21; DOI: 10.1016/j.dib.2026.112487
Viktor Peterson
Impact-loaded reinforced concrete beams often fail in shear. This is relevant, for instance, to the design of shelters against ballistic or fragment impact. An experimental campaign was conducted to study the different types of shear failure and their governing parameters. Eighteen reinforced concrete beams were tested with a 70 kg steel striker dropped from a height of 2.4 m. The beams were loaded at different distances from the support and had different amounts of transverse reinforcement. The beams were of reduced scale, with a length of 0.80 m and a square 0.15 m × 0.15 m cross-section. The drop weight tests were monitored with shock accelerometers on the striker and at the beam centre, load cells under the supports measuring reaction forces, and a high-speed camera (HSC). The high-speed camera recorded orthogonally to the beam surface with the aim of enabling high-quality digital image correlation (DIC) analyses. The beams and striker were painted with a speckle pattern prior to testing for the DIC analyses. Camera recordings were made at a 1024 × 512 px resolution and 6 kHz sampling, resulting in a time resolution of about 0.17 ms. Accelerometer and load cell measurements were sampled at 19.2 kHz. The accelerometer on the striker was used to approximate the impact force, and the beam acceleration can be used to synchronize the camera and DAQ recordings. The data may be used to calibrate finite element models, study the impact response of beams, or develop new mechanical models.
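The quantities stated in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the author's processing code: it assumes frictionless free fall with standard gravity (g = 9.81 m/s², not stated in the abstract), a rigid-body estimate of the impact force as striker mass times measured deceleration, and camera and DAQ streams that share a common time zero. All function names are hypothetical.

```python
import math

G = 9.81  # m/s^2, standard gravity (assumption; not given in the abstract)


def free_fall_impact(mass_kg, height_m, g=G):
    """Impact velocity (m/s) and kinetic energy (J) for an ideal free fall."""
    velocity = math.sqrt(2.0 * g * height_m)
    energy = mass_kg * g * height_m
    return velocity, energy


def sample_interval_ms(rate_hz):
    """Time between consecutive samples, in milliseconds."""
    return 1000.0 / rate_hz


def impact_force_n(mass_kg, deceleration_ms2):
    """Rigid-body estimate F = m * a from the striker accelerometer signal."""
    return mass_kg * deceleration_ms2


def nearest_daq_index(frame_idx, camera_hz, daq_hz):
    """Index of the DAQ sample closest in time to a given camera frame,
    assuming both recordings start at the same instant."""
    return round(frame_idx * daq_hz / camera_hz)


if __name__ == "__main__":
    v, e = free_fall_impact(mass_kg=70.0, height_m=2.4)
    print(f"impact velocity ~ {v:.2f} m/s, energy ~ {e:.0f} J")
    print(f"camera interval ~ {sample_interval_ms(6000):.3f} ms")
    print(f"DAQ interval    ~ {sample_interval_ms(19200):.3f} ms")
    print("camera frame 5 -> DAQ sample", nearest_daq_index(5, 6000, 19200))
```

Under these assumptions the striker arrives at roughly 6.9 m/s with about 1.6 kJ of kinetic energy, and the 6 kHz frame rate gives the 0.17 ms time resolution quoted above. Because 19.2 kHz is 3.2 times the camera rate, frames and DAQ samples coincide only every fifth frame, so real synchronization would normally interpolate rather than pick the nearest sample.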
{"title":"Dataset of high-speed camera measurements from impact-tested reinforced concrete beams","authors":"Viktor Peterson","doi":"10.1016/j.dib.2026.112487","DOIUrl":"10.1016/j.dib.2026.112487","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"65","pages":"Article 112487"},"PeriodicalIF":1.4,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146075164"}