
Scientific Data: Latest Articles

Telehealth Infrastructure for Cancer Care in the United States.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-16 | DOI: 10.1038/s41597-026-07063-z
Lingbo Liu, Tracy Onega, Erika L Moen, Anna N A Tosteson, Rebecca E Smith, Qianfei Wang, Lauren Cowan, Fahui Wang

Telehealth can reduce travel barriers to oncology care, yet its impact depends on both digital connectivity and the geography of care. We present an open, reusable dataset that characterizes two critical components of telehealth infrastructure for cancer care, accessible oncologists and sufficient, affordable internet, at the ZIP Code Tabulation Area level across the United States. The resource integrates population-weighted fixed broadband measures and 5G coverage, internet subscription as an affordability proxy, geocoded oncologist practice sites with full-time-equivalent capacity, and a national origin-destination matrix of road travel times. From these inputs we compute spatial accessibility for in-person care using the two-step floating catchment area (2SFCA) method and for telehealth-enabled care using the two-step virtual catchment area (2SVCA) method, at thresholds of 45 to 120 minutes. We support transparency by releasing the source and intermediate indicators, the final accessibility scores, and a replicable 2SFCA/2SVCA workflow. Anticipated uses include benchmarking infrastructure across states and metropolitan areas, analyses of disparities by rurality and area deprivation, subsidy simulations, and rapid replication in new disease or provider contexts.
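
For readers unfamiliar with 2SFCA, the following is a minimal sketch of the basic method only, not the authors' released workflow. It assumes a binary catchment (a site is either within the travel-time threshold or not, with no distance decay), and the function name, zone and site labels, and all numbers are made up for illustration. The 2SVCA variant, which also accounts for internet connectivity, is not sketched here.

```python
def two_step_fca(population, supply, travel_time, threshold):
    """Basic 2SFCA with a binary catchment (illustrative assumption).

    population:  dict zone -> residents
    supply:      dict site -> capacity (e.g. full-time-equivalent providers)
    travel_time: dict (zone, site) -> minutes; missing pairs are unreachable
    threshold:   catchment size in minutes
    """
    inf = float("inf")

    # Step 1: for each site, divide capacity by the population
    # living within the threshold of that site.
    ratio = {}
    for site, capacity in supply.items():
        demand = sum(
            pop for zone, pop in population.items()
            if travel_time.get((zone, site), inf) <= threshold
        )
        ratio[site] = capacity / demand if demand > 0 else 0.0

    # Step 2: for each zone, sum the ratios of all sites
    # reachable within the threshold.
    return {
        zone: sum(
            r for site, r in ratio.items()
            if travel_time.get((zone, site), inf) <= threshold
        )
        for zone in population
    }


# Hypothetical toy inputs: three zones, two practice sites.
population = {"A": 1000, "B": 3000, "C": 500}
supply = {"s1": 2.0, "s2": 1.0}
travel_time = {
    ("A", "s1"): 20, ("B", "s1"): 40, ("C", "s1"): 90,
    ("A", "s2"): 70, ("B", "s2"): 30, ("C", "s2"): 35,
}
scores = two_step_fca(population, supply, travel_time, threshold=45)
```

In this toy case zone B can reach both sites and so receives the highest score. A useful sanity check on any 2SFCA implementation is that the population-weighted sum of the scores equals total supply whenever every site has at least one zone in its catchment.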

Citations: 0
Eye movement benchmark data for smooth-pursuit classification.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-16 | DOI: 10.1038/s41597-026-06963-4
Luke Korthals, Ingmar Visser, Šimon Kucharský

Analysis of eye-tracking data often requires accurate classification of eye movement events. Human experts and classification algorithms often confuse episodes of fixations (fixating stationary targets) and smooth pursuits (fixating moving targets) because their feature characteristics overlap. To foster the development of better classification algorithms, we created a benchmark data set that does not rely on human annotation as the gold standard. It consists of almost four hours of eye movements. Ten participants fixated different targets designed to induce saccades, fixations, and smooth pursuits. Plausible benchmark labels were established by designing stimuli that prevent fixations and smooth pursuits from co-occurring, and by separating both from saccades by their velocity. Here we make the raw data available and offer a convenient way to preprocess them and assign plausible benchmark labels in the form of a companion Python package. We encourage researchers to use them for feature engineering and to train, validate, and benchmark their algorithms.
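
To illustrate what separating saccades "by their velocity" can look like, here is a minimal sketch of a generic velocity-threshold rule. It is not the companion package's API, which the abstract does not describe. The function name, the 30 deg/s threshold, and the sample values are all assumptions chosen for illustration.

```python
import math


def label_saccade_intervals(t, x, y, threshold_deg_per_s=30.0):
    """Flag each inter-sample interval as saccadic if gaze speed exceeds a threshold.

    t:    sample times in seconds (strictly increasing)
    x, y: gaze position in degrees of visual angle
    The threshold is a hypothetical value, not one taken from the paper.
    Returns a list of booleans of length len(t) - 1.
    """
    labels = []
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        if dt <= 0:
            raise ValueError("sample times must be strictly increasing")
        # Point-to-point angular speed in degrees per second.
        speed = math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) / dt
        labels.append(speed > threshold_deg_per_s)
    return labels


# Made-up samples at 250 Hz: slow drift followed by a fast jump.
t = [0.000, 0.004, 0.008, 0.012, 0.016]
x = [0.00, 0.02, 0.04, 1.04, 2.04]
y = [0.0, 0.0, 0.0, 0.0, 0.0]
labels = label_saccade_intervals(t, x, y)
```

The first two intervals move at roughly 5 deg/s and stay below the threshold, while the last two move at roughly 250 deg/s and are flagged. A velocity rule alone cannot tell fixations from smooth pursuits, which is why the dataset relies on stimulus design for that distinction.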

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12992799/pdf/
Citations: 0
OBIMD: A Multi-modal Dataset for Contextual Interpretation of Oracle Bone Inscriptions.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-14 | DOI: 10.1038/s41597-026-06967-0
Bang Li, Jing Yang, Yujie Liang, Xiaobin Hu, Zengmao Ding, Xu Peng, Shengwei Han, Peichao Qin, Donghao Luo, Taisong Jin, Feng Gao, Yongge Liu, Rongrong Ji

Oracle bone inscriptions, the earliest known form of Chinese writing, hold immense historical and linguistic significance. However, existing digital datasets are typically limited to isolated characters and lack the contextual and structural information essential for comprehensive analysis. We present the Oracle Bone Inscriptions Multi-modal Dataset (OBIMD), a large-scale, publicly available corpus that provides pixel-aligned rubbing and facsimile images, character-level annotations, and sentence-level transcriptions with corresponding reading sequences. OBIMD encompasses 10,077 oracle bone inscription images spanning five phases of the Shang Dynasty, featuring 93,652 annotated characters, 21,667 recorded missing-character positions, 21,941 sentence units, and 4,192 non-sentential elements. By integrating visual, structural, and linguistic modalities, OBIMD supports multi-modal learning and diverse tasks such as facsimile enhancement, character retrieval, and syntactic reconstruction. It constitutes a foundational resource for oracle bone inscription recognition and interpretation, enabling scalable and systematic analysis of ancient Chinese writing.

Citations: 0
Wave spectrum Reconstruction Parameters for nested wave modeling in the China-adjacent seas from 2000 to 2024.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-14 | DOI: 10.1038/s41597-026-07017-5
Xingjie Jiang, Yongzeng Yang, Xunqiang Yin, Yuxuan Zha

The storage and retrieval of wave spectra for boundary conditions in nested wave modeling are computationally intensive due to the substantial storage requirements of each spectrum. A recently developed method (Jiang et al., 2023) addresses this by representing two-dimensional wave spectra using a set of Reconstruction Parameters (RPs), enabling efficient long-term and large-scale wave spectrum storage. This study presents an RP dataset for the China-adjacent seas, derived from 165,590 grid points at a 1/12° × 1/12° resolution and hourly intervals from 2000 to 2024, supporting the reconstruction of spectra with up to six spectral partitions. Validation against independent buoy and satellite observations shows strong agreement with wave parameters derived from the simulated spectra. Moreover, comparative analysis reveals remarkably close consistency between characteristics obtained from the original simulated spectra and their reconstructed counterparts, with the reconstruction accuracy exceeding the inherent uncertainties of the original numerical simulations. Additional nested modeling experiments further affirm the dataset's exceptional utility for wave hindcasting and forecasting applications in the China-adjacent seas.
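
To give a sense of the scale behind the storage problem, the short calculation below counts how many individual spectra the stated grid and period imply. It is a back-of-the-envelope sketch: it assumes complete hourly coverage from 2000-01-01 00:00 up to, but not including, 2025-01-01 00:00 at every grid point. The abstract does not state the exact start and end times.

```python
from datetime import datetime

GRID_POINTS = 165_590  # number of grid points, from the abstract

# Assumed coverage window (not stated in the abstract): all of 2000 through 2024.
start = datetime(2000, 1, 1)
end_exclusive = datetime(2025, 1, 1)

# Number of hourly time steps in the window, leap years included.
hours = int((end_exclusive - start).total_seconds() // 3600)

# One spectrum per grid point per hour.
total_spectra = GRID_POINTS * hours

print(f"{hours:,} hourly steps, {total_spectra:,} spectra")
```

Under these assumptions the count is on the order of tens of billions of spectra, which is the motivation for storing a compact parameter set per spectrum rather than each full two-dimensional spectrum.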

Citations: 0
ROBUST-MIPS: A Combined Skeletal Pose and Instance Segmentation Dataset for Laparoscopic Surgical Instruments.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-14 | DOI: 10.1038/s41597-026-06938-5
Zhe Han, Charlie Budd, Gongyu Zhang, Huanyu Tian, Christos Bergeles, Tom Vercauteren

Localisation of surgical tools constitutes a foundational building block for computer-assisted interventional technologies. Work in this field typically focuses on training deep learning models to perform segmentation tasks. The performance of learning-based approaches is limited by the availability of diverse annotated data. We argue that skeletal pose annotations are a more efficient annotation approach for surgical tools, striking a balance between richness of semantic information and ease of annotation, thus allowing for accelerated growth of available annotated data. To encourage adoption of this annotation style, we present ROBUST-MIPS, a combined tool pose and tool instance segmentation dataset derived from the existing ROBUST-MIS dataset. Our enriched dataset facilitates the joint study of these two annotation styles and allows head-to-head comparison on various downstream tasks. To demonstrate the adequacy of pose annotations for surgical tool localisation, we set up a simple benchmark using popular pose estimation methods and observe high-quality results. To ease adoption, together with the dataset, we release our benchmark models and custom tool pose annotation software.

Citations: 0
A human EEG dataset to study cognitive flexibility during auditory discrimination under real-world distractors.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-14 | DOI: 10.1038/s41597-026-07041-5
Priyanka Ghosh, Kirti Saluja, Arpan Banerjee

Salient sounds in the environment automatically capture our attention, causing a shift of focus away from ongoing goal-directed tasks. Studies of cognitive flexibility can employ such paradigms to examine how the brain reorients attention to the ongoing goal, an ability notably impaired in neurodevelopmental and clinical populations. The current dataset captures attentional reorientation to real-world distractors, featuring 60 naturalistic salient sounds (e.g., ambulance siren, dog bark) presented during goal-directed auditory discrimination tasks involving pure tones, frequency-modulated sweeps, and speech syllables. Novel behavioral and preprocessed electroencephalography (EEG) open-source data are made available from twenty-seven healthy human volunteers performing goal-directed auditory tasks validated across three spectrotemporally different acoustic contexts, along with all task stimuli files. Behavioral data confirmed that distractors significantly modulated task performance across all three auditory tasks, and EEG spectral analyses demonstrated significant power changes linked to auditory distractors. To support accurate source-level analyses, we also provide all individual-specific structural MRIs (3.0 T), 3D head shape digitization files and computed forward models.

Citations: 0
Global basin-scale mapping of pH and alkalinity in inland waters.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-14 | DOI: 10.1038/s41597-026-07028-2
Meritxell Batalla, Jordi Martínez-Artero, Jordi Catalan

The acidity and buffering capacity of inland waters are essential for biogeochemical processes and impose significant constraints on the distribution of freshwater species. Although many measurements exist worldwide, the data distribution is biased toward more-studied regions, and a global assessment of gradients and their spatial distribution is lacking. In the PHALK dataset, we compile alkalinity and pH values for continental surface waters worldwide, collating chemical data from 18 source databases and 55 scientific publications. A quality-control filter yielded high-quality alkalinity and pH datasets, including 50,916 and 107,896 sites, respectively. Based on the collated dataset and a random forest model, pH and alkalinity in surface waters were modeled worldwide at the basin scale (HydroBASINS v1 sub-basin level 12: 1,034,083 drainage basins) using 23 variables describing basin geological and hydrological characteristics. Each extrapolated value is accompanied by two uncertainty indicators: environmental differentiation, based on the similarity of the basin's environmental conditions to those of basins with measured data, and upscaling confidence, based on the variation in the random forest's internal bootstrap.

Citations: 0
A multimodal dataset of harmful simulated behaviours in high-risk clinical settings using radar.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-13 | DOI: 10.1038/s41597-026-06703-8
Benjamin Tilbury, Miguel Arevalillo-Herráez, Naeem Ramzan

We present a new dataset comprising radar, electrocardiography (ECG), respiration, and inertial measurement signal recordings from 23 individuals as they performed a series of simulated harmful behaviors. This dataset covers a range of actions across various levels of agitation and is especially well-suited for conducting research in health monitoring within high-risk clinical settings, such as inpatient psychiatric units. The dataset's design prioritizes unrestricted, naturalistic behavior capture, providing valuable insights into real-world scenarios and supporting a wide range of applications. Although the dataset was initially designed for patient monitoring, the provided ECG and respiration recordings extend the potential uses of the data to localization and non-contact vital sign measurement.

Citations: 0
A longitudinal dataset of hypertensive osteoporotic fracture patients: treatments and long-term outcomes.
IF 6.9 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-03-13 | DOI: 10.1038/s41597-026-07031-7
Chong Li, Ke Lu, Li-Wen Su, Peng Zhou, Guo-Ji Lin, Jia-Qi Liang, Ya-Qin Gong, Jian Jin, Wen-Rong Xu

Osteoporotic fractures (OPF) and hypertension frequently co-occur in older adults, yet comprehensive datasets integrating clinical, pharmacological, and longitudinal outcome data remain scarce. We describe a longitudinal dataset derived from the Osteoporotic Fracture Registration System at Kunshan Hospital, Jiangsu University, including patients aged ≥ 50 years hospitalized for OPF between 2017 and 2024. A total of 4,782 patients were initially registered. After applying predefined eligibility criteria, 4,325 patients were included in the final analytical cohort. The dataset integrates demographic, clinical, and pharmacologic variables with long-term outcomes on mortality and refracture through deterministic linkage with regional health and mortality registries. Longitudinal antihypertensive prescription records (n = 42,367) were linked via the Kunshan Municipal Health Data Integration Platform, enabling detailed characterization of medication exposure patterns over time. Technical validation, including survival analysis, propensity score methods, and risk prediction modeling, was conducted to assess internal consistency and illustrate potential applications. This structured and de-identified dataset provides a quality-checked resource to support future research in osteoporosis, cardiovascular comorbidity, multimorbidity, and real-world comparative effectiveness studies.

Citations: 0
A Smartphone-based Comprehensive Dataset of Annotated Oral Cavity Images for Enhanced Oral Disease Diagnosis.
IF 6.9 Zone 2 Comprehensive Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date : 2026-03-13 DOI: 10.1038/s41597-026-06954-5
P D Madan Kumar, K Ranganathan, C Lavanya, S Rajeshwari, Anwesh Nayak, Ramesh Kestur, Raghuram Bharadwaj Diddigi, Sushree S Behera

This study introduces a SMARTphone-based, expert-annotated dataset of Oral Mucosa images (SMART-OM), collected to facilitate the development of Artificial Intelligence and Machine Learning (AI/ML) technologies for automated diagnosis of Oral Cancer (OC) and Oral Potentially Malignant Disorders (OPMD). The dataset consists of 2,469 images from 331 subjects across four distinct classes: healthy/normal, variations from normal, OPMD, and OC. The images were captured using Android and iOS smartphone cameras under real-world clinical conditions in visible light. Each image was annotated by expert dental surgeons using the open-source VGG Image Annotator. Detailed patient metadata, including clinical diagnosis, age, sex, and lifestyle-based risk indicators such as smoking, smokeless tobacco usage, alcohol consumption, and areca nut chewing, were recorded via a customized Jotform. The data collection and handling procedures adhered to the ethical guidelines outlined in the Declaration of Helsinki and its amendments for research involving human subjects, with informed consent obtained from each subject. The SMART-OM dataset is intended to advance research and development of AI/ML algorithms for automated oral lesion detection.

Citations: 0