
Scientific Data: Latest Articles

Near telomere-to-telomere genome assembly of the stone loach (Traccatichthys pulcher).
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-20 | DOI: 10.1038/s41597-026-06849-5
Li-Na Du, Zhuo-Cong Wang, Zhuo-Ni Chen, Zhi-Xian Qin, Chen-Hong Li

Traccatichthys pulcher is an ornamental loach species recognized for its vibrant body coloration, characteristic black dorsal fin margin, and iridescent green lateral stripes. To advance genomic research on this species, a high-quality, near telomere-to-telomere (T2T) genome assembly was generated using PacBio HiFi, ONT ultra-long, and Hi-C sequencing technologies. The resulting haplotype-resolved assembly spanned approximately 623.68 Mb, with a contig N50 of 22.9 Mb, and was anchored onto 24 chromosomes. Telomeric sequences were detected at both ends of eight chromosomes and at one end of 13 chromosomes. Twenty-three chromosomes were entirely gapless, while a single gap was identified in the remaining chromosome. The assembly contained 119.1 Mb of repetitive elements, and 23 967 protein-coding genes were annotated. BUSCO analysis indicated high completeness, with 98.6% of conserved genes recovered. This high-quality, near T2T genome assembly offers a valuable and robust genetic resource for investigating molecular mechanisms, evolutionary processes, conservation biology, and selective breeding of T. pulcher.
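
For readers unfamiliar with the contig N50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: sort the contig lengths in descending order and report the length of the contig at which the running total first reaches half of the total assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (any consistent unit)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not the T. pulcher contigs.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))
```

In the toy case, the two longest contigs (40 and 30) together cover at least half of the total length of 100, so the N50 is the length of the second contig.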

Citations: 0
Geo-located attendance data for CITES Conferences of the Parties.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-20 | DOI: 10.1038/s41597-026-06799-y
Daria Blinova, Gayathri Emuru, Rakesh Emuru, Benjamin E Bagozzi

The Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) was adopted in 1975 in an effort to manage the international biodiversity trade. Meetings regulating the implementation of CITES have since been held every 2-3 years with the involvement of diverse stakeholders representing country Party-signatories, non-Party states, international organizations, private sector interests, and NGOs. These meetings and their outcomes are of interest to environmental science scholars, social scientists, journalists, and advocacy organizations. Yet, no usable data on meeting attendees and their details exists. This limits researchers' and advocates' abilities to study or track CITES meeting attendance patterns, and their associated causes and effects. Applying NLP techniques to PDF attendance rosters, we build the first CITES attendee-level dataset, covering 20,987 attendee records for all meetings to date. The dataset contains rich information on attendee geo-locations, names, affiliations and genders, and variables associated with attendee delegations, among others. Summaries and validations underscore the promise of our data and suggest new avenues for research on international wildlife conservation.

Citations: 0
A dayside aurora dataset from the Global-scale Observations of the Limb and Disk mission.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06884-2
Jordan Holmes, Scott L England

We present a comprehensive dataset of dayside auroral emissions observed by the Global-scale Observations of the Limb and Disk (GOLD) mission from October 2018 to June 2025. The dataset contains over 47,000 unique scans of the northern aurora in three far-ultraviolet spectral channels (OI 135.6 nm, NI 149.3 nm, and N₂ LBH), estimates of the background dayglow, binary masks of auroral locations, and other corresponding spatial and temporal metadata. The OI 135.6 nm, NI 149.3 nm, and N₂ LBH emissions are far-ultraviolet signatures of electron-impact excitation in the upper atmosphere and therefore serve as tracers of auroral electron precipitation. From this dataset, auroral pixels are directly available with no dayglow contamination of the emissions. Auroral signals are extracted through a multi-stage processing pipeline inspired by computer vision and machine learning techniques. This dataset provides a consistent view of the dayside aurora over the North American and Atlantic sectors, enabling studies of auroral dynamics with GOLD observations.

Citations: 0
Integrated Network Solutions in Government Hiring Trends (INSIGHT+).
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06825-z
William G Resh, Keunyoung Eli Lee, Yi Ming, Xinyao Andy Xia, Nicole Dias, Kecheng Anderson Liu, Darren Cao, William Huh

The Integrated Network Solutions in Government Hiring Trends (INSIGHT+) database supports research on the U.S. federal civil service labor market. As of September 2024, the federal workforce included over 2.4 million civilian employees spanning more than 400 occupations, with over 85% located outside the Washington, D.C. metropolitan area. INSIGHT+ integrates micro-, meso-, and macro-level statistics from a range of governmental and academic sources. Our database covers workforce dynamics from 2018 to 2023. It allows granular multivariate analyses and accommodates agency-, location-, and institution-specific tables that can be matched in the future with further detailed data on agency outputs such as discretionary grants, contracts, contingent liabilities, and various economic impacts. This article outlines the current capabilities and significance of INSIGHT+ for civil service labor market research, emphasizing ongoing enhancements to enrich analyses of civil service institutions.

Citations: 0
Global Urban Tree Species (GUTS): Revealing tree species diversity across the world's urban areas.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06868-2
Xudong Yang, Pengbo Yan, Jing Jin, Xinyi Liu, Jun Yang

Diverse tree communities can bolster urban ecosystem resilience and provide vital ecosystem services. However, existing urban tree species datasets have limited geographic coverage and contain inadequate attributes. To address those gaps, we developed the Global Urban Tree Species (GUTS) dataset by integrating data from literature, biodiversity databases, and other open sources. The new dataset encompasses 159,845 occurrence records of 10,094 tree species in 8,349 cities and 139 countries. Among them, 109,879 records were confirmed from urban areas, representing 11.18% of global tree species diversity. The dataset has been validated using multiple methods. GUTS fills critical data gaps and provides a foundation for future research and management of global urban biodiversity.

Citations: 0
MmodalFire: A Continuous Multimodal Dataset Comprising Video and Physical Sensing Data for Detecting Indoor Fires.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06810-6
Yang Jia, Yihan Guo, Yetang Chen, Xinmeng Zhang, Gang Wang, Qixing Zhang

Because no multimodal dataset was previously available for fire detection research, we developed the MmodalFire multimodal fire detection dataset for training and evaluation of indoor fire detection algorithms. This publicly available dataset includes video and physical sensing data for fire detection use. The dataset comprises 65 videos that simultaneously captured six physical sensing data types, including smoke density, temperature, and infrared and ultraviolet radiation at 5 μm, 4.4 μm, and 3.8 μm. All data were acquired using monitoring cameras and fire sensors deployed as part of a fire detection system that was carefully designed to cover all possible variations, including different wind velocities, illumination conditions, common interference types, and occlusions. All videos and corresponding physical sensing data sequences are labeled as either fire or non-fire sequences. Using the MmodalFire dataset, we evaluated four basic baseline fusion models and the proposed dynamic fusion models to provide a reference for multimodal fire detection research under controlled laboratory settings, promoting research on multimodal fire detection algorithms using controlled-setting data.

Citations: 0
Data from long-term experiments in temperate croplands to evaluate soil organic carbon models.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06863-7
Kenji Fujisaki, Fabien Ferchaud, Hugues Clivot, Elisa Bruni, Bertrand Guenet, Christian Pichot, Antoine Versini, François Baudin, Antonio Bispo, Philippe Peylin, Manuel P Martin, Johannes L Jensen, Jørgen Eriksen, Claire Chenu, Andrew S Gregory, Margaret J Glendining, Ines Merbach, Nicolas Beaudoin, Bruno Mary, Alain Mollier, Gilles Tison, Christophe Montagnier, Abad Chabbi, Françoise Vertès, Alice Cadéro, Anne-Isabelle Graux, Sylvain Pellerin, Florent Levavasseur, Manon Gilles, Thierry Morvan, Camille Resseguier, Luis Milesi, Alicia Irizar, Adriàn Andriulo, Marie-Noël Mistou, Arnaud Butier, Michel Bertrand, Bénédicte Autret, Marie-Hélène Jeuffroy, Gilles Grandeau, Thierry Doré, Vincent Cellier, Alain Berthier, Sébastien Darras, Guillaume Audebert, Ludovic Pasquier, Fabien Ecalle, Antoine Savoie, Marcus Schiedung, Christopher Poeplau, Nadia I Maaroufi, Thomas Kätterer, Martin A Bolinder, Jonathan Sanderman, Pierre Barré

Soil organic carbon (SOC) models need independent evaluation against field measurements, but such measurements are rarely publicly available or harmonized. In this study, we collected and shared data from 167 agronomic treatments in 34 agronomic long-term experiments (LTEs) located in temperate croplands, allowing the evaluation of several soil organic C models such as RothC, Century, AMG, MIMICS, ICBM, Millennial, and CTOOL. The dataset includes climate data, soil properties, C inputs from crops (n = 4588 records) and organic amendments, irrigation data, monthly soil cover, as well as SOC stock measurements in the topsoil layer (n = 1328 records). Climate, soil moisture, and soil temperature data were extracted from daily climate databases. Carbon inputs from crops were calculated from observed yields and harvest index, with some harvest index values estimated, combined with crop allometric coefficients from the literature. Descriptions of the LTEs, agronomic treatments, methodological metadata, and part of the code accompany the dataset. The dataset can be reused to evaluate single SOC models, or to evaluate an ensemble of models.
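
The abstract states that crop carbon inputs were derived from yields, harvest index, and allometric coefficients, without giving the formula. The sketch below shows one common textbook form of such a calculation and is an assumption, not the authors' method: total aboveground biomass is taken as yield divided by harvest index, roots are scaled from aboveground biomass by a root-to-shoot ratio, and biomass is converted to carbon with a fixed carbon fraction. The function name, the parameters `root_shoot_ratio`, `residue_returned`, and `carbon_fraction`, and all numeric values are hypothetical.

```python
def crop_carbon_input(yield_dm, harvest_index, root_shoot_ratio,
                      residue_returned=1.0, carbon_fraction=0.45):
    """Estimate crop C input to soil, in the same mass unit as yield_dm.

    Assumed (hypothetical) scheme, not taken from the paper:
      aboveground biomass = yield / harvest index
      residue             = aboveground biomass - yield
      roots               = aboveground biomass * root-to-shoot ratio
    """
    if not 0 < harvest_index <= 1:
        raise ValueError("harvest_index must be in (0, 1]")
    aboveground = yield_dm / harvest_index
    residue = aboveground - yield_dm
    roots = aboveground * root_shoot_ratio
    # Only the returned share of residues, plus roots, enters the soil.
    return carbon_fraction * (residue * residue_returned + roots)


# Hypothetical example: 8 t dry matter/ha yield, harvest index 0.5,
# root-to-shoot ratio 0.25, all residues returned to the field.
print(round(crop_carbon_input(8.0, 0.5, 0.25), 2))
```

Setting `residue_returned` below 1.0 would represent treatments where part of the straw is exported from the field.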

Citations: 0
A Forty-year regional-scale dataset of shoreline change and nearshore wave conditions in Southeast Australia.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06859-3
Yongjing Mao, Kilian Vos, Laura Cagigal, Valentine Bodin, Mitchell D Harley, Kristen D Splinter

Coastal erosion at wave-dominated beaches, primarily driven by nearshore wave dynamics, poses a substantial challenge for coastal management. While existing datasets from individual beaches have improved our understanding of site-specific coastal morphodynamics, there is a growing demand for regional-scale datasets to understand and predict regional shoreline responses to climate variability. To address this, we present a combined shoreline and nearshore wave dataset for the wave-dominated coast of southeast Australia, comprising over 8,000 cross-shore transects at 100 m spacing for over 300 beaches. For each transect, satellite-derived shoreline positions (1984-2024) and beach-face slopes are provided, alongside hourly nearshore wave parameters (1979-2024) extracted at the 10 m depth contour. Shoreline data have been validated using available field surveys, and wave data have been assessed against offshore and nearshore buoy observations. This dataset provides a valuable resource for developing regional-scale understanding of shoreline variability along wave-dominated and embayed coastlines.
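
One simple first-pass use of a per-transect shoreline time series like the one described here is to fit a linear trend to shoreline position over time. The sketch below is an illustration under stated assumptions, not the authors' workflow: it computes an ordinary least-squares slope in pure Python, and the function name `linear_trend` and the toy values are hypothetical.

```python
def linear_trend(times, positions):
    """Ordinary least-squares slope of positions against times."""
    n = len(times)
    if n != len(positions) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_t = sum(times) / n
    mean_x = sum(positions) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, positions))
    return sxy / sxx


# Hypothetical transect: cross-shore shoreline position (m) by year.
years = [1984, 1994, 2004, 2014, 2024]
shoreline_m = [50.0, 48.0, 46.0, 44.0, 42.0]
print(linear_trend(years, shoreline_m))  # slope in m per year
```

With positions measured seaward along the transect, a negative slope would indicate landward retreat over the fitted period.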

Citations: 0
Machine learning estimates for G20 subnational urban GHG emissions from 2000-2020.
IF 6.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2026-02-19 | DOI: 10.1038/s41597-026-06691-9
Ying Yu, Xuewei Wang, Diego Manya, Angel Hsu

Reliable, comparable greenhouse gas (GHG) emissions data at the subnational level remain scarce, despite growing expectations for cities and regions to lead on climate action. Inconsistent reporting, methodological variation, and limited coverage of self-reported inventories hinder efforts to track progress and guide mitigation opportunities. To address these challenges, we develop a machine learning (ML) framework to estimate annual Scope 1 and 2 CO2-equivalent emissions for subnational jurisdictions in G20 countries from 2000 to 2020. Our approach integrates publicly available geospatial, socioeconomic, and environmental data with self-reported inventories where available, and aligns predictions with subnational administrative boundaries. Compared to traditional downscaling or proxy-based approaches, our model improves spatial relevance and predictive performance while capturing locally specific emission drivers. This globally consistent, administratively-aligned dataset can serve as a baseline for assessing climate progress, especially in data-poor or inconsistent reporting contexts, and supports more targeted, data-informed policy decisions for urban and regional decarbonization.

Citations: 0
Proteomic dataset of MECP2-deficient and wild-type human brain organoids under spaceflight and ground conditions.
IF 6.9 Zone 2 Multidisciplinary Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-02-19 DOI: 10.1038/s41597-026-06881-5
Aline M A Martins, Diogo G Biagi, Blake L Tsu, Juliana de Saldanha da Gama Fischer, Luisa Bulcao Vieira Coelho, Paulo Costa Carvalho, Alysson R Muotri

This dataset contains mass spectrometry-based proteomic profiles of human brain organoids cultured on Earth for 30 days, then maintained aboard the International Space Station (ISS) for an additional 30 days, with matched ground controls that remained on Earth for the equivalent duration. Brain organoids were derived from induced pluripotent stem cell (iPSC) lines: Q83X, carrying a nonsense mutation in MECP2 from a male patient with Rett syndrome, and WT83, derived from the patient's unaffected familial control. Rett syndrome is a severe X-linked neurodevelopmental disorder caused by loss-of-function mutations in MECP2, which encodes Methyl-CpG-binding protein 2, a critical epigenetic regulator. The spaceflight experiment was conducted using cryovials with automated control maintenance. Deep proteome coverage with approximately 6,000 protein groups was inferred from 56,639 peptides. This dataset provides unique insights into how the space environment affects human neural tissue and MECP2-related pathologies, serving as a resource for understanding spaceflight-induced neurological changes and as a steppingstone for future space missions.

Citations: 0