
Scientific Data: latest articles

The magic, memory, and curiosity fMRI dataset of people viewing magic tricks.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-10-01 DOI: 10.1038/s41597-024-03675-5
Stefanie Meliss, Cristina Pascua-Martin, Jeremy I Skipper, Kou Murayama

Videos of magic tricks offer many opportunities to study the human mind. They violate the viewer's expectations, causing prediction errors, misdirect attention, and elicit epistemic emotions. Herein we describe and share the Magic, Memory, and Curiosity (MMC) Dataset, in which 50 participants watched 36 magic tricks filmed and edited specifically for functional magnetic resonance imaging (fMRI) experiments. The MMC Dataset includes a contextual incentive manipulation, curiosity ratings for the magic tricks, and incidental memory performance tested a week later. We additionally measured individual differences in working memory and constructs relevant to motivated learning. fMRI data were acquired before, during, and after learning. We show that both behavioural and fMRI data are of high quality, as indicated by basic validation analyses: variance decomposition for the behavioural data, and intersubject correlation and seed-based functional connectivity for the fMRI data. The richness and complexity of the MMC Dataset will allow researchers to explore dynamic cognitive and motivational processes from various angles during task and rest.
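
Intersubject correlation, one of the validation analyses named in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the common leave-one-out variant (each participant's time series is correlated with the average of all other participants) and uses synthetic data for a single region. The function names `pearson` and `leave_one_out_isc` and all numbers are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def leave_one_out_isc(series):
    """Correlate each subject's time series with the mean of all others.

    `series` is a list of per-subject time series (same length) for one
    voxel or region. Returns one correlation per subject.
    """
    n_subjects = len(series)
    totals = [sum(column) for column in zip(*series)]
    result = []
    for subject in series:
        # Mean of the remaining subjects at every time point.
        others = [(t - v) / (n_subjects - 1) for t, v in zip(totals, subject)]
        result.append(pearson(subject, others))
    return result


# Synthetic example (assumption): a shared stimulus-driven signal plus
# independent noise for each of 5 hypothetical participants.
random.seed(0)
shared = [random.gauss(0, 1) for _ in range(200)]
toy = [[s + 0.5 * random.gauss(0, 1) for s in shared] for _ in range(5)]
isc_values = leave_one_out_isc(toy)
print([round(v, 2) for v in isc_values])
```

High values indicate that a region responds to the stimulus in a similar way across viewers, which is why the measure is often used as a data-quality check for naturalistic tasks such as video watching.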

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11445505/pdf/
Citations: 0
Madagascar rural observatory surveys, a longitudinal dataset on household living conditions 1995-2015.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-30 DOI: 10.1038/s41597-024-03879-9
Velomalala Solo Andrianjafindrainibe, Nicole Andrianirina, Florent Bédécarrats, Isabelle Droy, Jean-Luc Dubois, Jeanne de Montalembert, Bako Nirina Rabevohitra, Rolland Rafidimanana, Patrick Rasolofo, Raphaël Ratovoarinony, Lalasoa Anjarafara Onivola Ratsaramiarina, Jean Dieudonné Ravelonandro, Voahirana Razanamavo, Mireille Razafindrakoto, Bezaka Rivolala, François Roubaud, Camille Saint-Macary

A Rural Observatory System (ROS) was established in Madagascar to address the lack of socioeconomic data on rural areas. It collected, analyzed, and disseminated data to help formulate and evaluate development policies. From 1995 to 2015, the ROS surveyed a total of 26 areas. The ROS methodology involved annual household panel surveys using consistent questionnaires supplemented by modules covering new themes. Qualitative community surveys were used to understand local features and dynamics. The site selection combined quantitative and qualitative insights to reflect the diversity of Madagascar's rural challenges. Quality control was comprehensive, with measures such as limiting the number of interviews each surveyor conducted per day and daily field supervision. By making these data, which cover 21 consecutive years, available along with documentation, metadata, and code with analysis examples, we aim to facilitate their discovery, assessment, and understanding by researchers, policymakers, and social organizations. To our knowledge, these are the only available data for an in-depth analysis of the situation and trends in the rural areas of Madagascar.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442863/pdf/
Citations: 0
Chromosome-level genome assembly and annotation of Clanis bilineata tsingtauica Mell (Lepidoptera: Sphingidae).
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-30 DOI: 10.1038/s41597-024-03853-5
Yulu Yan, Ke Zhao, Longwei Yang, Nan Liu, Yufei Xu, Junyi Gai, Guangnan Xing

The soybean hawkmoth Clanis bilineata tsingtauica Mell (Lepidoptera, Sphingidae; CBT), one of the main leaf-chewing pests of soybeans, has recently gained popularity as an edible insect in China due to its high nutritional value. However, a high-quality genome of CBT has not been available, which greatly limits further research. In the present study, we assembled a high-quality chromosome-level genome of CBT for the first time, using PacBio HiFi reads and Hi-C technologies. The size of the assembled genome is 477.45 Mb with a contig N50 length of 17.43 Mb. After Hi-C scaffolding, the contigs were anchored to 29 chromosomes with a mapping rate of 99.61%. The Benchmarking Universal Single-Copy Orthologues (BUSCO) completeness value is 99.49%. The genome contains 252.16 Mb of repeat elements and 14,214 protein-coding genes. In addition, chromosomal synteny analysis showed that the genome of CBT has strong synteny with that of Manduca sexta. In conclusion, this high-quality genome provides an important resource for future studies of CBT and contributes to the development of integrated pest management strategies.
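
The contig and scaffold N50 values quoted in the genome-assembly abstracts on this page follow the usual definition: the length L such that sequences of length L or longer contain at least half of the assembled bases. A minimal sketch of that calculation, using made-up contig lengths rather than any data from the papers (the function name `n50` is illustrative):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running total first reaches half of the
    assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical); total = 300, half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half, so N50 is 70
```

A larger N50 at a fixed assembly size means the assembly is split into fewer, longer pieces, which is why it is reported alongside BUSCO completeness as a contiguity measure.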

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11443141/pdf/
Citations: 0
Chromosome-level genome assembly of the bay scallop Argopecten irradians.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-28 DOI: 10.1038/s41597-024-03904-x
Denis Grouzdev, Emmanuelle Pales Espinosa, Stephen Tettelbach, Sarah Farhat, Arnaud Tanguy, Isabelle Boutet, Nadège Guiglielmoni, Jean-François Flot, Harrison Tobi, Bassem Allam

The bay scallop, Argopecten irradians, is a species of major commercial, cultural, and ecological importance. It is endemic to the eastern coast of the United States, but has also been introduced to China, where it supports a significant aquaculture industry. Here, we provide an annotated chromosome-level reference genome assembly for the bay scallop, assembled using PacBio and Hi-C data. The total genome size is 845.9 Mb, distributed over 1,503 scaffolds with a scaffold N50 of 44.3 Mb. The majority (92.9%) of the assembled genome is contained within the 16 largest scaffolds, corresponding to the 16 chromosomes confirmed by Hi-C analysis. The assembly also includes the complete mitochondrial genome. Approximately 36.2% of the genome consists of repetitive elements. The BUSCO analysis showed a completeness of 96.2%. We identified 33,772 protein-coding genes. This genome assembly will be a valuable resource for future research on evolutionary dynamics and adaptive mechanisms, and will support genome-assisted breeding, contributing to the conservation and management of this iconic species in the face of environmental and pathogenic challenges.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11439060/pdf/
Citations: 0
The W2024 database of the water isotopologue H₂¹⁶O.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-28 DOI: 10.1038/s41597-024-03847-3
Tibor Furtenbacher, Roland Tóbiás, Jonathan Tennyson, Robert R Gamache, Attila G Császár

The rovibrational spectrum of the water molecule is the crown jewel of high-resolution molecular spectroscopy. While its significance in numerous scientific and engineering applications and the challenges behind its interpretation have been well known, the extensive experimental analysis performed for this molecule, from the microwave to the ultraviolet, is admirable. To determine empirical energy levels for H₂¹⁶O, this study utilizes an improved version of the MARVEL (Measured Active Rotational-Vibrational Energy Levels) scheme, which now takes into account multiplet constraints and first-principles energy-level splittings. This analysis delivers 19 027 empirical energy values, with individual uncertainties and confidence intervals, utilizing 309 290 transition wavenumbers collected from 189 (mostly experimental) data sources. Relying on these empirical, as well as some computed, energies and first-principles intensities, an extensive composite line list, named CW2024, has been assembled. The CW2024 dataset is compared to lines in the canonical HITRAN 2020 spectroscopic database, providing guidance for future experimental investigations.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11439062/pdf/
Citations: 0
Chromosome-scale and haplotype-resolved genome assembly of the autotetraploid Misgurnus anguillicaudatus.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-28 DOI: 10.1038/s41597-024-03891-z
Bing Sun, Qingshan Li, Yihui Mei, Yunbang Zhang, Yuxuan Zheng, Yuwei Huang, Xinxin Xiao, Jianwei Zhang, Gao Jian, Xiaojuan Cao

In nature, diploids and tetraploids are two common outcomes of ploidy evolution. Misgurnus anguillicaudatus (mud loach) is a remarkable fish species that exhibits both diploid and tetraploid forms. However, reconstructing the four haplotypes of its autotetraploid genome remains unresolved. Here, we generated the first haplotype-resolved, chromosome-level genome of autotetraploid M. anguillicaudatus with a size of 4.76 Gb, contig N50 of 6.78 Mb, and scaffold N50 of 44.11 Mb. We identified approximately 2.9 Gb (61.03% of the genome) of repetitive sequences and predicted 91,485 protein-coding genes. Moreover, allelic gene expression levels indicated the absence of significantly dominant haplotypes within the autotetraploid loach genome. This genome will provide a valuable biological model for unraveling the mechanisms of polyploid formation and evolution and of adaptation to environmental changes, and will benefit aquaculture applications and biodiversity conservation.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11438953/pdf/
Citations: 0
1.5 million materials narratives generated by chatbots.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-28 DOI: 10.1038/s41597-024-03886-w
Yang Jeong Park, Sung Eun Jerng, Sungroh Yoon, Ju Li

The advent of artificial intelligence (AI) has enabled a comprehensive exploration of materials for various applications. However, AI models often prioritize frequently encountered material examples in the scientific literature, limiting the selection of suitable candidates based on inherent physical and chemical attributes. To address this imbalance, we generated a dataset consisting of 1,453,493 natural language-material narratives from the OQMD, Materials Project, JARVIS, and AFLOW2 databases, based on ab initio calculation results that are more evenly distributed across the periodic table. The generated text narratives were then scored by both human experts and GPT-4 on three rubrics: technical accuracy, language and structure, and relevance and depth of content. The two sets of scores were similar, with human-scored depth of content lagging the most. The integration of multimodal data sources and large language models holds immense potential for AI frameworks to aid the exploration and discovery of solid-state materials for specific applications of interest.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11439064/pdf/
Citations: 0
Assessing semantic interoperability in environmental sciences: variety of approaches and semantic artefacts.
IF 5.8 Zone 2 (multidisciplinary journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03669-3
Cristina Di Muri, Martina Pulieri, Davide Raho, Alexandra N Muresan, Andrea Tarallo, Jessica Titocci, Enrica Nestola, Alberto Basset, Sabrina Mazzoni, Ilaria Rosati

The integration and reuse of digital research products can only be ensured through the adoption of machine-actionable (meta)data standards enriched with semantic artefacts. This study compiles 540 semantic artefacts in environmental sciences to: i. examine their coverage in scientific domains and topics; ii. assess key aspects of their FAIRness; and iii. evaluate management and governance concerns. The analyses showed that the majority of semantic artefacts concern the terrestrial biosphere domain, and that a small portion of the total failed to meet the FAIR principles. For example, 5.5% of semantic artefacts were not available in semantic catalogues, 8% were not built with standard model languages and formats, 24.6% were published without usage licences, and 22.4% were published without version information or with divergent versions across the catalogues in which they were available. This investigation discusses common semantic practices, outlines existing gaps and suggests potential solutions to address semantic interoperability challenges in some of the resources originally designed to guarantee it.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437166/pdf/
Citations: 0
NPreciSe - An Automated Satellite Precipitation Product Assessment Tool.
IF 5.8 Zone 2 Multidisciplinary Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03877-x
Malarvizhi Arulraj, Veljko Petković, Susan Wen, Ralph R Ferraro, Huan Meng

Satellite-based Quantitative Precipitation Estimates (QPE) are indirect estimates of precipitation rates and are often prone to errors, so the associated uncertainties need to be characterized before the estimates are used in application-specific studies. Moreover, multiple satellite-based QPE products are offered by different agencies, each with its own specifications, formats and requirements, which makes the products' uncertainties hard to understand. This manuscript presents a standardized validation system named NPreciSe (NOAA Satellite-based Precipitation Validation System), which assesses the performance of satellite-based precipitation products in near real-time over the continental United States. NPreciSe is coupled with a user-interactive web platform and built with open-source software in Python. It is structured to help (1) end-users determine the best satellite QPE for their specific application, and (2) algorithm developers identify systematic biases in QPE retrievals. The manuscript presents the capabilities of NPreciSe, discusses the methodology adopted in developing the standardized validation system, and introduces the web portal.

Citations: 0
European Union crop map 2022: Earth observation's 10-meter dive into Europe's crop tapestry.
IF 5.8 Zone 2 Multidisciplinary Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03884-y
Babak Ghassemi, Emma Izquierdo-Verdiguier, Astrid Verhegghen, Momchil Yordanov, Guido Lemoine, Álvaro Moreno Martínez, Davide De Marchi, Marijn van der Velde, Francesco Vuolo, Raphaël d'Andrimont

To provide the information needed for detailed monitoring of crop types across the European Union (EU), we present an advanced 10-metre resolution map for the EU and Ukraine with 19 crop types for 2022, updating the 2018 version. The methodology combined Earth Observation (EO) data with in-situ data from Eurostat's Land Use and Coverage Area Frame Survey (LUCAS) 2022 and included 134,684 LUCAS Copernicus polygons, Sentinel-1 and Sentinel-2 satellite imagery, land surface temperature and a digital elevation model. Based on these data, two classification layers were developed using a Random Forest machine learning approach: a primary map and a gap-filling map to address cloud-covered gaps. The combined maps, covering 27 EU countries, show an overall accuracy of 79.3% for seven major land cover classes and 70.6% for all 19 crop types. The trained model was used to derive the 2022 map for Ukraine, demonstrating its robustness even in regions without labelled samples for model training.

Citations: 0