Scientific Data: Latest Articles

Ancient Yi Script Handwriting Sample Repository.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-03918-5
Xiaojuan Liu, Xu Han, Shanxiong Chen, Weijia Dai, Qiuyue Ruan

The ancient Yi script has been in use for over 8000 years; it ranks alongside the Oracle bone, Sumerian, Egyptian, Mayan, and Harappan scripts as one of the six ancient scripts of the world. In this article, we collected handwritten single-character samples of 2922 commonly used ancient Yi characters. Each character was written by 310 people, giving a total of 427,939 valid characters. We also completed continuous handwritten text sampling by 250 writers, with 5 texts per person, covering topics such as Yi astronomy, geography, rituals, and agriculture. During data collection, we proposed an automatic sampling method for ancient Yi script and completed the automatic segmentation and labeling of the handwritten samples. Furthermore, we tested recognition performance on the curated dataset with different deep learning network models. The results show that ancient Yi script has diverse shape structures and rich writing styles, and that the dataset can serve as a benchmark in related fields such as handwritten text recognition and handwritten text generation.

Scientific Data 11(1): 1183 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526026/pdf/
Citations: 0
NPKGRIDS: a global georeferenced dataset of N, P2O5, and K2O fertilizer application rates for 173 crops.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-04030-4
Thu Ha Nguyen, Fiona H M Tang, Giulia Conchedda, Leon Casse, Griffiths Obli-Laryea, Francesco N Tubiello, Federico Maggi

We introduce NPKGRIDS, a new geospatial dataset that provides, for the first time, application rates for all three main plant nutrients, nitrogen (N), phosphorus (P, in terms of phosphorus pentoxide, P2O5) and potassium (K, in terms of potassium oxide, K2O), across 173 crops as of 2020, at a geospatial resolution of 0.05° (approximately 5.6 km at the equator). Development of NPKGRIDS adopted a data fusion approach to integrate crop mask information with eight published datasets of fertilizer application rates, compiled from either georeferenced data or national and subnational statistics. Furthermore, the total applied mass of N, P2O5, and K2O was benchmarked against country-level information from FAO and the International Fertilizer Association (IFA) and validated against data available from National Statistical Offices (NSOs). NPKGRIDS can be used in global modelling and in decision and policy making to help maximize crop yields while reducing environmental impacts.
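The stated correspondence between a 0.05° grid and roughly 5.6 km at the equator can be checked with a short calculation. The sketch below is only an illustration: it assumes a spherical Earth with the WGS 84 equatorial radius, and the function name `cell_width_km` is hypothetical, not part of the NPKGRIDS dataset or its tooling.

```python
import math

# Assumption: spherical Earth using the WGS 84 equatorial radius.
EARTH_EQUATORIAL_RADIUS_KM = 6378.137


def cell_width_km(resolution_deg, latitude_deg=0.0):
    """East-west width of one grid cell, in km, at the given latitude."""
    km_per_degree = 2 * math.pi * EARTH_EQUATORIAL_RADIUS_KM / 360.0
    return resolution_deg * km_per_degree * math.cos(math.radians(latitude_deg))


# Width of a 0.05-degree cell at the equator (about 5.57 km).
print(round(cell_width_km(0.05), 2))
# The same cell is narrower away from the equator, e.g. at 45 degrees latitude.
print(round(cell_width_km(0.05, latitude_deg=45.0), 2))
```

Because cell width shrinks with the cosine of latitude, per-cell totals derived from a regular latitude-longitude grid generally need an area weighting before they are summed over a region.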

Scientific Data 11(1): 1179 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526156/pdf/
Citations: 0
The mRNA and protein datasets after cold stress of red tilapia.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-04025-1
Lanmei Wang, Haoran Yang, Herbert Brightmore Munyaradzia, Wenbin Zhu, Zai-Jie Dong

Cold stress during overwintering is considered the bottleneck of the red tilapia industry. In this study, the water temperature (WT) in the cold (C) group was reduced by 2 °C per day from 20 °C to 8 °C. Transcriptome sequencing of brain (B), gill (G), liver (L), and skin (S) tissues and proteome profiling of G, L, and S tissues were then performed in the C and normal (N; WT: 20 °C) groups. In total, 24 transcriptomes were completed and 168.8 Gb of data were obtained, with more than 5.89 Gb of clean data per sample. A total of 30499 annotation results were obtained, with 3199, 4697, 4393, and 3382 differentially expressed mRNAs in NB_vs_CB, NG_vs_CG, NL_vs_CL, and NS_vs_CS, respectively. Eighteen DIA proteomes were generated and 6341 proteins were identified, with 178, 500, and 166 differentially expressed proteins in NG_vs_CG, NL_vs_CL, and NS_vs_CS, respectively. Our datasets can be reused for identifying key genes and proteins, for joint omics analysis, and for analysing the regulatory mechanisms of low-temperature or cold stress in fish, which will help in understanding these mechanisms and facilitate the molecular selective breeding of cold-resistant fish varieties.
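The numbers in this abstract can be laid out as a small bookkeeping sketch. It is illustrative only: the helper names are hypothetical, and the replicate count per tissue and group is an inference that assumes a balanced design, which the abstract does not state.

```python
def cooling_schedule(start_c, end_c, step_c):
    """Daily water temperatures when cooling by step_c per day."""
    temps = [start_c]
    while temps[-1] - step_c >= end_c:
        temps.append(temps[-1] - step_c)
    return temps


def comparisons(tissues):
    """Normal-vs-cold comparison labels, following the abstract's naming."""
    return ["N{0}_vs_C{0}".format(t) for t in tissues]


schedule = cooling_schedule(20, 8, 2)
print(schedule)           # [20, 18, 16, 14, 12, 10, 8]
print(len(schedule) - 1)  # 6 daily cooling steps

rna_tissues = ["B", "G", "L", "S"]
protein_tissues = ["G", "L", "S"]
print(comparisons(rna_tissues))
print(comparisons(protein_tissues))

# Assumption: balanced design with two groups (N and C) per tissue.
print(24 / (len(rna_tissues) * 2))      # implied transcriptome replicates
print(18 / (len(protein_tissues) * 2))  # implied proteome replicates

# Mean data volume per transcriptome sample, in Gb.
print(round(168.8 / 24, 2))
```

Under the balanced-design assumption, both the transcriptome and proteome counts imply three samples per tissue and group.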

Scientific Data 11(1): 1177 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525570/pdf/
Citations: 0
A high-quality genome of the early diverging tychoplanktonic diatom Paralia guyana.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-03843-7
Jianbo Jian, Feichao Du, Binhu Wang, Xiaodong Fang, Thomas Ostenfeld Larsen, Yuhang Li, Eva C Sonnenschein

The diatom Paralia guyana is a tychoplanktonic microalgal species that represents one of the early diverging diatoms. P. guyana can thrive in both planktonic and benthic habitats, making a significant contribution to the occurrence of red tide events. Although a dozen diatom genomes have been sequenced, the identity of the early diverging diatoms remains elusive. The understanding of the evolutionary clades and mechanisms of ecological adaptation in P. guyana is limited by the absence of a high-quality genome assembly. In this study, the first high-quality genome assembly for the early diverging diatom P. guyana was established using PacBio single-molecule sequencing. The assembled genome has a size of 558.85 Mb, making it the largest diatom genome on record, with a contig N50 size of 26.06 Mb. A total of 27,121 protein-coding genes were predicted in the P. guyana genome, of which 22,904 (84.45%) were functionally annotated. These data and analyses provide innovative genomic resources for tychoplanktonic microalgal species and shed light on the evolutionary origins of diatoms.
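Contig N50 is the summary statistic quoted for this and the other assemblies in the listing. A minimal sketch of how it is commonly computed follows, together with a check of the annotation percentage from the abstract. The contig lengths in the example are made-up toy values, not data from the P. guyana assembly, and the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Largest length L such that sequences of length >= L together
    cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (arbitrary units): total 24, half 12.
# 8 alone is below 12; 8 + 5 = 13 reaches it, so N50 is 5.
print(n50([8, 5, 4, 3, 2, 2]))

# Share of predicted genes with a functional annotation, from the abstract.
print(round(100 * 22904 / 27121, 2))
```

N50 rewards contiguity but says nothing about correctness, which is why assemblies in this listing also report measures such as BUSCO completeness.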

Scientific Data 11(1): 1175 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525933/pdf/
Citations: 0
Chromosome-level genome assembly and annotation of the skinnycheek lanternfish Benthosema pterotum.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-04039-9
Qiaohong Liu, Xiaoying Cao, Lisheng Wu, Huan Wang, Hai Li, Longshan Lin, Shufang Liu, Shaoxiong Ding

Lanternfish not only boast the most abundant biomass among marine fish species but also play a vital role in marine ecosystems. As one of the lanternfish species with the highest global catch, the skinnycheek lanternfish (Benthosema pterotum) is widely distributed in the Indo-Pacific region, playing a pivotal role in the marine biological pump. This study constructed the first chromosome-level genome of B. pterotum using a combination of short-read, PacBio, and Hi-C sequencing technologies. The genome size of B. pterotum is 1,272.53 Mb, with a contig N50 of 810 Kb and a scaffold N50 of 54.49 Mb. More than 99.65% of contigs were successfully anchored onto 24 pseudochromosomes, and 95.7% of BUSCO genes were identified within the genome, demonstrating the high completeness of the genome assembly. A total of 24,934 protein-coding genes were predicted, of which 99.02% were functionally annotated. The successful assembly of a high-quality genome for B. pterotum provides valuable genetic resources for better understanding its biological characteristics and potentially those of all lanternfish species.

Scientific Data 11(1): 1178 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526109/pdf/
Citations: 0
Chromosome level genome assembly of giant freshwater prawn (Macrobrachium rosenbergii).
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-30 | DOI: 10.1038/s41597-024-04016-2
Shiyan Liu, Meihui Li, Chong Han, Shuisheng Li, Jin Zhang, Cheng Peng, Yong Zhang

The giant freshwater prawn (Macrobrachium rosenbergii) has many advantages in aquaculture, such as a fast growth rate, a short breeding cycle, and good nutrition, which make it a freshwater shrimp of high economic value. Herein, high-quality chromosome-level genomes of both female and male prawns were obtained by combining Illumina paired-end sequencing, PacBio single-molecule sequencing, and high-throughput chromosome conformation capture (Hi-C) technologies. For the ZZ male prawn, a final contig assembly of 3118.58 Mb with an N50 length of 956,237 bp was obtained. For the WW female prawn, a final contig assembly of 3333.31 Mb with an N50 length of 1,143,555 bp was obtained. The assembled genome sequences were anchored to 59 chromosomes. Moreover, the W and Z sex chromosomes were assembled, with lengths of 36.23 Mb and 27.33 Mb, respectively. The sequence similarity between the Z and W chromosomes reached 74.90%. This high-quality genome resource will be useful for further molecular breeding and functional genomic research on giant freshwater prawns.

Scientific Data 11(1): 1181 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525972/pdf/
Citations: 0
NEON-SD: A 30-m Structural Diversity Product Derived from the NEON Discrete-Return LiDAR Point Cloud.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-29 | DOI: 10.1038/s41597-024-04018-0
Jianmin Wang, Dennis H Choi, Elizabeth LaRue, Jeff W Atkins, Jane R Foster, Jaclyn H Matthes, Robert T Fahey, Songlin Fei, Brady S Hardiman

Structural diversity (SD) characterizes the volume and physical arrangement of biotic components in an ecosystem, which control critical ecosystem functions and processes. LiDAR data provide detailed 3-D spatial position information on these components and have been widely used to calculate SD. However, the intensive computation of SD metrics from extensive LiDAR datasets is time-consuming and challenging for researchers who lack access to high-performance computing resources. Moreover, a lack of understanding of LiDAR data and algorithms could lead to inconsistent SD metrics. Here, we developed an SD product using the Discrete-Return LiDAR Point Cloud from the NEON Aerial Observation Platform. This product provides SD metrics detailing height, density, openness, and complexity at a spatial resolution of 30 m, aligned to the Landsat grids, for 211 site-years across 45 terrestrial NEON sites from 2013 to 2022. To accommodate ecosystems with different understory heights, it includes three cut-off heights (0.5 m, 2 m, and 5 m). This structural diversity product can enable various applications such as ecosystem productivity estimation and disturbance monitoring.
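To make the idea of gridded structural metrics with a cut-off height concrete, here is a minimal sketch that bins height-normalized LiDAR returns into 30 m cells and summarizes each cell. It is not the NEON-SD algorithm: the function name, the input layout, and the metric definitions (for example, openness as the share of returns below the cut-off) are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def structural_metrics(points, cell_size=30.0, cutoff=2.0):
    """Summarize returns per grid cell.

    points: iterable of (x, y, height_above_ground), all in metres.
    Metric definitions here are illustrative assumptions, not NEON-SD's.
    """
    cells = defaultdict(list)
    for x, y, height in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(height)

    summary = {}
    for key, heights in cells.items():
        # Returns at or above the cut-off are treated as vegetation.
        canopy = [h for h in heights if h >= cutoff]
        summary[key] = {
            "n_returns": len(heights),
            "mean_height": mean(canopy) if canopy else 0.0,
            "max_height": max(canopy) if canopy else 0.0,
            "height_sd": pstdev(canopy) if canopy else 0.0,
            "openness": 1.0 - len(canopy) / len(heights),
        }
    return summary


# Toy returns: three fall in cell (0, 0), one in cell (1, 0).
toy_points = [(1.0, 1.0, 0.3), (2.0, 2.0, 10.0), (5.0, 5.0, 20.0), (31.0, 1.0, 1.0)]
for cell, metrics in sorted(structural_metrics(toy_points).items()):
    print(cell, metrics)
```

Rerunning the same function with `cutoff=0.5` or `cutoff=5.0` shows how the choice of cut-off changes which returns count as vegetation, which is the reason a product would publish several cut-off variants.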

Scientific Data 11(1): 1174 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522374/pdf/
Citations: 0
Chromosome-level genome assembly of Megachile lagopoda (Linnaeus, 1761) (Hymenoptera: Megachilidae).
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-29 | DOI: 10.1038/s41597-024-04028-y
Dan Zhang, Jianfeng Jin, Zeqing Niu, Michael C Orr, Feng Zhang, Rafael R Ferrari, Qingtao Wu, Qingsong Zhou, Wa Da, Arong Luo, Chaodong Zhu

Megachile is one of the largest bee genera, including nearly 1,500 species, but very few chromosome-level assemblies exist for this group or the family Megachilidae. Here, we report the chromosome-level genome assembly of Megachile lagopoda collected from Xizang, China. Using PacBio CLR long reads and Hi-C data, we assembled a genome of 256.83 Mb with 96.08% of the assembly located on 16 chromosomes. Our assembly contains 266 scaffolds, with a scaffold N50 length of 15.6 Mb, and BUSCO completeness of 99.20%. We masked 27.10% (69.61 Mb) of the assembly as repetitive elements, identified 459 non-coding RNAs, and predicted 11,157 protein-coding genes. This high-quality genome of M. lagopoda represents an important step forward for our knowledge of megachilid genomics and bee evolution overall.

Scientific Data 11(1): 1171 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522480/pdf/
Citations: 0
United States Precinct Boundaries and Statewide Partisan Election Results.
IF 5.8 | Zone 2, Multidisciplinary Journals | Q1 MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-10-29 | DOI: 10.1038/s41597-024-04024-2
Brian Amos, Steven Gerontakis, Michael McDonald

We describe the creation and verification of databases of all precinct boundaries used in the United States 2016, 2018, and 2020 November general elections, enhanced with election results for all partisan statewide offices. United States election officials report election results for the smallest geographic reporting unit, known as the precinct. Scholars and practitioners find these election results valuable for numerous use cases. However, without precinct boundaries, these data cannot be augmented with other geographically bound data, such as U.S. Census data. Here we describe the collection of precinct boundary data from state and local election officials, provided variously in GIS formats, as images, as text descriptions, and, in rare cases, verbally. We describe how we verify boundaries with other election data, such as geocoded voter registration files. Our open-source data has appeared in redistricting litigation argued before the United States Supreme Court and has been used by state and local redistricting authorities, media organizations, advocacy groups, scholars, and a vibrant community of mapping enthusiasts.
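One way to picture verifying boundaries against geocoded voter registration files is to test whether each voter's point falls inside the polygon of the precinct they are registered in. The sketch below uses a plain ray-casting test on toy coordinates. It is an assumption-laden illustration, not the authors' workflow: the function names and data layout are hypothetical, and a production check would normally use a GIS library and projected coordinates.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def agreement_rate(voters, precincts):
    """Share of voters whose point lies inside their registered precinct.

    voters: list of ((x, y), precinct_id); precincts: dict id -> polygon.
    """
    hits = sum(
        1
        for point, precinct_id in voters
        if precinct_id in precincts
        and point_in_polygon(point, precincts[precinct_id])
    )
    return hits / len(voters)


# Toy data: two adjacent square precincts and three geocoded voters.
toy_precincts = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
toy_voters = [((1, 1), "A"), ((3, 1), "B"), ((3, 1.5), "A")]
print(agreement_rate(toy_voters, toy_precincts))
```

A low agreement rate for a precinct would flag either a misdrawn boundary or geocoding errors, so in practice the mismatched points themselves are worth inspecting rather than relying on the rate alone.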

我们介绍了美国 2016 年、2018 年和 2020 年 11 月大选中使用的所有选区边界数据库的创建和验证情况,以及所有党派全州办公室的选举结果。美国选举官员以称为选区的最小地理报告方式报告选举结果。学者和从业人员发现,这些选举结果对许多用例都很有价值。但是,如果没有选区界限,这些数据就无法与其他地理数据(如美国人口普查数据)一起使用。在此,我们将介绍从州和地方选举官员处收集选区边界数据的情况,这些数据有时以 GIS 格式、图像、文本描述的形式提供,在极少数情况下也会以口头形式提供。我们还介绍了如何利用其他选举数据(如地理编码选民登记文件)验证选区边界。我们的开源数据曾出现在美国最高法院审理的选区重划诉讼中,并被州和地方选区重划机构、媒体组织、权益团体、学者以及充满活力的制图爱好者社区所使用。
Citations: 0
A global dataset of salmonid biomass in streams.
IF 5.8 Zone 2 Comprehensive Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-10-29 DOI: 10.1038/s41597-024-04026-0
Kyleisha J Foote, James W A Grant, Pascale M Biron

Salmonid fishes are arguably among the most studied fish taxa on Earth, but little is known about their biomass range in many parts of the world. We created a dataset of estimated salmonid biomass using published material on over 1000 rivers, covering 27 countries and 11 species. The dataset, spanning 84 years of data, is the largest known compilation of published studies on salmonid biomass in streams, allowing detailed analyses of differences in biomass by species, region, period, and sampling technique. Production is also recorded for 194 rivers, allowing relationships between biomass and production to be explored. There is scope to expand the list of variables in the dataset, which would help the scientific community develop models to predict salmonid biomass and production, among many other analyses.

Citations: 0