
Scientific Data: Latest Articles

De novo transcriptome analysis of the Indian squid Uroteuthis duvaucelii (Orbigny, 1848) from the Indian Ocean.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-16, DOI: 10.1038/s41597-024-04112-3
Nisha Krishnan, Sandhya Sukumaran, V G Vysakh, Wilson Sebastian, Anjaly Jose, Neenu Raj, A Gopalakrishnan

Cephalopods have dominated the oceans for hundreds of millions of years and are unquestionably at the peak of molluscan evolution. The development of a large brain and a sophisticated sensory system contributed significantly to their success. They are therefore considered the best example of convergent evolution and have attracted the attention of scientists from various disciplines of biology. The aim of the present study is to construct a reference transcriptome for the Indian squid Uroteuthis duvaucelii to gain insights into cephalopod evolution and enrich the existing cephalopod database. Around 72 million short Illumina reads were generated from five different tissues (brain, eye, gill, heart and gonads) and assembled using the Trinity assembler. About 26,230 protein-coding sequences were annotated from the assembled transcripts. The BUSCO completeness of the assembly was 71.71% against the Mollusca_Odb10 gene set. KEGG and REACTOME pathway analyses revealed that U. duvaucelii shares many genes and pathways with higher vertebrates.

Scientific Data 11(1), 1236 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569149/pdf/
Citations: 0
Temporal single-cell RNA sequencing dataset of gastroesophagus development from embryonic to post-natal stages.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-16, DOI: 10.1038/s41597-024-04081-7
Pon Ganish Prakash, Naveen Kumar, Rajendra Kumar Gurumurthy, Cindrilla Chumduri

Gastroesophageal disorders and cancers impose a significant global burden. Particularly, the prevalence of esophageal adenocarcinoma (EAC) has increased dramatically in recent years. Barrett's esophagus, a precursor of EAC, features a unique tissue adaptation at the gastroesophageal squamo-columnar junction (GE-SCJ), where the esophagus meets the stomach. Investigating the evolution of GE-SCJ and understanding dysregulation in its homeostasis are crucial for elucidating cancer pathogenesis. Here, we present the technical quality of the comprehensive single-cell RNA sequencing (scRNA-seq) dataset from mice that captures the transcriptional dynamics during the development of the esophagus, stomach and the GE-SCJ at embryonic, neonatal and adult stages. Through integration with external scRNA-seq datasets and validations using organoid and animal models, we demonstrate the dataset's consistency in identified cell types and transcriptional profiles. This dataset will be a valuable resource for studying developmental patterns and associated signaling networks in the tissue microenvironment. By offering insights into cellular programs during homeostasis, it facilitates the identification of changes leading to conditions like metaplasia and cancer, crucial for developing effective intervention strategies.

Scientific Data 11(1), 1238 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569200/pdf/
Citations: 0
Future land use maps for the Netherlands based on the Dutch One Health Shared Socio-economic Pathways.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-16, DOI: 10.1038/s41597-024-04059-5
Martha Dellar, Gertjan Geerling, Kasper Kok, Peter M van Bodegom, Gerard van der Schrier, Maarten Schrama, Eline Boelee

To enable detailed study of a wide variety of future health challenges, we have created future land use maps for the Netherlands for 2050, based on the Dutch One Health Shared Socio-economic Pathways (SSPs). This was done using the DynaCLUE modelling framework. Future land use is based on altitude, soil properties, groundwater, salinity, flood risk, agricultural land price, distance to transport hubs and climate. We also account for anticipated demand for different land use types, historic land use changes and potential spatial restrictions. These land use maps can be used to model many different health risks to people, animals and the environment, such as disease, water quality and pollution. In addition, the Netherlands can serve as an example for other rapidly urbanising deltas where many of the health risks will be similar.

Scientific Data 11(1), 1237 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569152/pdf/
Citations: 0
Atmospheric new particle formation identifier using longitudinal global particle number size distribution data.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-16, DOI: 10.1038/s41597-024-04079-1
Simonas Kecorius, Leizel Madueño, Mario Lovric, Nikolina Racic, Maximilian Schwarz, Josef Cyrys, Juan Andrés Casquero-Vera, Lucas Alados-Arboledas, Sébastien Conil, Jean Sciare, Jakub Ondracek, Anna Gannet Hallar, Francisco J Gómez-Moreno, Raymond Ellul, Adam Kristensson, Mar Sorribas, Nikolaos Kalivitis, Nikolaos Mihalopoulos, Annette Peters, Maria Gini, Konstantinos Eleftheriadis, Stergios Vratolis, Kim Jeongeun, Wolfram Birmili, Benjamin Bergmans, Nina Nikolova, Adelaide Dinoi, Daniele Contini, Angela Marinoni, Andres Alastuey, Tuukka Petäjä, Sergio Rodriguez, David Picard, Benjamin Brem, Max Priestman, David C Green, David C S Beddows, Roy M Harrison, Colin O'Dowd, Darius Ceburnis, Antti Hyvärinen, Bas Henzing, Suzanne Crumeyrolle, Jean-Philippe Putaud, Paolo Laj, Kay Weinhold, Kristina Plauškaitė, Steigvilė Byčenkienė

Atmospheric new particle formation (NPF) is a naturally occurring phenomenon, during which high concentrations of sub-10 nm particles are created through gas-to-particle conversion. NPF is observed in multiple environments around the world. Although it has an observable influence on annual total and ultrafine particle number concentrations (PNC and UFP, respectively), only a limited number of epidemiological studies have investigated whether these particles are associated with adverse health effects. One plausible reason for this limitation may be the absence of NPF identifiers in UFP and PNC data sets. Until recently, regional NPF events were usually identified manually from particle number size distribution contour plots. Identification of NPF across multi-annual and multiple-station data sets remained a tedious task. In this work, we introduce a regional NPF identifier, created using an automated, machine-learning-based algorithm. The regional NPF event tag was created for 65 measurement sites globally, covering the period from 1996 to 2023. The discussed data set can be used in future studies related to regional NPF.

Scientific Data 11(1), 1239 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569151/pdf/
Citations: 0
Chromosome-level genome assembly and annotation of the Patagonian toothfish Dissostichus eleginoides.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-16, DOI: 10.1038/s41597-024-04119-w
Seung Jae Lee, Minjoo Cho, Jinmu Kim, Eunkyung Choi, Soyun Choi, Sangdeok Chung, Jaebong Lee, Jeong-Hoon Kim, Hyun Park

The Patagonian toothfish (Dissostichus eleginoides) belongs to the class Actinopterygii and the suborder Notothenioidei, and lives in cold waters in the Southern Hemisphere. We performed assembly and annotation, integrating Illumina short-read sequencing for polishing, PacBio long-read sequencing for contig-level assembly, and Hi-C sequencing technology to obtain a high-quality chromosome-level genome assembly. The final assembly consisted of 495 scaffolds, with a genome size of 844.7 Mbp and an N50 length of 36 Mbp. Among these, we confirmed that 24 scaffolds exceeded 10 Mbp and classified them as chromosome-level. BUSCO completeness was over 97%. A total gene set of 32,224 genes was identified. Furthermore, we analyzed the presence of AFGP genes, classified into Antarctic and sub-Antarctic categories through phylogenetic analysis. This study provides a useful resource for the genomic analysis of Patagonian toothfish and genetic insights into the comparison with Antarctic fishes.
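
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the paper or its data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Invented toy scaffold lengths in Mbp (hypothetical, not the real assembly).
toy_scaffolds = [40, 36, 30, 10, 5, 3]
print(n50(toy_scaffolds))  # 36
```

In the toy input the total is 124 Mbp, and the two longest scaffolds (40 + 36 = 76 Mbp) already cover more than half of it, so the N50 is 36 Mbp.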

Scientific Data 11(1), 1240 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569150/pdf/
Citations: 0
Chromosome-level genome assembly of the smallscale yellowfin (Plagiognathops microlepis).
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-15, DOI: 10.1038/s41597-024-04105-2
Yangyang Liang, Huijuan Liu, Wenxuan Lu, Jing Li, Ting Fang, Na Gao, Cheng Chen, Xiuxia Zhao, Kun Yang, Haiyang Liu

The small-scale yellowfin (Plagiognathops microlepis) is a highly valued species in East Asian aquaculture due to its adaptability and high yield. However, the lack of genomic data has impeded genetic research and breeding efforts. In this study, we utilize PacBio HiFi long-read sequencing and Hi-C technologies to construct a highly detailed genome of P. microlepis at the chromosomal level. The assembly encompasses 976.41 Mb, with 99.84% of it distributed across 24 chromosomes. Notably, the contig N50 was 34.41 Mb and the scaffold N50 was 38.38 Mb. The completeness of the P. microlepis genome assembly is underscored by a BUSCO score of 98.08%. A total of 25,389 protein-coding genes were identified, with a BUSCO score of 96.98%, and 99.85% of these genes were functionally annotated. Synteny relationships at the chromosome level with the Danio rerio and Chanodichthys erythropterus genomes uncover small-scale chromosomal rearrangements. This high-fidelity genome assembly serves as a pivotal resource for future studies of the genome structure, functional elements, comparative genomics, and evolutionary characteristics of P. microlepis and its related species.

Scientific Data 11(1), 1234 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568295/pdf/
Citations: 0
Publisher Correction: Chromosome-level genome assembly of predatory Arma chinensis.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-15, DOI: 10.1038/s41597-024-04069-3
Luyao Fu, Changjin Lin, Wenyan Xu, Hongmei Cheng, Dianyu Liu, Le Ma, Zhihan Su, Xiaoyu Yan, Xiaolin Dong, Chenxi Liu
Scientific Data 11(1), 1235 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568330/pdf/
Citations: 0
MIMIC-BP: A curated dataset for blood pressure estimation.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-15, DOI: 10.1038/s41597-024-04041-1
Ivandro Sanches, Victor V Gomes, Carlos Caetano, Lizeth S B Cabrera, Vinicius H Cene, Thomas Beltrame, Wonkyu Lee, Sanghyun Baek, Otávio A B Penatti

Blood pressure (BP) is one of the most prominent indicators of potential cardiovascular disorders. Traditionally, BP measurement relies on inflatable cuffs, which are inconvenient and limit the acquisition of such important health-related information in the general population. Based on large amounts of well-collected and annotated data, deep-learning approaches offer a generalization potential that has emerged as an alternative for enabling more pervasive approaches. However, most existing work in this area currently uses datasets with limitations, such as a lack of subject identification and severe data imbalance, which can result in data leakage and algorithm bias. Thus, to offer a more properly curated source of information, we propose a derivative dataset composed of 380 hours of the most common biomedical signals, including arterial blood pressure, photoplethysmography, and electrocardiogram, for 1,524 anonymized subjects, each having 30 segments of 30 seconds of those signals. We also validated the proposed dataset through experiments using state-of-the-art deep-learning methods, and we highlight the importance of standardized benchmarks for calibration-free blood pressure estimation scenarios.
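
The dataset size stated above can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the abstract (1,524 subjects, 30 segments each, 30 seconds per segment); the function name `total_hours` is an illustrative assumption and not part of any MIMIC-BP tooling.

```python
def total_hours(subjects, segments_per_subject, seconds_per_segment):
    """Total recording time in hours for equally sized signal segments."""
    total_seconds = subjects * segments_per_subject * seconds_per_segment
    return total_seconds / 3600


# Figures as quoted in the abstract.
print(total_hours(1524, 30, 30))  # 381.0
```

The result, 381 hours, is consistent with the rounded figure of 380 hours given in the abstract.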

Scientific Data 11(1), 1233 (2024). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568151/pdf/
Citations: 0
A Biomechanical Dataset of 1,798 Healthy and Injured Subjects During Treadmill Walking and Running.
IF 5.8, Zone 2 (Comprehensive Journals), Q1 MULTIDISCIPLINARY SCIENCES, Pub Date: 2024-11-14, DOI: 10.1038/s41597-024-04011-7
Reed Ferber, Allan Brett, Reginaldo K Fukuchi, Blayne Hettinga, Sean T Osis

Quantitative biomechanical gait analysis is an important clinical and research tool for injury and disease diagnosis and treatment. However, one major criticism is that gait analysis laboratories largely operate in isolation and there is a lack of benchmark datasets, which can be used to advance research and statistical methodologies. To address this, we present an open biomechanics dataset of n = 1798 healthy and injured, young and older adults during treadmill walking and/or running at a range of gait speeds. The full dataset is available on Figshare+ and data files are contained within a series of zipped folders with folder names representing the subject ID. Each subject ID folder contains walking and/or running data containing raw marker trajectory data along with metadata for each participant. Five tutorials are also provided, demonstrating aspects such as loading data files, sample analyses of discrete variables, and calculating joint angles from code along with covering more complex topics such as principal component analysis for dimensionality reduction, statistical parametric mapping, and conducting unsupervised clustering.

Scientific Data 11(1):1232, published 2024-11-14. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564798/pdf/
Citations: 0
Full-coverage estimation of CO2 concentrations in China via multisource satellite data and Deep Forest model.
IF 5.8 Zone 2 Multidisciplinary Journals Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2024-11-14 DOI: 10.1038/s41597-024-04063-9
Kun Cai, Liuyin Guan, Shenshen Li, Shuo Zhang, Yang Liu, Yang Liu

Monitoring China's carbon dioxide (CO2) concentration is essential for formulating effective carbon cycle policies to achieve carbon peaking and neutrality. Despite insufficient satellite observation coverage, this study uses high-resolution spatiotemporal data from the Orbiting Carbon Observatory 2 (OCO-2), supplemented with various auxiliary datasets, to estimate full-coverage, monthly, column-averaged carbon dioxide (XCO2) values across China from 2015 to 2022 at a spatial resolution of 0.05° via the deep forest model. The 10-fold cross-validation results indicate a correlation coefficient (R) of 0.95 and a coefficient of determination (R²) of 0.90. Validation against ground-based station data yielded an R of 0.93 and an R² of 0.81. Further validation against the Greenhouse Gases Observing Satellite (GOSAT) and the Copernicus Atmosphere Monitoring Service Reanalysis dataset (CAMS) produced R² values of 0.87 and 0.80, respectively. During the study period, CO2 concentrations in China were higher in spring and winter than in summer and autumn and showed a clear annual increase. The estimates generated by this study could potentially support CO2 monitoring in China.

Scientific Data 11(1):1231, published 2024-11-14. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564725/pdf/
Citations: 0