Scientific Data: Latest Articles

Chromosome-level genome assembly of the intertidal lucinid clam Indoaustriella scarlatoi.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-15 DOI: 10.1038/s41597-025-04606-8
Yang Guo, Zhaoshan Zhong, Nannan Zhang, Minxiao Wang, Chaolun Li

Lucinidae, renowned as the most diverse chemosymbiotic invertebrate group, functions as a sulfide cleaner in coastal ecosystems and is thus ecologically important. Despite their significance, genomic studies on these organisms have been limited. Here, we present the chromosome-level genome assembly of Indoaustriella scarlatoi, an intertidal lucinid clam. Employing both short and long reads, and Hi-C sequencing, we assembled a 1.58 Gb genome comprising 690 contigs with a contig N50 length of 9.00 Mb, which were anchored to 17 chromosomes. The genome exhibits a high completeness of 95.4%, as assessed by the BUSCO analysis. Transposable elements account for 56.02% of the genome, with long terminal repeat retrotransposons (LTR, 42.66%) being the most abundant. We identified 34,469 protein-coding genes, 74.43% of which were functionally annotated. This high-quality genome assembly serves as a valuable resource for further studies on the evolutionary and ecological aspects of chemosymbiotic bivalves.
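The contig N50 quoted in the abstract above is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions and are not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, arbitrary units): total is 28, half is 14.
# The two longest contigs (9 + 8 = 17) already cover half, so N50 is 8.
toy_contigs = [9, 8, 5, 3, 2, 1]
print(n50(toy_contigs))
```

A larger N50 for the same total size means the assembly is split into fewer, longer pieces, which is why it is reported alongside the contig count.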

Scientific Data 12(1), 275.
Citations: 0
Transcriptomics and epigenomics datasets of primary brain cancers in formalin-fixed paraffin embedded format.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-15 DOI: 10.1038/s41597-025-04597-6
Anabel García-Heredia, Luna Guerra-Núñez, Paula Martín-Climent, Estefanía Rojas, Raúl López-Domínguez, Clara Alcántara-Domínguez, Cristina Alenda, Luis M Valor

Access to public omics-based datasets is of paramount importance in brain cancer research, as it allows the proposal and validation of both biomarkers and therapeutic targets in gliomas, especially in the most prevalent and aggressive glioblastomas. Taking advantage of current advances in next-generation sequencing and DNA methylation profiling, we have created datasets from approximately 150 formalin-fixed paraffin-embedded (FFPE) tumours. These datasets enable, for the first time, integrative transcriptional and epigenetic studies in a context that considers the degradation and fixation-derived chemical alterations of the most widespread archiving format in hospitals, and they provide a cohort independent of current public databases for further validation of putative novel biomarkers. Alongside the best-known glioblastomas, astrocytomas and oligodendrogliomas, we have also included, for comparison purposes, a few examples of rare tumours that are often neglected in brain cancer research. Taken together, we provide a valuable tool to explore combined gene expression and DNA methylation patterns in the study of gliomas and glioneuronal tumours.

Scientific Data 12(1), 273.
Citations: 0
DeepFlood for Inundated Vegetation High-Resolution Dataset for Accurate Flood Mapping and Segmentation.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-15 DOI: 10.1038/s41597-025-04554-3
Mulham Fawakherji, Jeffrey Blay, Matilda Anokye, Leila Hashemi-Beni, Jennifer Dorton

Rapid and accurate assessment of flood extent is important for effective disaster response, mitigation planning, and resource allocation. Traditional flood mapping methods encounter challenges in scalability and transferability. However, the emergence of deep learning, particularly convolutional neural networks (CNNs), revolutionizes flood mapping by autonomously learning intricate spatial patterns and semantic features directly from raw data. DeepFlood is introduced to address the essential requirement for high-quality training datasets. This is a novel dataset comprising high-resolution manned and unmanned aerial imagery and Synthetic Aperture Radar (SAR) imagery, enriched with detailed labels including inundated vegetation, one of the most challenging areas for flood mapping. DeepFlood enables multi-modal flood mapping approaches and mitigates limitations in existing datasets by providing comprehensive annotations and diverse landscape coverage. We evaluate several semantic segmentation architectures on DeepFlood, demonstrating its usability and efficacy in post-disaster flood mapping scenarios.

Scientific Data 12(1), 271.
Citations: 0
A chromosomal-scale reference genome for Rosa hugonis.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-15 DOI: 10.1038/s41597-025-04526-7
Zhenlong Liang, Jia Miao, Hengning Deng, Ruifang Jiao, Liangying Li, Shiqi Li, Zhongyu Tang, Jian Ru, Xinfen Gao

Rosa hugonis is widely distributed in the Hengduan Mountains, Qinling Mountains, and northern China. It is an important candidate species for ecological restoration, given its good adaptability. Here, we present the first high-quality chromosome-level assembly of R. hugonis based on HiFi reads and Hi-C data. The sequencing data were assembled onto seven pseudochromosomes. The genome size of R. hugonis is 337.92 Mb, with a contig N50 length of 26.84 Mb. We annotated 36,218 protein-coding genes. In summary, the high-quality genome sequence of R. hugonis provides a genetic roadmap for the study of its genetics and species relationships. This will facilitate future comparative genomic studies across more species within Rosa.

Scientific Data 12(1), 272.
Citations: 0
The Water Health Open Knowledge Graph.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-15 DOI: 10.1038/s41597-025-04537-4
Anna Sofia Lippolis, Giorgia Lodi, Andrea Giovanni Nuzzolese

Global sustainability challenges have recently led to an increasing interest in the management of water and health resources. The availability of effective, meaningful and open data is therefore crucial to address those issues in the broader context of the Sustainable Development Goals of clean water and sanitation, as targeted by the United Nations. In this paper, we present the Water Health Open Knowledge Graph (WHOW-KG) along with its design methodology and an analysis of its impact. Developed in the context of the EU-funded WHOW (Water Health Open Knowledge) project, the WHOW-KG is a semantic knowledge graph that models data on water consumption, pollution, extreme weather events, infectious disease rates and drug distribution. It aims to support a wide range of applications, from knowledge discovery to decision-making, making it a valuable resource for researchers, policymakers, and practitioners in the water and health domains. The WHOW-KG consists of a network of five ontologies and related linked open data, modelled according to those ontologies. As a fully distributed system, it is sustainable over time, can handle large datasets, and allows data providers full control, establishing it as a vital European asset in the fields of water consumption and pollution.

Scientific Data 12(1), 274.
Citations: 0
Auto-generating a database on the fabrication details of perovskite solar devices.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-14 DOI: 10.1038/s41597-025-04566-z
Agnes Valencia, Fei Liu, Xiangyang Zhang, Xiangkun Bo, Weilu Li, Walid A Daoud

The rapid development of perovskite solar devices has led to a rising number of publications over the past decade. As a result, a project aiming to compile all published device data was initiated in 2022. However, with its method of manual data collection, one of the project's hurdles is encouraging the participation of the perovskite community to spend time and effort in inputting new device data. To ensure the project's sustainability, adequate participation is necessary but is challenging to achieve. In response to this, we propose the utilization of natural language processing algorithms to extract various attributes of perovskite solar devices from journal articles. When data collection is performed by programs instead of humans, the lack of community participation can be overcome. For each device, the identifying device information, intrinsic device data, extrinsic cell definition, and the details of the fabrication procedure were extracted. A total of 30 attributes from 3164 journal articles were compiled, with an average accuracy of 0.899. The dataset and source code are made publicly available.

Scientific Data 12(1), 270.
Citations: 0
A chromosome-level genome assembly of Mylabris sibirica Fischer von Waldheim, 1823 (Coleoptera, Meloidae).
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-14 DOI: 10.1038/s41597-025-04532-9
Chenhui Shen, Guofeng Yang, Min Tang, Xiaofei Li, Li Zhu, Wei Li, Lin Jin, Pan Deng, Huanhuan Zhang, Qing Zhai, Gang Wu, Xiaohong Yan

Mylabris sibirica is a hypermetamorphic insect that primarily feeds on oilseed rape during the adult stage. However, the limited availability of genomic resources hinders our understanding of gene function, medical use, and ecological adaptation in M. sibirica. Here, a high-quality chromosome-level genome of M. sibirica was generated using PacBio, Illumina, and Hi-C technologies. Its genome size was 138.45 Mb, with a scaffold N50 of 13.84 Mb, and 99.85% (138.25 Mb) of the assembly was anchored onto 10 pseudo-chromosomes. BUSCO analysis showed this genome assembly had a high-level completeness of 100% (n = 1,367), containing 1,358 (99.4%) single-copy BUSCOs and 8 (0.6%) duplicated BUSCOs. In addition, a total of 11,687 protein-coding genes and 35.46% (49.10 Mb) repetitive elements were identified. The high-quality genome assembly offers valuable genomic resources for exploring gene function, medical use, and ecology.

Scientific Data 12(1), 269.
Citations: 0
Proteomics profiling of research models for studying pancreatic ductal adenocarcinoma.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-14 DOI: 10.1038/s41597-025-04522-x
Mathilde Resell, Hanne-Line Rabben, Animesh Sharma, Lars Hagen, Linh Hoang, Nan T Skogaker, Anne Aarvik, Eirik Knudsen Bjåstad, Magnus K Svensson, Manoj Amrutkar, Caroline S Verbeke, Surinder K Batra, Gunnar Qvigstad, Timothy C Wang, Anil Rustgi, Duan Chen, Chun-Mei Zhao

Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies, with a five-year survival rate of 10-15% due to late-stage diagnosis and limited efficacy of existing treatments. This study utilized proteomics-based systems modelling to generate multimodal datasets from various research models, including PDAC cells, spheroids, organoids, and tissues derived from murine and human samples. Identical mass spectrometry-based proteomics was applied across the different models. The preparation and validation of the research models and the proteomics were described in detail. The assembly datasets we present here contribute to the data collection on PDAC, which will be useful for systems modelling, data mining, knowledge discovery in databases, and bioinformatics of individual models. Further data analysis may lead to the generation of research hypotheses, predictions of targets for diagnosis and treatment, and relationships between data variables.

Scientific Data 12(1), 266.
Citations: 0
Divergence in cellular markers observed in single-cell transcriptomics datasets between cultured primary trabecular meshwork cells and tissues.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-14 DOI: 10.1038/s41597-025-04528-5
Alice Tian, Sangbae Kim, Hasna Baidouri, Jin Li, Xuesen Cheng, Janice Vranka, Yumei Li, Rui Chen, VijayKrishna Raghunathan

The trabecular meshwork within the outflow apparatus is critical in maintaining intraocular pressure homeostasis. In vitro studies employing primary cell cultures of the human trabecular meshwork (hTM) have conventionally served as surrogates for investigating the pathobiology of TM dysfunction. Despite their widespread use, translating outcomes from in vitro studies to ex vivo and/or in vivo studies remains a challenge. Given the cell heterogeneity, single-cell RNA sequencing comparing primary hTM cell cultures with hTM tissue may provide important insights into cellular identity and translatability; such an approach has not been reported before. In this study, we assembled a total of 14 primary hTM in vitro samples across passages 1-4, including 4 samples from individuals diagnosed with glaucoma. This dataset offers a comprehensive transcriptomic resource of primary hTM in vitro scRNA-seq data to study global changes in gene expression in comparison to cells in tissue in situ. We have performed extensive preprocessing and quality control, allowing the research community to access and utilize this public resource.

Scientific Data 12(1), 264.
Citations: 0
Global Daily Column Average CO2 at 0.1° × 0.1° Spatial Resolution Integrating OCO-3, GOSAT, CAMS with EOF and Deep Learning.
IF 5.8 Zone 2 (Multidisciplinary Journals) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2025-02-14 DOI: 10.1038/s41597-024-04135-w
Franz Pablo Antezana Lopez, Guanhua Zhou, Guifei Jing, Kai Zhang, Liangfu Chen, Lin Chen, Yumin Tan

Accurate global carbon dioxide (CO2) distribution with high spatial and temporal resolution is essential for understanding its dynamics and impacts on climate change. This study tackles the challenge of data gaps in satellite observations of greenhouse gases, caused by orbital and observational limitations. We reconstructed a comprehensive dataset of column-averaged CO2 (XCO2) concentrations by integrating re-analyzed data from the Copernicus Atmosphere Monitoring Service (CAMS) with observations from the GOSAT and OCO-3 satellites. Using two advanced data reconstruction methods, Data Interpolating Empirical Orthogonal Functions (DINEOF) and Convolutional Auto-Encoder (DINCAE), we imputed missing data while preserving spatial and temporal consistency. The combined approach achieved high accuracy, with Pearson correlation values between 0.94 and 0.95 against TCCON measurements, and we also reported root mean square error (RMSE) to further assess model performance. Our results indicate that these techniques generate a daily, high-resolution, gap-free XCO2 dataset, enabling improved CO2 monitoring, climate modeling, and policy development.
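The two validation metrics named in the abstract above, Pearson correlation and RMSE, can be computed as in the following minimal sketch. The function names and the sample XCO2 values (in ppm) are illustrative assumptions only; they are not taken from the paper or from TCCON.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, divided by the product of the
    # square roots of the summed squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values: reconstructed XCO2 vs. ground-based reference (ppm).
reconstructed = [415.2, 416.0, 417.1, 418.3, 419.0]
reference = [415.0, 416.4, 417.0, 418.0, 419.5]

print(f"Pearson r = {pearson_r(reconstructed, reference):.3f}")
print(f"RMSE = {rmse(reconstructed, reference):.3f} ppm")
```

The two metrics are complementary: correlation measures how well the reconstruction tracks the variation in the reference, while RMSE captures the typical size of the error, including any constant bias that correlation ignores.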

Franz Pablo Antezana Lopez, Guanhua Zhou, Guifei Jing, Kai Zhang, Liangfu Chen, Lin Chen, Yumin Tan. Global Daily Column Average CO2 at 0.1° × 0.1° Spatial Resolution Integrating OCO-3, GOSAT, CAMS with EOF and Deep Learning. Scientific Data, vol. 12, no. 1, 268. Published 2025-02-14. https://doi.org/10.1038/s41597-024-04135-w