Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03876-y
Azim Ahmadzadeh, Rohan Adhyapak, Kartik Chaurasiya, Laxmi Alekhya Nagubandi, V Aparna, Petrus C Martens, Alexei Pevtsov, Luca Bertello, Alexander Pevtsov, Naomi Douglas, Samuel McDonald, Apaar Bawa, Eugene Kang, Riley Wu, Dustin J Kempton, Aya Abdelkarem, Patrick M Copeland, Sri Harsha Seelamneni
We present the Manually Annotated GONG Filaments in H-alpha Observations (MAGFiLO v1.0) dataset. This dataset contains 10,244 annotated filaments from 1,593 observations captured by the Global Oscillation Network Group (GONG), spanning the years 2011 through 2022. Each annotation details one filament's segmentation, minimum bounding box, spine, and magnetic field chirality. With a total of over one thousand person-hours of annotation, and a double-blind review process, we ensured high-quality ground-truth data. Our inter-annotator agreement reaches a Kappa score of 0.66. We also verified that the hemispheric preference of filaments as annotated in MAGFiLO aligns with the findings from similar datasets of much smaller sample sizes. MAGFiLO is the first dataset of its size, enabling advanced deep learning models to identify filaments and their features with unprecedented precision. It also provides a testbed for solar physicists interested in large-scale analysis of filaments. In this report, we document the details of the annotation and the post-processing phases that were applied.
{"title":"A dataset of manually annotated filaments from H-alpha observations.","authors":"Azim Ahmadzadeh, Rohan Adhyapak, Kartik Chaurasiya, Laxmi Alekhya Nagubandi, V Aparna, Petrus C Martens, Alexei Pevtsov, Luca Bertello, Alexander Pevtsov, Naomi Douglas, Samuel McDonald, Apaar Bawa, Eugene Kang, Riley Wu, Dustin J Kempton, Aya Abdelkarem, Patrick M Copeland, Sri Harsha Seelamneni","doi":"10.1038/s41597-024-03876-y","DOIUrl":"https://doi.org/10.1038/s41597-024-03876-y","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437148/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142353133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
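The MAGFiLO abstract reports an inter-annotator Kappa score of 0.66 without stating which variant was used. As a minimal sketch, assuming a Cohen-style kappa between two annotators, the statistic compares observed agreement with the agreement expected by chance. The label values and the two annotator lists below are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical chirality labels (L = left, R = right, U = unidentifiable).
annotator_1 = ["L", "L", "R", "R", "U", "L", "R", "U", "L", "R"]
annotator_2 = ["L", "R", "R", "R", "U", "L", "L", "U", "L", "R"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on 8 of 10 items while chance alone would give 0.36, so kappa is well below the raw 80% agreement.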
Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03868-y
Marielle Vigouroux, Petr Novák, Ludmila Cristina Oliveira, Carmen Santos, Jitender Cheema, Roland H M Wouters, Pirita Paajanen, Martin Vickers, Andrea Koblížková, Maria Carlota Vaz Patto, Jiří Macas, Burkhard Steuernagel, Cathie Martin, Peter M F Emmrich
Grasspea (Lathyrus sativus L.) is an underutilised but promising legume crop with tolerance to a wide range of abiotic and biotic stress factors, and potential for climate-resilient agriculture. Despite a long history and wide geographical distribution of cultivation, only limited breeding resources are available. This paper reports a 5.96 Gbp genome assembly of grasspea genotype LS007, of which 5.03 Gbp is scaffolded into 7 pseudo-chromosomes. The assembly has a BUSCO completeness score of 99.1% and is annotated with 31719 gene models and repeat elements. This represents the most contiguous and accurate assembly of the grasspea genome to date.
{"title":"A chromosome-scale reference genome of grasspea (Lathyrus sativus).","authors":"Marielle Vigouroux, Petr Novák, Ludmila Cristina Oliveira, Carmen Santos, Jitender Cheema, Roland H M Wouters, Pirita Paajanen, Martin Vickers, Andrea Koblížková, Maria Carlota Vaz Patto, Jiří Macas, Burkhard Steuernagel, Cathie Martin, Peter M F Emmrich","doi":"10.1038/s41597-024-03868-y","DOIUrl":"https://doi.org/10.1038/s41597-024-03868-y","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437036/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142353132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-27 DOI: 10.1038/s41597-024-03889-7
Binghe Zhao, Mucong Zi, Xiaoyu Zhang, Yong Wang
Coastal sediments are rich in embedded recalcitrant organic carbons that are biotransformed into methane. In this study, gas composition (carbon dioxide, methane and nitrogen) and chemical indicators (total nitrogen, total carbon, and total sulfate) were examined in five deep sediment cores (up to 130 m in length) obtained from Hangzhou Bay. The V3-V4 region of the 16S rRNA gene was amplified and sequenced for prokaryotic community analysis. The species composition, along with the physicochemical factors of the sediments, revealed a strong correlation with methane content in one of the sediment cores. We then obtained metagenomes of two sediment samples selected for their high methane content and enrichment of methanogenic Bathyarchaeota, as supported by phylogenetic evidence. A total of 27 draft genomes were retrieved through metagenomic binning and were classified into Bathyarchaeota, Asgard archaea, Planctomycetes, and other microbial groups. The data provided are valuable for understanding the relationship between methane generation and microbial community composition in deep sediment core samples from coastal to marine environments.
{"title":"Microbial communities and metagenomes in methane-rich deep coastal sediments.","authors":"Binghe Zhao, Mucong Zi, Xiaoyu Zhang, Yong Wang","doi":"10.1038/s41597-024-03889-7","DOIUrl":"https://doi.org/10.1038/s41597-024-03889-7","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437075/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142353155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-20 DOI: 10.1038/s41597-024-03818-8
Mostafa Y Abdel-Glil, Johannes Solle, Daniel Wibberg, Heinrich Neubauer, Lisa D Sprague
Tritrichomonas foetus is a parasitic protist responsible for bovine trichomonosis, a reproductive disease associated with a significant economic burden on the livestock industry throughout the world. Here, we present a chromosome-level reference genome of T. foetus KV-1 (ATCC 30924) using short-read (Illumina MiSeq), long-read (Oxford Nanopore) and chromatin-linked (Hi-C) sequencing. This is the first chromosome-level genome of a parasitic protist of the order Tritrichomonadida and the second within the Parabasalia lineage, after Trichomonas vaginalis, the causative agent of a sexually transmitted infection in humans. Our constructed genome is 148 Mb in size, with a scaffold N50 of 22.9 Mb. The contigs are anchored in five super-scaffolds, corresponding to the expected five chromosomes of the species and covering 78% of the genome assembly. We predict 41,341 protein-coding genes, of which 95.10% have been functionally annotated. This high-quality genome assembly serves as a valuable reference genome for T. foetus to support future studies in functional genomics, genetic conservation and taxonomy.
{"title":"Chromosome-level genome assembly of Tritrichomonas foetus, the causative agent of Bovine Trichomonosis.","authors":"Mostafa Y Abdel-Glil, Johannes Solle, Daniel Wibberg, Heinrich Neubauer, Lisa D Sprague","doi":"10.1038/s41597-024-03818-8","DOIUrl":"10.1038/s41597-024-03818-8","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11415386/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
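The T. foetus abstract summarizes contiguity with a scaffold N50 of 22.9 Mb. As a rough illustration of what that statistic means, the sketch below computes N50 under its usual definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The scaffold lengths are hypothetical and do not come from the assembly.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in Mb (not the T. foetus assembly).
scaffold_lengths_mb = [30, 25, 20, 15, 10, 5, 3, 2]
assembly_n50 = n50(scaffold_lengths_mb)
print(assembly_n50)
```

With these values the assembly totals 110 Mb, and the two longest scaffolds already cover 55 Mb, so the N50 is the length of the second scaffold.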
Pub Date: 2024-09-19 DOI: 10.1038/s41597-024-03874-0
Jiahui Su, Yuri A Mazei, Andrey N Tsyganov, Viktor A Chernyshov, Alexander A Komarov, Elena A Malysheva, Kirill V Babeshko, Natalia G Mazei, Damir A Saldaev, Boris Levin, Basil N Yakimov
The functional traits of soil protists have been employed in ecological research to enhance comprehension of the underlying mechanisms of ecological processes. Among the numerous soil protists, testate amoebae emerge as a prominent and abundant group, playing a pivotal role in soil micro-food webs. Furthermore, they are regarded as valuable bioindicators for environmental monitoring and palaeoecological studies due to their sensitivity to environmental changes. We screened 372 testate amoebae species widely distributed across the Northern Holarctic realm and collected trait data representing their morphological and feeding characteristics. The dataset provides a valuable basis for investigating the functional diversity and ecological roles of testate amoebae, thus facilitating further research on soil protist communities and ecosystem dynamics.
{"title":"Functional traits data for testate amoebae of Northern Holarctic realm.","authors":"Jiahui Su, Yuri A Mazei, Andrey N Tsyganov, Viktor A Chernyshov, Alexander A Komarov, Elena A Malysheva, Kirill V Babeshko, Natalia G Mazei, Damir A Saldaev, Boris Levin, Basil N Yakimov","doi":"10.1038/s41597-024-03874-0","DOIUrl":"10.1038/s41597-024-03874-0","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413188/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-19 DOI: 10.1038/s41597-024-03866-0
Xiaobin Xu, Lili Zhou, James Taylor, Raffaele Casa, Chengzhi Fan, Xiaoyu Song, Guijun Yang, Wenjiang Huang, Zhenhai Li
In China, the need for precise wheat grain protein content (GPC) data is rising with growing food consumption demands and global market competition. However, due to the lack of extensive, long-term, high-resolution benchmark data, previous GPC studies have primarily focused on experimental fields, small geographic units, and limited temporal scopes. Additionally, the diverse geographical terrain in China exacerbates the challenges of large-scale GPC estimation. To address this challenge and the data gap, the first 500-meter spatial resolution, long-term winter wheat dataset covering major planting regions in China (CNWheatGPC-500) was created by integrating multi-source data from ERA5 and MODIS. The results demonstrate that the GPC estimation model based on a hierarchical linear model significantly outperformed other conventional models. The validation dataset exhibited an R² of 0.45 and an RMSE of 0.96%. In cross-validation, the RMSE values ranged from 0.90% in Gansu to 1.32% in Anhui. For leave-one-year-out cross-validation, the RMSE values ranged from 0.77% to 1.11%. CNWheatGPC-500 offers valuable insights for enhancing wheat production, quality control, and agricultural decision-making.
{"title":"The 500-meter long-term winter wheat grain protein content dataset for China from multi-source data.","authors":"Xiaobin Xu, Lili Zhou, James Taylor, Raffaele Casa, Chengzhi Fan, Xiaoyu Song, Guijun Yang, Wenjiang Huang, Zhenhai Li","doi":"10.1038/s41597-024-03866-0","DOIUrl":"10.1038/s41597-024-03866-0","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413012/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
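The CNWheatGPC-500 abstract reports validation quality as an R² and an RMSE expressed in percentage points of grain protein content. The sketch below shows how the two metrics are commonly computed, assuming R² means the coefficient of determination (the abstract does not say whether it is that or a squared correlation). The observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical GPC values in percent (not taken from the dataset).
observed_gpc = [12.0, 13.0, 14.0, 15.0]
predicted_gpc = [12.5, 12.5, 14.5, 14.5]
gpc_rmse = rmse(observed_gpc, predicted_gpc)
gpc_r2 = r_squared(observed_gpc, predicted_gpc)
print(gpc_rmse, gpc_r2)
```

Because RMSE keeps the units of the target, an RMSE of 0.96% in the abstract reads as a typical error of about one percentage point of protein content.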
Pub Date: 2024-09-19 DOI: 10.1038/s41597-024-03869-x
Kathleen I Pishas, Karla J Cowley, Marta Llaurado-Fernandez, Hannah Kim, Jennii Luu, Robert Vary, Nikola A Bowden, Ian G Campbell, Mark S Carey, Kaylene J Simpson, Dane Cheasley
Low-grade serous ovarian carcinoma (LGSOC) is a rare epithelial ovarian cancer with unique molecular characteristics compared to the more common tubo-ovarian high-grade serous ovarian carcinoma. Pivotal clinical trials guiding the management of epithelial ovarian cancer lack sufficient cases of LGSOC for meaningful subgroup analysis, hence overall findings cannot be extrapolated to rarer chemo-resistant subtypes such as LGSOC. Furthermore, there is a need for more effective therapies for the treatment of relapsed disease, as treatment options are limited. To address this, we conducted the largest quantitative high-throughput drug screening effort (n = 3436 compounds) in 12 patient-derived LGSOC cell lines and one normal ovary cell line to identify unexplored therapeutic avenues. Using a combination of high-throughput robotics, high-content imaging and novel data analysis pipelines, our data set identified 60 high and 19 moderate confidence hits which induced cancer cell specific cytotoxicity at the lowest compound dose assessed (0.1 µM). We also revealed a series of known (mTOR/PI3K/AKT) and novel (EGFR and MDM2-p53) drug classes to which LGSOC cell lines showed demonstrable susceptibility.
{"title":"High-throughput drug screening identifies novel therapeutics for Low Grade Serous Ovarian Carcinoma.","authors":"Kathleen I Pishas, Karla J Cowley, Marta Llaurado-Fernandez, Hannah Kim, Jennii Luu, Robert Vary, Nikola A Bowden, Ian G Campbell, Mark S Carey, Kaylene J Simpson, Dane Cheasley","doi":"10.1038/s41597-024-03869-x","DOIUrl":"https://doi.org/10.1038/s41597-024-03869-x","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413243/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-19 DOI: 10.1038/s41597-024-03814-y
Malte Jensen, Andreas Clemmensen, Jacob Gorm Hansen, Julie van Krimpen Mortensen, Emil N Christensen, Andreas Kjaer, Rasmus Sejersten Ripa
A pivotal animal model for development of anticancer molecules is mice with subcutaneous tumors, grown by injection of xenografted tumor cells, where micro-Computed Tomography (µCT) of the mice is used to analyze the efficacy of the anticancer molecule. Manual delineation of the tumor region is necessary for the analysis, but it is time-consuming and inconsistent, highlighting the need for automatic segmentation (AS) tools. This study introduces a preclinical µCT database, comprising 452 whole-body scans from 223 individual mice with subcutaneous tumors, spanning ten diverse µCT datasets acquired between 2014 and 2020 on a preclinical PET/CT scanner, making it the largest dataset of its kind to date. Each tumor is annotated manually by three expert annotators, allowing for robust model development. Inter-annotator agreement was analyzed, and we report an overall annotation agreement of 0.903 ± 0.046 (mean ± std) Fleiss' Kappa and a mean deviation in volume estimation of 0.015 ± 0.010 cm³ (6.9% ± 4.7%), which establishes a human baseline accuracy for delineation of subcutaneous tumors, while showing good inter-annotator agreement.
{"title":"3D whole body preclinical micro-CT database of subcutaneous tumors in mice with annotations from 3 annotators.","authors":"Malte Jensen, Andreas Clemmensen, Jacob Gorm Hansen, Julie van Krimpen Mortensen, Emil N Christensen, Andreas Kjaer, Rasmus Sejersten Ripa","doi":"10.1038/s41597-024-03814-y","DOIUrl":"10.1038/s41597-024-03814-y","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11412993/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
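The micro-CT abstract reports agreement among three annotators as Fleiss' Kappa. As a minimal sketch of that statistic, the function below takes a table in which each row is one rated item and each column counts how many raters chose that category. Treating individual voxels as the rated items, and the two-category (background, tumor) layout, are assumptions made for illustration; the counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    # Per-item agreement: share of rater pairs that chose the same category.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(s * s for s in shares)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings from 3 annotators; columns are (background, tumor).
voxel_counts = [[3, 0], [3, 0], [3, 0], [0, 3], [0, 3], [1, 2]]
agreement = fleiss_kappa(voxel_counts)
print(round(agreement, 3))
```

Only the last item is disputed in this toy table, yet the statistic drops noticeably below 1 because chance agreement is high when there are just two categories.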
Pub Date: 2024-09-19 DOI: 10.1038/s41597-024-03801-3
Irvylle Cavalcante, Alberto Rodrigues da Silva, Matej Zajc, Igor Mendek, Lisa Calearo, Anna Malkova, Charalampos Ziras, Panagiotis Pediaditis, Konstantinos Michos, João Mateus, Samuel Matias, Miguel Brito, Alexis Lekidis, Cindy P Guzman, Ana Rita Nunes, Hugo Morais
An increasing adoption of electric vehicles (EVs) is expected in the coming decades, mainly due to the need to achieve carbon neutrality by 2050. However, predicting the future of electric mobility is challenging due to three main factors: technological advancements, regulatory policies, and consumer behaviour. The projections presented in this study are based on several scenarios derived mainly from reports published by public entities and consultants. The study considers the evolution of electric road mobility according to defined targets for the electrification of the transport sector. The gathered data therefore address different time horizons for EV penetration in the World, Europe, Portugal, Denmark, Greece, and Slovenia. An extensive literature review and an estimation approach for EV forecasting were carried out, covering EV markets, charging infrastructure, and electricity demand. The dataset also aims to provide a demand projection to 2050 and to serve as a critical input to further work on EV mass deployment in the context of the project Electric Vehicles Management for carbon neutrality in Europe (EV4EU) and other works related to this field.
{"title":"Dataset on Electric Road Mobility: Historical and Evolution Scenarios until 2050.","authors":"Irvylle Cavalcante, Alberto Rodrigues da Silva, Matej Zajc, Igor Mendek, Lisa Calearo, Anna Malkova, Charalampos Ziras, Panagiotis Pediaditis, Konstantinos Michos, João Mateus, Samuel Matias, Miguel Brito, Alexis Lekidis, Cindy P Guzman, Ana Rita Nunes, Hugo Morais","doi":"10.1038/s41597-024-03801-3","DOIUrl":"10.1038/s41597-024-03801-3","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413019/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding emotional states is pivotal for the development of next-generation human-machine interfaces. Human behaviors in social interactions have resulted in psycho-physiological processes influenced by perceptual inputs. Therefore, efforts to comprehend brain functions and human behavior could potentially catalyze the development of AI models with human-like attributes. In this study, we introduce a multimodal emotion dataset comprising data from 30-channel electroencephalography (EEG), audio, and video recordings from 42 participants. Each participant engaged in a cue-based conversation scenario, eliciting five distinct emotions: neutral, anger, happiness, sadness, and calmness. Throughout the experiment, each participant contributed 200 interactions, which encompassed both listening and speaking. This resulted in a cumulative total of 8,400 interactions across all participants. We evaluated the baseline performance of emotion recognition for each modality using established deep neural network (DNN) methods. The Emotion in EEG-Audio-Visual (EAV) dataset represents the first public dataset to incorporate three primary modalities for emotion recognition within a conversational context. We anticipate that this dataset will make significant contributions to the modeling of the human emotional process, encompassing both fundamental neuroscience and machine learning viewpoints.
{"title":"EAV: EEG-Audio-Video Dataset for Emotion Recognition in Conversational Contexts.","authors":"Min-Ho Lee, Adai Shomanov, Balgyn Begim, Zhuldyz Kabidenova, Aruna Nyssanbay, Adnan Yazici, Seong-Whan Lee","doi":"10.1038/s41597-024-03838-4","DOIUrl":"10.1038/s41597-024-03838-4","url":null,"PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413008/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142294479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journal","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}