Pub Date: 2024-09-12 DOI: 10.1038/s41597-024-03840-w
Roy Barkan, Ira Cooke, Sue-Ann Watson, Sally C Y Lau, Jan M Strugnell
Abalone (family Haliotidae) are an ecologically and economically significant group of marine gastropods found in tropical and temperate waters. To date, only a few Haliotis genomes are available, all belonging to temperate species. Here, we provide the first chromosome-scale abalone genome assembly and the first reference genome of the tropical abalone Haliotis asinina. The combination of PacBio long-read HiFi sequencing and Dovetail's Omni-C sequencing allowed the chromosome-level assembly of this genome, while PacBio Isoform sequencing across five tissue types enabled the construction of high-quality gene models. The assembly comprises 16 pseudo-chromosomes spanning over 1.12 Gb (98.1% of total scaffold length), with an N50 of 67.09 Mb, a longest scaffold of 105.96 Mb, and a BUSCO completeness score of 97.6%. This study identified 25,422 protein-coding genes and 61,149 transcripts. In an era of climate change and ocean warming, this genome of a heat-tolerant species can be used for comparative genomics with a focus on thermal resistance. This high-quality reference genome of H. asinina is a valuable resource for aquaculture, fisheries, and ecological studies.
Title: Chromosome-scale genome assembly of the tropical abalone (Haliotis asinina). Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393055/pdf/
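The N50 quoted in the abalone abstract above is a standard contiguity statistic: the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest; the N50 is the length
    of the scaffold at which the cumulative sum first reaches at least
    half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


# Toy lengths (hypothetical, not from the H. asinina assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))
```

With real data, the lengths would typically be read from the assembly's FASTA index or a scaffold-length table rather than typed in by hand.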
Pub Date: 2024-09-11 DOI: 10.1038/s41597-024-03824-w
Ariel Saffer, Thom Worm, Yu Takeuchi, Ross Meentemeyer
Monitoring and managing the global spread of invasive and alien species requires accurate spatiotemporal records of species presence and information about the biological characteristics of species of interest including life cycle information, biotic and abiotic constraints and pathways of spread. The Global Invasive and Alien Traits And Records (GIATAR) dataset provides consolidated dated records of invasive and alien presence at the country-scale combined with a suite of biological information about pests of interest in a standardized, machine-readable format. We provide dated presence records for 46,666 alien taxa in 249 countries constituting 827,300 country-taxon pairs in locations where the taxon's invasive status is either alien, invasive, or unknown, joined with additional biological information for thousands of taxa. GIATAR is designed to be quickly updateable with future data and easy to integrate into ongoing research on global patterns of alien species movement using scripts provided to query and analyze data. GIATAR provides crucial data needed for researchers and policymakers to compare global invasion trends across a wide range of taxa.
Title: GIATAR: a Spatio-temporal Dataset of Global Invasive and Alien Species and their Traits. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11390876/pdf/
Pub Date: 2024-09-11 DOI: 10.1038/s41597-024-03745-8
Mohammad Momenian, Zhengwu Ma, Shuyi Wu, Chengcheng Wang, Jonathan Brennan, John Hale, Lars Meyer, Jixing Li
Currently, the field of neurobiology of language is based on data from only a few Indo-European languages. The majority of these data come from younger adults, neglecting other age groups. Here we present a multimodal database consisting of task-based and resting-state fMRI, structural MRI, and EEG data recorded while participants over 65 years old listened to sections of the story The Little Prince in Cantonese. We also provide data on participants' language history, lifetime experiences, and linguistic and cognitive skills. Audio and text annotations, including time-aligned speech segmentation and prosodic information, as well as word-by-word predictors such as frequency and part-of-speech tagging derived from natural language processing (NLP) tools, are included in this database. Both MRI and EEG data diagnostics indicated that the data are of good quality. This multimodal database could advance our understanding of the spatiotemporal dynamics of language comprehension in the older population and help us study the effects of healthy aging on the relationship between brain and behaviour.
Title: Le Petit Prince Hong Kong (LPPHK): Naturalistic fMRI and EEG data from older Cantonese speakers. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11390913/pdf/
Pub Date: 2024-09-11 DOI: 10.1038/s41597-024-03736-9
Umair Akram, Jodie Stevenson
This data resource provides evidence concerning the prevalence of perceptual alterations of emotional faces amongst individuals experiencing symptoms of insomnia, anxiety, depression, mania, psychotic experiences, and schizotypal tendencies. More specifically, we explored the categorisation accuracy (whether the displayed emotion was correctly identified), misperception (which emotion an incorrect judgment was perceived to be), intensity (extent of the emotion signal strength) and emotional valence (the extent and direction of perceived affect) of six facial expressions of emotion from the Karolinska Directed Emotional Faces database. Complete data from N = 572 respondents are included. The dataset is available to other researchers on Figshare. Information concerning the data records, usage notes, code availability and technical validation is presented. Finally, we present demographic and correlational data concerning psychiatric symptoms and alterations in the perception of emotional faces.
Title: Altered emotion perception in insomnia, anxiety, depression, mania, psychotic experiences and schizotypal symptoms: a dataset. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11391038/pdf/
Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03787-y
Lucia Campese, Luca Russo, Maria Abagnale, Adriana Alberti, Giancarlo Bachi, Cecilia Balestra, Daniele Bellardini, Angela Buondonno, Ulisse Cardini, Ylenia Carotenuto, Giovanni Checcucci, Maria Luisa Chiusano, Isabella D'Ambra, Giuliana d'Ippolito, Iole Di Capua, Vincenzo Donnarumma, Angelo Fontana, Marta Furia, Denisse Galarza-Verkovitch, Roberto Gallia, Karine Labadie, Serena Leone, Priscilla Licandro, Antonio Longo, Maira Maselli, Louise Merquiol, Carola Murano, Pedro H Oliveira, Augusto Passarelli, Isabella Percopo, Aude Perdereau, Roberta Piredda, Francesca Raffini, Vittoria Roncalli, Hans-Joachim Ruscheweyh, Ennio Russo, Maria Saggiomo, Chiara Santinelli, Diana Sarno, Shinichi Sunagawa, Ferdinando Tramontano, Anna Chiara Trano, Marco Uttieri, Patrick Wincker, Gianpaolo Zampicinini, Raffaella Casotti, Fabio Conversano, Domenico D'Alelio, Daniele Iudicone, Francesca Margiotta, Marina Montresor
The NEREA (Naples Ecological REsearch for Augmented observatories) initiative aims to establish an augmented observatory in the Gulf of Naples (GoN), designed to advance the understanding of marine ecosystems through a holistic approach. Inspired by the Tara Oceans expedition and building on the scientific legacy of the MareChiara Long-Term Ecological Research (LTER-MC) site, NEREA integrates traditional physical, chemical, and biological measurements with state-of-the-art methodologies such as metabarcoding and metagenomics. Here we present the first 10 months of NEREA data, collected from April 2019 to January 2020, encompassing physico-chemical parameters, plankton biodiversity (e.g., microscopy and flow cytometry), prokaryotic and eukaryotic metabarcoding, a prokaryotic gene catalogue, and a collection of 3818 prokaryotic Metagenome-Assembled Genomes (MAGs). NEREA's efforts produce a significant volume of multifaceted data, which enhances our understanding of marine ecosystems and promotes the development of scientific hypotheses and ideas.
Title: The NEREA Augmented Observatory: an integrative approach to marine coastal ecology. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387787/pdf/
Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03823-x
Ran Yi, Shuai Chen, Mingfeng Guan, Chunyan Liao, Yao Zhu, Jacque Pak Kan Ip, Tao Ye, Yu Chen
Astrocytes, the predominant glial cells in the central nervous system, play essential roles in maintaining brain function. Reprogramming induced pluripotent stem cells (iPSCs) to become astrocytes through overexpression of the transcription factors, NFIB and SOX9, is a rapid and efficient approach for studying human neurological diseases and identifying therapeutic targets. However, the precise differentiation path and molecular signatures of induced astrocytes remain incompletely understood. Accordingly, we performed single-cell RNA sequencing analysis on 64,736 cells to establish a comprehensive atlas of NFIB/SOX9-directed astrocyte differentiation from human iPSCs. Our dataset provides detailed information about the path of astrocyte differentiation, highlighting the stepwise molecular changes that occur throughout the differentiation process. This dataset serves as a valuable reference for dissecting uncharacterized transcriptomic features of NFIB/SOX9-induced astrocytes and investigating lineage progression during astrocyte differentiation. Moreover, these findings pave the way for future studies on neurological diseases using the NFIB/SOX9-induced astrocyte model.
Title: A single-cell transcriptomic dataset of pluripotent stem cell-derived astrocytes via NFIB/SOX9 overexpression. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387634/pdf/
Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03687-1
Callum V Bucklow, Martin J Genner, George F Turner, James Maclaine, Roger Benson, Berta Verd
Here we describe a dataset of freely available, readily processed, whole-body μCT-scans of 56 species (116 specimens) of Lake Malawi cichlid fishes that captures a considerable majority of the morphological variation present in this remarkable adaptive radiation. We contextualise the scanned specimens within a discussion of their respective ecomorphological groupings and suggest possible macroevolutionary studies that could be conducted with these data. In addition, we describe a methodology to efficiently μCT-scan (on average) 23 specimens per hour, limiting scanning time and alleviating the financial cost whilst maintaining high resolution. We demonstrate the utility of this method by reconstructing 3D models of multiple bones from multiple specimens within the dataset. We hope this dataset will enable further morphological study of this fascinating system and permit wider-scale comparisons with other cichlid adaptive radiations.
Title: A whole-body micro-CT scan library that captures the skeletal diversity of Lake Malawi cichlid fishes. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387623/pdf/
Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03751-w
Vasileios Gkinis, Sarah Jackson, Nerilie J Abram, Christopher Plummer, Thomas Blunier, Margaret Harlan, Helle Astrid Kjær, Andrew D Moy, Kerttu Maria Peensoo, Thea Quistgaard, Anders Svensson, Tessa R Vance
We report high-resolution measurements of the stable water isotope ratios (δ18O, δD) from the Mount Brown South ice core (MBS, 69.11° S, 86.31° E). The record covers the period 873–2009 CE with sub-annual temporal resolution. Preliminary analyses of surface cores have shown that the Mount Brown South site has relatively high annual snowfall accumulation (0.3 metres ice equivalent) with a seasonal bias toward lower snowfall during austral summer. Precipitation at the site is frequently related to intense, short-term synoptic-scale events from the mid-latitudes of the southern Indian Ocean. Higher snowfall regimes are associated with easterly winds, while lower snowfall regimes are associated with south-easterly winds. Isotope ratios are measured with Infra-Red Cavity Ring Down Spectroscopy, calibrated on the VSMOW/SLAP scale and reported on the MBS2023 time scale interpolated accordingly. We provide estimates for measurement precision and internal accuracy for δ18O and δD.
Title: An East Antarctic, sub-annual resolution water isotope record from the Mount Brown South Ice core. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387611/pdf/
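The δ values in the Mount Brown South abstract use the usual delta notation, in which a sample's isotope ratio is expressed as a per-mil deviation from a reference ratio. Placing measurements on a VSMOW/SLAP scale is commonly done with a two-point linear normalization. The sketch below shows that generic procedure only. The abstract does not describe the authors' actual calibration workflow, so the function names, the argument layout, and the numbers in the usage example are all illustrative assumptions.

```python
def delta_permil(r_sample, r_standard):
    """Delta notation: per-mil deviation of a sample isotope ratio
    (e.g. 18O/16O or D/H) from the ratio of a reference standard."""
    return (r_sample / r_standard - 1.0) * 1000.0


def normalize_two_point(delta_raw, raw_vsmow, raw_slap, accepted_slap):
    """Generic two-point linear normalization (assumed, not the paper's code).

    Maps the instrument's raw delta scale onto a scale where the VSMOW
    standard reads 0 per mil and the SLAP standard reads `accepted_slap`.
    `raw_vsmow` and `raw_slap` are the raw readings of the two standards;
    `accepted_slap` is supplied by the caller from the reference sheet.
    """
    scale = accepted_slap / (raw_slap - raw_vsmow)
    return (delta_raw - raw_vsmow) * scale


# Hypothetical raw readings, for illustration only.
raw_vsmow, raw_slap = 0.4, -54.0
accepted_slap = -55.5  # caller-supplied value; check the current reference sheet
print(normalize_two_point(-30.0, raw_vsmow, raw_slap, accepted_slap))
```

Passing the accepted SLAP value in as an argument, rather than hard-coding it, keeps the sketch usable for both δ18O and δD.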
Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03797-w
Stephan Getzmann, Patrick D Gajewski, Daniel Schneider, Edmund Wascher
This dataset consists of 64-channel resting-state EEG recordings of 608 participants aged between 20 and 70 years (61.8% female), as well as follow-up measurements of 208 participants after approximately 5 years, starting in 2021. The EEG was measured for three minutes with eyes open and eyes closed before and after a 2-hour block of cognitive experimental tasks. The dataset is part of the Dortmund Vital Study, a prospective study on the determinants of healthy cognitive aging. The dataset can be used for (1) analyzing cross-sectional resting-state EEG of healthy individuals across the adult life span; (2) generating normalization data sets for comparison of resting-state EEG data of patients with clinically relevant disorders; (3) studying effects of performing cognitive tasks on resting-state EEG and age; (4) exploring intra-individual changes in resting-state EEG and effects of task performance over a time period of about 5 years. The data are provided in Brain Imaging Data Structure (BIDS) format and are available on OpenNeuro.
Title: Resting-state EEG data before and after cognitive activity across the adult lifespan and a 5-year follow-up. Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387823/pdf/
Accurately predicting ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties early in drug development is essential for selecting compounds with optimal pharmacokinetics and minimal toxicity. Existing ADMET-related benchmark sets are limited in utility due to their small dataset sizes and the lack of representation of compounds used in drug discovery projects. These shortcomings hinder their application in model building for drug discovery. To address this issue, we propose a multi-agent data mining system based on Large Language Models that effectively identifies experimental conditions within 14,401 bioassays. This approach facilitates merging entries from different sources, culminating in the creation of PharmaBench. Additionally, we have developed a data processing workflow to integrate data from various sources, resulting in 156,618 raw entries. Through this workflow, we constructed PharmaBench, a comprehensive benchmark set for ADMET properties, which comprises eleven ADMET datasets and 52,482 entries. This benchmark set is designed to serve as an open-source dataset for the development of AI models relevant to drug discovery projects.
Title: PharmaBench: Enhancing ADMET benchmarks with large language models. Authors: Zhangming Niu, Xianglu Xiao, Wenfan Wu, Qiwei Cai, Yinghui Jiang, Wangzhen Jin, Minhao Wang, Guojian Yang, Lingkang Kong, Xurui Jin, Guang Yang, Hongming Chen. Pub Date: 2024-09-10 DOI: 10.1038/s41597-024-03793-0 Journal: Scientific Data. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11387650/pdf/