
Data: Latest Articles

Comprehensive Dataset on Pre-SARS-CoV-2 Infection Sports-Related Physical Activity Levels, Disease Severity, and Treatment Outcomes: Insights and Implications for COVID-19 Management
Pub Date : 2024-01-26 DOI: 10.3390/data9020023
Dimitrios I. Bourdas, Panteleimon Bakirtzoglou, Antonios K. Travlos, Vasileios Andrianopoulos, E. Zacharakis
This dataset aimed to explore associations between pre-SARS-CoV-2 infection exercise and sports-related physical activity (PA) levels and disease severity, along with treatments administered following the most recent SARS-CoV-2 infection. A comprehensive analysis investigated the relationships between PA categories (“Inactive”, “Low PA”, “Moderate PA”, “High PA”), disease severity (“Sporadic”, “Episodic”, “Recurrent”, “Frequent”, “Persistent”), and treatments post-SARS-CoV-2 infection (“No treatment”, “Home remedies”, “Prescribed medication”, “Hospital admission”, “Intensive care unit admission”) within a sample population (n = 5829) from the Hellenic territory. Data were collected with the Active-Q questionnaire from February to March 2023, capturing PA habits, participant characteristics, medical history, vaccination status, and illness experiences. Findings revealed that preinfection PA levels and disease severity were statistically independent (χ² = 9.097, df = 12, p = 0.695). In contrast, a statistical dependency emerged between PA levels and illness treatment categories (χ² = 39.362, df = 12, p < 0.001), particularly linking the inactive PA category with home remedies treatment. These results point to a potential link between preinfection PA and treatment choices following SARS-CoV-2 infection, while no association with disease severity was detected. The dataset offers insights into the interplay between PA, disease outcomes, and treatment decisions, aiding future research in shaping targeted interventions and public health strategies related to COVID-19 management.
Citations: 0
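The chi-square statistics quoted in the abstract above come from a test of independence on a contingency table. The following is a minimal sketch of that calculation in plain Python, not the authors' analysis code: the function name `chi_square_independence` and the counts in `toy_table` are hypothetical and chosen only for illustration. It shows why a table of 4 PA categories by 5 outcome categories gives df = 12.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts (NOT from the dataset): 4 PA categories (rows)
# by 5 treatment categories (columns).
toy_table = [
    [30, 40, 20, 8, 2],
    [35, 30, 25, 8, 2],
    [40, 28, 22, 8, 2],
    [45, 25, 20, 8, 2],
]
chi2, dof = chi_square_independence(toy_table)
print(f"chi2 = {chi2:.3f}, df = {dof}")
```

The p-value is then read from the chi-square distribution with `dof` degrees of freedom; a large p-value, such as the 0.695 reported for disease severity, means the data are consistent with independence.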
Genomic Epidemiology Dataset for the Important Nosocomial Pathogenic Bacterium Acinetobacter baumannii
Pub Date : 2024-01-26 DOI: 10.3390/data9020022
A. Shelenkov, Yu. D. Mikhaylova, Vasiliy Akimkin
Infections caused by bacterial pathogens in both clinical and community settings represent a significant threat to public health worldwide. The growing antimicrobial resistance of bacterial species that cause healthcare-associated infections has become a life-threatening danger recognized by the World Health Organization. Several groups or lineages of bacterial isolates, usually called ‘high-risk clones’, often drive the spread of resistance within particular species. Thus, it is vitally important to detect and track the spread of such clones and the mechanisms by which they acquire antibiotic resistance and improve their survival. Currently, the analysis of whole-genome sequences for bacterial isolates of interest is increasingly used for these purposes, including epidemiological surveillance and the development of spread prevention measures. However, the availability and uniformity of the data derived from genomic sequences often represent a bottleneck for such investigations. With this dataset, we present the results of a genomic epidemiology analysis of 17,546 genomes of a dangerous bacterial pathogen, Acinetobacter baumannii. Important typing information, including multilocus sequence typing (MLST)-based sequence types (STs), intrinsic blaOXA-51-like gene variants, capsular (KL) and oligosaccharide (OCL) types, CRISPR-Cas systems, and cgMLST profiles, is presented, as well as the assignment of particular isolates to nine known international high-risk clones. The presence of antimicrobial resistance genes within the genomes is also reported. These data will be useful for researchers in the field of A. baumannii genomic epidemiology, resistance analysis, and prevention measure development.
Citations: 0
MHAiR: A Dataset of Audio-Image Representations for Multimodal Human Actions
Pub Date : 2024-01-25 DOI: 10.3390/data9020021
M. Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar
The audio-image representations for multimodal human actions (MHAiR) dataset contains six different image representations of audio signals that capture the temporal dynamics of actions in a compact and informative way. The dataset was extracted from audio recordings taken from an existing video dataset, UCF101. Each data sample is approximately 10 s long, and the overall dataset is split into 4893 training samples and 1944 testing samples. The resulting feature sequences were converted into images, which can be used for human action recognition and related tasks and can serve as a benchmark dataset for evaluating machine learning models on those tasks. These audio-image representations could be suitable for a wide range of applications, such as surveillance, healthcare monitoring, and robotics. The dataset can also be used for transfer learning, where pre-trained models are fine-tuned on a specific task using specific audio images. Thus, this dataset can facilitate the development of new techniques for improving the accuracy of human action-related tasks and can serve as a standard benchmark for testing different machine learning models and algorithms.
Citations: 0
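The MHAiR abstract above does not name its six image representations, so the following is only a generic illustration of how an audio signal can be turned into a 2D array that can be treated as an image. It is a minimal sketch of a magnitude spectrogram using a naive discrete Fourier transform in plain Python; the function name `spectrogram`, the frame parameters, and the test tone are all assumptions made for this example.

```python
import cmath
import math


def spectrogram(signal, frame_size, hop):
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    rows = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        row = []
        # Naive DFT over the non-negative frequency bins of this frame.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            row.append(abs(coeff))
        rows.append(row)
    return rows


# Hypothetical test tone: 4 cycles per 32-sample frame, so energy sits in bin 4.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
image = spectrogram(tone, frame_size=32, hop=16)
print(len(image), "frames x", len(image[0]), "bins")
```

In practice a fast Fourier transform from a numerical library would replace the naive loop; the quadratic-time version is used here only to keep the sketch dependency-free.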
An Optimized Hybrid Approach for Feature Selection Based on Chi-Square and Particle Swarm Optimization Algorithms
Pub Date : 2024-01-25 DOI: 10.3390/data9020020
A. Abdo, Rasha Mostafa, Laila Abdelhamid
Feature selection is a significant issue in the machine learning process. Most datasets include features that are not needed for the problem being studied. These irrelevant features reduce both the efficiency and accuracy of the algorithm. It is possible to think about feature selection as an optimization problem. Swarm intelligence algorithms are promising techniques for solving this problem. This research paper presents a hybrid approach for tackling the problem of feature selection. A filter method (chi-square) and two wrapper swarm intelligence algorithms (grey wolf optimization (GWO) and particle swarm optimization (PSO)) are used in two different techniques to improve feature selection accuracy and system execution time. The performance of the two phases of the proposed approach is assessed using two distinct datasets. The results show that PSOGWO yields a maximum accuracy boost of 95.3%, while chi2-PSOGWO yields a maximum accuracy improvement of 95.961% for feature selection. The experimental results show that the proposed approach performs better than the compared approaches.
Citations: 0
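The filter stage mentioned in the abstract above (chi-square) can be sketched as follows. This is an illustrative plain-Python version under stated assumptions, not the paper's implementation: the helper names `chi2_score` and `select_top_k` and the toy data are hypothetical, and the wrapper stage (PSO/GWO searching over feature subsets) is deliberately left out.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Chi-square statistic between one categorical feature and the class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    score = 0.0
    for f_val, f_n in feature_counts.items():
        for l_val, l_n in label_counts.items():
            expected = f_n * l_n / n
            observed = joint_counts.get((f_val, l_val), 0)
            score += (observed - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    ranked = sorted(
        columns, key=lambda name: chi2_score(columns[name], labels), reverse=True
    )
    return ranked[:k]


# Hypothetical toy data: "f_signal" tracks the label, "f_noise" does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f_noise": [0, 1, 0, 1, 0, 1, 0, 1],
    "f_signal": [0, 0, 0, 0, 1, 1, 1, 1],
}
selected = select_top_k(columns, labels, k=1)
print(selected)  # ['f_signal']
```

A filter like this is cheap because it scores each feature once; a wrapper method would then evaluate candidate subsets of the surviving features with an actual classifier, which is more expensive but accounts for feature interactions.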
Draft Genome Sequence of the Commercial Strain Rhizobium ruizarguesonis bv. viciae RCAM1022
Pub Date : 2024-01-23 DOI: 10.3390/data9020019
O. Kulaeva, E. Zorin, A. Sulima, G. Akhtemova, Vladimir A Zhukov
Legume plants enter a symbiosis with soil nitrogen-fixing bacteria (rhizobia), thereby gaining access to assimilable atmospheric nitrogen. Since this symbiosis is important for agriculture, biofertilizers with effective strains of rhizobia are created for crop legumes to increase their yield and minimize the amounts of mineral fertilizers required. In this work, we sequenced and characterized the genome of Rhizobium ruizarguesonis bv. viciae strain RCAM1022, a component of the ‘Rhizotorfin’ biofertilizer produced in Russia and used for pea (Pisum sativum L.).
Citations: 0
Can Data and Machine Learning Change the Future of Basic Income Models? A Bayesian Belief Networks Approach
Pub Date : 2024-01-23 DOI: 10.3390/data9020018
Hamed Khalili
Calls for governments to implement basic income are a contemporary phenomenon. In theory, basic income prescribes transferring equal amounts to individuals irrespective of their specific attributes. However, the most recent basic income initiatives around the world are tied to rules concerning household attributes. This approach faces significant challenges in appropriately identifying vulnerable groups. A possible alternative to setting rules based on household welfare attributes is to employ artificial intelligence algorithms that can process unprecedented amounts of data. Can integrating machine learning change the future of basic income by predicting households vulnerable to future poverty? In this paper, we utilize multidimensional, longitudinal welfare data on one and a half million individuals and a Bayesian belief network approach to examine the feasibility of predicting households’ vulnerability to future poverty from existing household welfare attributes.
Citations: 0
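To make the Bayesian belief network idea in the abstract above concrete, here is a minimal sketch of inference by enumeration on a three-node network. Everything in it is hypothetical: the variables (`low_income`, `unemployed`, future poverty), the network structure, and every probability are invented for illustration and are not taken from the paper or its data.

```python
from itertools import product

# Hypothetical priors and conditional probability table (illustrative only).
P_LOW_INCOME = 0.3
P_UNEMPLOYED = 0.1
P_POOR_GIVEN = {  # P(future poverty | low_income, unemployed)
    (True, True): 0.8,
    (True, False): 0.4,
    (False, True): 0.3,
    (False, False): 0.05,
}


def joint(low_income, unemployed, poor):
    """Joint probability from the factorisation P(L) * P(U) * P(Poor | L, U)."""
    p = P_LOW_INCOME if low_income else 1 - P_LOW_INCOME
    p *= P_UNEMPLOYED if unemployed else 1 - P_UNEMPLOYED
    p_poor = P_POOR_GIVEN[(low_income, unemployed)]
    return p * (p_poor if poor else 1 - p_poor)


def prob_poor(evidence=None):
    """P(poor = True | evidence), summing the joint over all consistent states."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for low_income, unemployed, poor in product([True, False], repeat=3):
        state = {"low_income": low_income, "unemployed": unemployed}
        if any(state[name] != value for name, value in evidence.items()):
            continue  # skip assignments that contradict the evidence
        p = joint(low_income, unemployed, poor)
        denominator += p
        if poor:
            numerator += p
    return numerator / denominator


print(round(prob_poor(), 4))                      # 0.1845
print(round(prob_poor({"low_income": True}), 4))  # 0.44
```

Enumeration is exponential in the number of variables, so networks learned from large welfare datasets would rely on dedicated inference algorithms; the sketch only shows how evidence about household attributes updates the predicted probability.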
Elliott State Research Forest Timber Cruise, Oregon, 2015–2016
Pub Date : 2024-01-18 DOI: 10.3390/data9010016
Todd West, Bogdan M. Strimbu
The Elliott State Research Forest comprises 33,700 ha of temperate, Douglas-fir rainforest along North America’s Pacific Coast (Oregon, United States). In 2015, naturally regenerated stands at least 92 years old covered 49% of the research area and sawtimber plantations younger than 68 years another 50%. During the winter of 2015–2016, a forest-wide inventory sampled both naturally regenerated and plantation stands, recording 97,424 trees on 17,866 plots in 738 stands. The resulting dataset is atypical for the area as plot locations were not restricted to upland, commercially harvestable timber. Multiage stands and riparian areas were therefore documented along with plantations 2–61 years old and trees retained through clearcut harvests. This dataset constitutes the only open access, stand-based forest inventory currently available for a large area within the Oregon Coast Range. The dataset enables development of suites of models as well as many comparisons across stand ages and types, both at stand level and at the level of individual trees.
Citations: 0
Machine Learning Classification Workflow and Datasets for Ionospheric VLF Data Exclusion
Pub Date : 2024-01-18 DOI: 10.3390/data9010017
Filip Arnaut, A. Kolarski, V. Srećković
Machine learning (ML) methods are commonly applied in the fields of extraterrestrial physics, space science, and plasma physics. In a prior publication, an ML classification technique, the Random Forest (RF) algorithm, was utilized to automatically identify and categorize erroneous signals, including instrument errors, noisy signals, outlier data points, and the impact of solar flares (SFs) on the ionosphere. This data communication includes the pre-processed dataset used in the aforementioned research, along with a workflow that utilizes the PyCaret library and a post-processing workflow. The code and data serve educational purposes in the interdisciplinary field of ML and ionospheric physics science, as well as being useful to other researchers for diverse objectives.
Citations: 0
Proteomic and Metabolomic Analyses of the Blood Samples of Highly Trained Athletes
Pub Date : 2024-01-16 DOI: 10.3390/data9010015
K. Malsagova, A. Kopylov, V. Pustovoyt, E. I. Balakin, Ksenia A. Yurku, A. Stepanov, L. Kulikova, V. Rudnev, A. Kaysheva
High exercise loading causes intricate and ambiguous proteomic and metabolic changes. This study aims to describe the dataset on protein and metabolite contents in plasma samples collected from highly trained athletes across different sports disciplines. The proteomic and metabolomic analyses of the plasma samples of highly trained athletes engaged in sports disciplines of different intensities were carried out using HPLC-MS/MS. The results are reported as two datasets (proteomic data in a derived mgf-file and metabolomic data in processed format), each containing the findings obtained by analyzing 93 mass spectra. Variations in the protein and metabolite contents of the biological samples are observed, depending on the intensity of training load for different sports disciplines. Mass spectrometric proteomic and metabolomic studies can be used for classifying different athlete phenotypes according to the intensity of sports discipline and for the assessment of the efficiency of the recovery period.
Citations: 0
GeMSyD: Generic Framework for Synthetic Data Generation
Pub Date : 2024-01-11 DOI: 10.3390/data9010014
Ramona Tolas, Raluca Portase, R. Potolea
In the era of data-driven technologies, the need for diverse and high-quality datasets for training and testing machine learning models has become increasingly critical. In this article, we present a versatile methodology, the Generic Methodology for Constructing Synthetic Data Generation (GeMSyD), which addresses the challenge of synthetic data creation in the context of smart devices. GeMSyD provides a framework that enables the generation of synthetic datasets, aligning them closely with real-world data. To demonstrate the utility of GeMSyD, we instantiate the methodology by constructing a synthetic data generation framework tailored to the domain of event-based data modeling, specifically focusing on user interactions with smart devices. Our framework leverages GeMSyD to create synthetic datasets that faithfully emulate the dynamics of human–device interactions, including the temporal dependencies. Furthermore, we showcase how the synthetic data generated using our framework can serve as a valuable resource for machine learning practitioners. By employing these synthetic datasets, we perform a series of experiments to evaluate the performance of a neural-network-based prediction model in the domain of smart device interaction. Our results underscore the potential of synthetic data in facilitating model development and benchmarking.
Citations: 0