Data in Brief: Latest Articles

Process control block information dataset: Towards android malware detection
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-26 DOI: 10.1016/j.dib.2024.110975
Heba Alawneh, Hamza Alkofahi
This article proposes a Process Control Block (PCB) dataset [1] mined over the process execution time of tested Android applications. PCB data were collected from 2620 malware-infested applications and 1610 benign applications. The PCB data sequence was collected for 25 seconds, with an average of 18,500 PCB records stored for each application. The mining method was implemented at the kernel level and synced with process (job) context switching. The data for each program comprise the PCB information for all threads running the application. Automated application testing and PCB gathering for benign and malicious applications were conducted in a closed dynamic malware analysis framework. The dataset can be used to compare and contrast the low-level (kernel) behavior of benign and malicious Android programs. For the vast majority of tested applications, the mining approach effectively captured 99% of the context switches.
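As a rough sanity check on the dataset's scale, the figures quoted in the abstract can be combined as follows. This is a back-of-the-envelope sketch only: the variable names are illustrative, and the total record count is an estimate derived from the stated per-application average, not a number reported by the authors.

```python
# Figures quoted in the abstract above.
MALWARE_APPS = 2620
BENIGN_APPS = 1610
CAPTURE_SECONDS = 25
AVG_RECORDS_PER_APP = 18_500

# Derived quantities (estimates, not reported values).
total_apps = MALWARE_APPS + BENIGN_APPS
malware_share = MALWARE_APPS / total_apps
records_per_second = AVG_RECORDS_PER_APP / CAPTURE_SECONDS
approx_total_records = total_apps * AVG_RECORDS_PER_APP

print(f"applications:        {total_apps}")
print(f"malware share:       {malware_share:.1%}")
print(f"records per second:  {records_per_second:.0f}")
print(f"approx. PCB records: {approx_total_records:,}")
```

The class split is uneven (roughly three malicious applications for every two benign ones), which is worth keeping in mind when choosing evaluation metrics for a detector trained on this data.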
Citations: 0
Perma_Crops_PT: A geolocated dataset for permanent crops in Portugal
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-25 DOI: 10.1016/j.dib.2024.110971
Helder Fraga, Teresa Freitas, Nathalie Guimarães, João A. Santos
Crop landcover datasets are crucial for modern agriculture, aiding farmers, researchers, policymakers, and stakeholders. These databases offer extensive insights into crop distribution, facilitating informed decision-making for sustainable practices, particularly under a changing climate. Moreover, these datasets drive research, fostering collaborations and innovation for resilient agriculture. In Portugal, the COS dataset is vital, offering insights into agrarian landscapes and supporting sustainable practices. However, in recent versions, since 2007, information on permanent crops has been aggregated, necessitating complementary datasets and tools. The current paper addresses this gap by providing an open-source dataset focusing on perennial crops in mainland Portugal. Based on the 2019 agricultural census from the Portuguese Statistical Institute (INE), this dataset contributes to the spatial understanding of permanent crop distribution, being freely available for researchers, farmers and policymakers. The dataset includes a selection of perennial crops commonly cultivated in Portugal, such as Prunus dulcis (Almond), Malus domestica (Apple), Castanea sativa (Chestnut), Ceratonia siliqua (Carob), Prunus avium (Sweet Cherry), Vitis vinifera (Grapevine), Olea europaea (Olive), Citrus limon (Lemon), Citrus sinensis (Sweet Orange), Juglans regia (Walnut), Citrus reticulata (Mandarin), Prunus persica (Peach), Pyrus communis (Pear), and Prunus domestica (Plum). Further information regarding the Administrative Units of each crop is also available. This comprehensive list provides a detailed overview of the types of permanent crops included in the dataset, offering valuable insights into the Portuguese agricultural landscape.
Citations: 0
Data on the analysis of draft genome sequence of Raoultella ornithinolytica isolate carrying antimicrobial resistance genes, plasmid and CRISPR-Cas system
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-24 DOI: 10.1016/j.dib.2024.110973
Anna Karpenko, Yulia Mikhaylova, Andrey Shelenkov, Aleksey Tutelyan, Vasiliy Akimkin
The environmental bacterial species Raoultella ornithinolytica is an emerging pathogen of increasing importance in human infections. Thus far, clinical isolates of this species have rarely exhibited multidrug resistance, but some reports underline the need for continuous monitoring of this potentially dangerous pathogen. Currently, epidemiological surveillance and antimicrobial resistance investigations of any bacterial pathogen usually rely on whole genome sequencing, which has become more affordable in recent years while providing increasingly important data. However, R. ornithinolytica genomic information is sparsely represented in public databases. Here, we report, to the best of our knowledge, the first whole genome sequence and corresponding raw data for a clinical R. ornithinolytica isolate from the Russian Federation, which carried antimicrobial resistance (AMR) genes, virulence factors, one plasmid, and a type I-F CRISPR-Cas system. The data provided will facilitate epidemiological surveillance and antimicrobial resistance monitoring of this emerging pathogen.
Citations: 0
Draft genome sequence data of Fusarium verticillioides strain REC01, a phytopathogen isolated from a Peruvian maize
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-23 DOI: 10.1016/j.dib.2024.110951
Richard Estrada, Liliana Aragón, Wendy E. Pérez, Yolanda Romero, Gabriel Martínez, Karina Garcia, Juancarlos Cruz, Carlos I. Arbizu
Fusarium verticillioides represents a major phytopathogenic threat to maize crops worldwide. In this study, we present genomic sequence data of a phytopathogen isolated from a maize stem showing obvious signs of vascular rot. Using rigorous microbiological identification techniques, we correlated the disease symptoms observed in an affected maize region with the presence of the pathogen. Subsequently, the pathogen was cultured in a suitable fungal growth medium and extensive morphological characterization was performed. In addition, a pathogenicity test was carried out in a DCA model with three treatments and seven repetitions. De novo assembly from Illumina NovaSeq 6000 sequencing yielded 456 contigs, which together constitute a 42.8 Mb genome assembly with a GC content of 48.26%. Subsequent comparative analyses were performed with other Fusarium genomes available in the NCBI database.
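The assembly summary quoted above (contig count, total length, GC content) is the kind of statistic that can be recomputed from a list of contig sequences. The sketch below is illustrative only: the function names and the toy contigs are hypothetical, it does not read the actual REC01 assembly, and N50 is included as a common companion statistic that the abstract itself does not report. The mean contig length assumes 1 Mb = 1,000,000 bp.

```python
def gc_percent(contigs):
    """GC content as a percentage of all A/C/G/T bases across the contigs."""
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)
    acgt = sum(sum(seq.upper().count(base) for base in "ACGT") for seq in contigs)
    return 100.0 * gc / acgt if acgt else 0.0


def n50(contigs):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    lengths = sorted((len(seq) for seq in contigs), reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


# Hypothetical toy contigs, used only to exercise the functions.
toy_contigs = ["ATGCGC", "GGCC", "AT"]
print(len(toy_contigs), sum(len(seq) for seq in toy_contigs))
print(round(gc_percent(toy_contigs), 2), n50(toy_contigs))

# Mean contig length implied by the abstract's figures (42.8 Mb over 456 contigs).
mean_contig_bp = 42.8e6 / 456
print(round(mean_contig_bp))
```

In practice the contigs would be parsed from the deposited FASTA file; a plain list of strings is used here to keep the sketch self-contained.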
Citations: 0
A semi-labelled dataset for fault detection in air handling units from a large-scale office
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110956
Seunghyeon Wang, Ikchul Eum, Sangkyun Park, Jaejun Kim
Fault detection and diagnosis (FDD) in Air Handling Units (AHUs) supports building functions such as energy efficiency and occupant comfort by quickly identifying and diagnosing faults. Combining deep learning with FDD has demonstrated high generalization ability in this field. To support the development of deep learning models, this research constructed a dataset from real data collected at a large-scale office in South Korea. The raw AHU data were extracted from the Building Management System (BMS) at 1-h intervals, spanning from November 2023 to May 2024. The dataset was partially labeled by annotation experts, who categorized the data into six types: normal condition, supply fan fault, total heating pump fault, return air temperature sensor fault, supply air temperature sensor fault, and valve position fault. Additionally, semi-supervised learning methods were applied to the constructed dataset as an application example. The main contributions of this dataset to the field are twofold. First, it is sourced from the real operational data of a large-scale office, a kind of dataset that is currently non-existent in this domain. Second, the dataset's expert labeling adds significant value by ensuring accurate fault classification. Therefore, we hope that this dataset will encourage the development of robust FDD techniques that are more suitable for real-world applications.
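To make the labeling scheme concrete, the six categories and the hourly sampling can be sketched as below. Several things here are assumptions rather than details from the article: the integer codes, the use of -1 to mark unlabeled rows (a common convention in semi-supervised tooling), and the reading of "November 2023 to May 2024" as full calendar months for a single AHU.

```python
from datetime import datetime

# The six categories named in the abstract; the integer codes are hypothetical.
FAULT_CLASSES = [
    "normal condition",
    "supply fan fault",
    "total heating pump fault",
    "return air temperature sensor fault",
    "supply air temperature sensor fault",
    "valve position fault",
]
UNLABELED = -1  # assumed marker for rows the experts did not annotate
LABEL_TO_ID = {name: index for index, name in enumerate(FAULT_CLASSES)}


def encode(label):
    """Map a class name to its integer code, or UNLABELED when no label exists."""
    return UNLABELED if label is None else LABEL_TO_ID[label]


# Hourly rows per AHU if the span covers whole months (assumption):
# 2023-11-01 00:00 up to, but not including, 2024-06-01 00:00.
start = datetime(2023, 11, 1)
end = datetime(2024, 6, 1)
hourly_rows = int((end - start).total_seconds() // 3600)

print(encode("supply fan fault"), encode(None), hourly_rows)
```

Keeping unlabeled rows in the same array as labeled ones, distinguished only by a sentinel code, lets a semi-supervised method consume the whole table without a separate file for unannotated data.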
Citations: 0
Metagenome assembly and annotation of data from the rhizosphere soil of drought-stressed CRN-3505 maize cultivar
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110966
Olubukola O. Babalola, Rebaona R. Molefe, Adenike E. Amoo
This data article reports shotgun metagenomic data obtained from the rhizosphere of drought-stressed maize, sequenced on the Illumina NovaSeq platform and processed using the KBase online platform. A total of 428,339,852 high-quality sequences were obtained, with an average GC content of 65.45%. The investigation, conducted at Molelwane farm in Mafikeng, South Africa, identified 13 metagenome-assembled genomes (MAGs). Functional annotation of these MAGs revealed their involvement in essential plant growth and development functions, such as sulfur and nitrogen metabolism. The dataset was deposited in the NCBI database, and the MAG accessions are available at DDBJ/ENA/GenBank under the accession number PRJNA101755.
Citations: 0
Transcriptomic dataset of Phaseolus vulgaris leaves in response to the inoculation of pathogenic Xanthomonas citri pv. fuscans and its type III secretion system-defective mutant hrcV
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110938
Christopher Gihaut, Chrystelle Brin, Martial Briand, Jérôme Verdier, Matthieu Barret, Thomas Roitsch, Tristan Boureau
Xanthomonas citri pv. fuscans (Xcf) and Xanthomonas phaseoli pv. phaseoli (Xpp) are responsible for Common Bacterial Blight (CBB), a major disease of common bean (Phaseolus vulgaris). The pathogenicity of Xcf and Xpp is known to depend on a functional Type III Secretion System (T3SS), which allows numerous bacterial Type III Effectors (T3Es) to be injected into plant cells. T3Es have been described as able to disrupt plant defence and manipulate plant metabolism.
In this work, we describe the transcriptomic response of one susceptible (Flavert) and one resistant (Vezer) cultivar of P. vulgaris to inoculation with the virulent strain Xcf CFBP4885 or its avirulent T3SS-defective hrcV mutant (CFBP13802).
Leaves of both bean cultivars were infiltrated with water or bacterial suspensions. Inoculated leaves were sampled at 24 or 48 h post inoculation (hpi). The experiment was independently repeated three times for total RNA extraction and sequencing analysis. Library construction and total RNA sequencing were performed on the BGISEQ-500 platform at the Beijing Genomics Institute (BGI, Hong Kong), generating an average of 24 million 100-bp paired-end reads per sample. FastQC was used to check read quality. Mapping was performed with the quasi-mapping mode of Salmon (version 1.2.1) against the Phaseolus vulgaris reference genome (version 2.1), revealing the expression profiles of 36,978 transcripts in leaf tissues.
Fastq raw data and count files from 36 samples are available in the Gene Expression Omnibus (GEO) repository of the National Center for Biotechnology Information (NCBI) under the accession number GSE271236.
This dataset is a valuable resource for investigating the role of T3Es in subverting the cellular functions of bean.
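The 36 samples are consistent with a full factorial design, which can be enumerated as a quick check. This is a sketch under assumptions: it supposes that water-infiltrated controls were sampled at both time points like the bacterial treatments, the sample naming scheme is hypothetical, and the total read count is an estimate from the stated per-sample average.

```python
from itertools import product

cultivars = ["Flavert", "Vezer"]                  # susceptible, resistant
treatments = ["water", "Xcf_CFBP4885", "hrcV_CFBP13802"]
timepoints_hpi = [24, 48]
replicates = [1, 2, 3]                            # three independent repeats

# Hypothetical sample identifiers, one per combination of factors.
samples = [
    f"{cultivar}_{treatment}_{hpi}hpi_rep{rep}"
    for cultivar, treatment, hpi, rep in product(
        cultivars, treatments, timepoints_hpi, replicates
    )
]

# Rough sequencing volume from the stated average of 24 million reads per sample.
approx_total_reads = len(samples) * 24_000_000

print(len(samples))
print(samples[0])
print(f"{approx_total_reads:,}")
```

Two cultivars, three treatments, two time points, and three repeats give 2 × 3 × 2 × 3 = 36 combinations, matching the number of samples deposited in GEO.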
Citations: 0
A comprehensive dental dataset of six classes for deep learning based object detection study
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110970
Rubaba Binte Rahman, Sharia Arfin Tanim, Nazia Alfaz, Tahmid Enam Shrestha, Md Saef Ullah Miah, M.F. Mridha
This article presents a dental dataset to support research on deep learning-based detection and classification of dental diseases. The dataset consists of 232 panoramic dental radiographs, categorized into six major classes: healthy teeth, caries, impacted teeth, infections, fractured teeth, and broken-down crowns/roots (BDC/BDR). The images were collected from three renowned private clinics in Dhaka, Bangladesh, with the help of an experienced dental practitioner who ensured patient confidentiality and high-quality data acquisition using a 64-megapixel Android phone camera. To enhance the value of the dataset for machine and deep learning applications, we applied Contrast-Limited Adaptive Histogram Equalization (CLAHE) for image enhancement and augmented the data. The images were annotated using the CVAT tool and reviewed by dental experts. This benchmark dataset is publicly available and provides a valuable resource for researchers in artificial intelligence, computer science, and dental informatics to promote interdisciplinary collaboration and the development of advanced algorithms for dental disease detection.
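The clip-and-redistribute histogram step at the core of CLAHE can be illustrated on a single tile in plain Python. This is a simplified sketch, not the authors' pipeline: the function name and the example tile are hypothetical, only one redistribution pass is made, and full CLAHE additionally splits the image into a grid of tiles and interpolates between neighbouring tile mappings, which is omitted here.

```python
def clipped_equalize_tile(tile, clip_limit, levels=256):
    """Equalize one tile (rows of ints in [0, levels)) using a clipped histogram."""
    # Histogram of grey levels in the tile.
    hist = [0] * levels
    for row in tile:
        for pixel in row:
            hist[pixel] += 1

    # Clip every bin at clip_limit and spread the excess evenly over all bins.
    # This limits how steep the mapping can get, which limits noise amplification.
    excess = sum(max(0, count - clip_limit) for count in hist)
    hist = [min(count, clip_limit) + excess / levels for count in hist]

    # The cumulative distribution of the clipped histogram is the lookup table.
    total = sum(hist)
    lookup, running = [], 0.0
    for count in hist:
        running += count
        lookup.append(round((levels - 1) * running / total))

    return [[lookup[pixel] for pixel in row] for row in tile]


# Hypothetical 2x4 tile dominated by one dark grey level.
tile = [[10, 10, 10, 200],
        [10, 10, 12, 255]]
enhanced = clipped_equalize_tile(tile, clip_limit=2)
print(enhanced)
```

Because the lookup table is a cumulative sum of non-negative counts, the mapping never reverses the brightness order of two pixels; the clip limit only controls how strongly crowded grey levels are stretched apart.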
Citations: 0
Facilitating spice recognition and classification: An image dataset of Indian spices
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110936
Sandip Thite, Deepali Godse, Kailas Patil, Prawit Chumchu, Alfa Nyandoro
This data paper presents a comprehensive visual dataset of 19 distinct types of Indian spices, consisting of high-quality images meticulously curated to facilitate various research and educational applications. The dataset includes extensive imagery of the following spices: Asafoetida, Bay Leaf, Black Cardamom, Black Pepper, Caraway Seeds, Cinnamon Stick, Cloves, Coriander Seeds, Cubeb Pepper, Cumin Seeds, Dry Ginger, Dry Red Chilly, Fennel Seeds, Green Cardamom, Mace, Nutmeg, Poppy Seeds, Star Anise, and Stone Flowers. Each image in the dataset has been captured under controlled conditions to ensure consistency and clarity, making it an invaluable resource for studies in food science, agriculture, and culinary arts. The dataset can also support machine learning and computer vision applications, such as spice recognition and classification. By providing detailed visual documentation, this dataset aims to promote a deeper understanding and appreciation of the rich diversity of Indian spices.
Citations: 0
Mobile brain–body imaging data set of indoor treadmill walking and outdoor walking with a visual search task
IF 1 Q3 MULTIDISCIPLINARY SCIENCES Pub Date : 2024-09-21 DOI: 10.1016/j.dib.2024.110968
Grant M. Hanada, Marija Kalabic, Daniel P. Ferris
To fully understand brain processes in the real world, it is necessary to record and quantitatively analyse brain processes during real world human experiences. Mobile electroencephalography (EEG) and physiological data sensors provide new opportunities for studying humans outside of the laboratory. The purpose of this study was to document data from high-density EEG and mobile physiological sensors while humans performed a visual search task both on a treadmill in a laboratory setting and overground in a natural outdoor setting. The data set includes 49 young, healthy participants on an outdoor arboretum path and on a treadmill in a laboratory with a large virtual reality screen. The data provide a valuable research tool for scientists interested in signal processing, electrocortical brain processes, mobile brain imaging, and brain-computer interfaces based on mobile EEG. Given the comparison data between laboratory and real world conditions, researchers can test the viability of new processing algorithms across conditions or investigate changes in electrocortical activity related to behavioural dynamics coded into the data.
Citations: 0