Pub Date: 2024-10-03 DOI: 10.1016/j.dib.2024.110989
Sarkhel H. Taher Karim
This article presents a compilation of 5108 Central Kurdish comments collected from YouTube and Facebook. The dataset was compiled to investigate public perceptions of Misyar marriage, a non-traditional form of marriage, in the Kurdistan region. Comments were gathered from specific public pages on these platforms over a 135-day collection period. The dataset has two columns: sentiments and comments. The sentiments column assigns each comment one of eight sentiment labels: Positive, Negative, Neutral, Sarcastic or Humorous, Suggestive, Dismissive, Skeptical, and Curious. The comments column contains the text of the comments in Central Kurdish. To improve the quality and uniformity of the data, extensive preprocessing was applied, including noise removal, character replacement, and space adjustments.
Researchers interested in sentiment analysis, social media studies, Islamic studies, and Kurdish cultural practices will find the dataset to be a useful resource. It can be used for sentiment analysis, trend analysis, linguistic studies, and other analyses. It provides insights into the public discourse surrounding Misyar marriage. The labeled data can aid in the creation of machine learning models and further our knowledge of societal perceptions of emerging religious trends.
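As a first sanity check on a two-column corpus like this one, a reader might count comments per label and confirm that only the eight listed labels occur. The following is a minimal sketch, assuming the data is exported as CSV with header names `comments` and `sentiments`; the exact header spelling and the sample rows are assumptions for illustration, not taken from the dataset.

```python
import csv
import io
from collections import Counter

# The eight sentiment labels listed in the dataset description.
LABELS = {
    "Positive", "Negative", "Neutral", "Sarcastic or Humorous",
    "Suggestive", "Dismissive", "Skeptical", "Curious",
}


def label_distribution(csv_text, label_col="sentiments"):
    """Count comments per label; raise if a label is not one of the eight."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row[label_col].strip()
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts


# Placeholder rows (hypothetical text, not real dataset content).
SAMPLE = (
    "comments,sentiments\n"
    "text a,Positive\n"
    "text b,Curious\n"
    "text c,Positive\n"
)

if __name__ == "__main__":
    for label, n in label_distribution(SAMPLE).most_common():
        print(label, n)
```

A label distribution of this kind is usually worth inspecting before training a classifier, since an eight-class scheme can be heavily imbalanced.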
Title: Kurdish social media sentiment corpus: Misyar marriage perspectives. Journal: Data in Brief, vol. 57, Article 110989.
Pub Date: 2024-10-03 DOI: 10.1016/j.dib.2024.110993
Javed Ali Khan, Nek Dil Khan, Muhammad Yaqoob, Affan Yasin, Ayed Alwadain
End-user feedback from app stores and the Twitter (X) platform has recently been used intensively for software development and evolution. However, Reddit forums, which provide a platform for users to argue and reason about software features and issues, have rarely been explored in the literature for software evolution and improvement. Therefore, this study explores Reddit forums as an alternative source for software evolution compared to app stores, Twitter (X), and Amazon reviews. For this purpose, a Python script was developed to extract end-user discussions related to the Google Maps (GM) app from Reddit forums using the Python PRAW API, keeping the original argumentative structure of the user discussions. In total, 3119 end-user discussions from seven related topics about the GM app were extracted for software evolution. The dataset includes detailed end-user feedback and associated metadata, including comment IDs, parent IDs, author names, timestamps, and upvotes. It is a valuable resource for software vendors, developers, researchers, and educationists seeking to identify new features to include in upcoming app versions. It is also important for better understanding recently occurring issues because, unlike in app stores, users on Reddit debate these issues and provide their justifications. Moreover, the replication package and process of the dataset can enable software researchers, vendors, and developers to extract data from Reddit forums and use it in the software evolution and improvement process.
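Because each record carries a comment ID and a parent ID, the argumentative structure can be rebuilt offline without calling Reddit again. The sketch below is one possible way to do this under stated assumptions: the field names `comment_id` and `parent_id` are hypothetical, IDs are assumed to be already normalized (raw Reddit parent IDs typically carry a type prefix that would need stripping first), and the sample records are invented.

```python
from collections import defaultdict


def build_threads(records):
    """Return (roots, children): top-level comment IDs and a parent -> replies map."""
    known_ids = {r["comment_id"] for r in records}
    children = defaultdict(list)
    roots = []
    for r in records:
        parent = r["parent_id"]
        if parent in known_ids:
            children[parent].append(r["comment_id"])
        else:
            # Parent is not a comment in this set, so treat it as top-level.
            roots.append(r["comment_id"])
    return roots, dict(children)


def depth_of(comment_id, records):
    """Number of reply hops between a comment and its top-level ancestor."""
    parent_of = {r["comment_id"]: r["parent_id"] for r in records}
    depth = 0
    seen = {comment_id}
    while parent_of.get(comment_id) in parent_of:
        comment_id = parent_of[comment_id]
        if comment_id in seen:  # guard against malformed, cyclic data
            raise ValueError("cycle in parent links")
        seen.add(comment_id)
        depth += 1
    return depth


# Invented records for illustration only.
RECORDS = [
    {"comment_id": "c1", "parent_id": "post"},
    {"comment_id": "c2", "parent_id": "c1"},
    {"comment_id": "c3", "parent_id": "c2"},
    {"comment_id": "c4", "parent_id": "post"},
]

if __name__ == "__main__":
    roots, children = build_threads(RECORDS)
    print("roots:", roots)
    print("children:", children)
    print("depth of c3:", depth_of("c3", RECORDS))
```

Reply depth is a simple proxy for how much back-and-forth argument a feature request or issue attracted.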
Title: Exploring Reddit forum for software evolution as an alternative requirements source: An end-user discussion dataset on Google Maps. Journal: Data in Brief, vol. 57, Article 110993.
Pub Date: 2024-10-03 DOI: 10.1016/j.dib.2024.110991
Ladislav Huraj, Marek Šimon, Jakub Lietava
Distributed Denial of Service (DDoS) attacks pose a significant security risk to smart homes and can disrupt the functionality and availability of connected devices in the home. This dataset documents DDoS attacks against the Fibaro Home Center 3 central control unit, which is used to automate smart homes within the Internet of Things. The focus is on three types of DDoS attack: TCP SYN flood, ICMP flood, and HTTP flood. Data collection was performed on a local network, where the SYN flood and ICMP flood attacks were carried out using the hping3 tool and the HTTP flood attack was carried out using the LOIC tool. The data was captured using Wireshark and is available in PCAP and CSV formats, allowing detailed analysis of the network traffic. The logs include information such as timestamps, source and destination IP addresses, protocols, packet lengths, and port numbers. The dataset includes raw and anonymized data for each type of attack.
The dataset is a resource for researchers focused on cybersecurity and IoT device protection. It allows simulation and analysis of DDoS attacks on a specific IoT device, providing insight into attack patterns and the effectiveness of defenses. The simplicity and specialization of the dataset make it a practical resource for developing and testing intrusion detection systems and predictive models to mitigate and prevent DDoS attacks. The use of the PCAP format facilitates the import of the data into various research software platforms.
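A common first feature for flood detection is the packet rate per source. The sketch below shows that idea on CSV-style rows; the column names (`time`, `source`, `destination`, `protocol`, `length`), the threshold, and the sample rows are all assumptions for illustration and may not match the headers of the published CSV files.

```python
import csv
import io
from collections import Counter


def packets_per_second(csv_text):
    """Count packets per (source address, whole second) bucket."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        second = int(float(row["time"]))  # relative capture time in seconds
        counts[(row["source"], second)] += 1
    return counts


def flag_sources(counts, threshold):
    """Sources that reach `threshold` packets in at least one one-second bucket."""
    return sorted({src for (src, _), n in counts.items() if n >= threshold})


# Invented rows using documentation-range addresses, not dataset content.
SAMPLE = (
    "time,source,destination,protocol,length\n"
    "0.10,192.0.2.5,192.0.2.2,TCP,60\n"
    "0.20,192.0.2.5,192.0.2.2,TCP,60\n"
    "0.90,192.0.2.5,192.0.2.2,TCP,60\n"
    "1.10,192.0.2.7,192.0.2.2,ICMP,98\n"
)

if __name__ == "__main__":
    rates = packets_per_second(SAMPLE)
    print(flag_sources(rates, threshold=3))
```

A fixed threshold is only a baseline; rate features like these would normally feed a trained detection model rather than decide on their own.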
Title: Dataset of DDoS attacks on Fibaro Home Center 3 for smart home security. Journal: Data in Brief, vol. 57, Article 110991.
Pub Date: 2024-10-01 DOI: 10.1016/j.dib.2024.110982
Salvatore Graci, Amalia Barone
Climate change is a major concern for agricultural crops, and the selection of genotypes tolerant to abiotic stresses represents an important breeding strategy to reduce yield losses. In addition, the continuous development of new and more accurate high-throughput technologies for the analysis of DNA sequences is key to improving biological understanding and the application of biological knowledge. In the present work, 27 tomato genotypes previously evaluated for their response to high-temperature conditions were sequenced using ddRAD sequencing technology. The main goal was to provide genomic data useful for identifying candidate genes and variants for coping with current climate change. Total genomic DNA was extracted from leaves and sequenced on an Illumina HiSeq 2500 instrument. Raw reads were processed using different bioinformatics tools to generate a Variant Call Format (VCF) file. The availability of resources reporting polymorphisms among the genomes of different genotypes provides a useful basis for studying tomato tolerance to current climate change and can be used by researchers and breeders to investigate molecular response mechanisms and develop new breeding programs, also aided by Marker-Assisted Selection (MAS). The raw reads were deposited in the SRA database (https://www.ncbi.nlm.nih.gov/sra/PRJNA1137563).
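To give a sense of how a multi-sample VCF file can be summarized, the sketch below counts, for each sample, the sites where the genotype carries at least one non-reference allele. It is a plain-text parser written for illustration, not the pipeline used by the authors; the sample names `G1` and `G2` and all variant records are invented.

```python
def alt_calls_per_sample(vcf_text):
    """Count sites per sample whose GT field has at least one non-reference allele."""
    samples, counts = [], {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the FORMAT column
            counts = {s: 0 for s in samples}
            continue
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # "0" is the reference allele and "." a missing call.
            if any(a not in ("0", ".") for a in alleles):
                counts[sample] += 1
    return counts


def _row(*cols):
    return "\t".join(cols)


# Invented records for two hypothetical samples.
SAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    _row("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "G1", "G2"),
    _row("ch01", "100", ".", "A", "G", "50", "PASS", ".", "GT", "0/0", "0/1"),
    _row("ch01", "200", ".", "C", "T", "60", "PASS", ".", "GT:DP", "1/1:12", "./.:0"),
    _row("ch02", "300", ".", "G", "A", "40", "PASS", ".", "GT", "0/1", "0/0"),
])

if __name__ == "__main__":
    print(alt_calls_per_sample(SAMPLE_VCF))
```

For real files, a dedicated VCF library or command-line tool is likely a better choice, since it handles compression, multi-allelic sites, and malformed records.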
Title: Exploring ddRAD sequencing data of tomato genotypes evaluated for the heat stress tolerance. Journal: Data in Brief, vol. 57, Article 110982.
Pub Date: 2024-10-01 DOI: 10.1016/j.dib.2024.110976
Nasreddine Haqiq, Mounia Zaim, Mohamed Sbihi, Khalid El Amraoui, Mustapha El Alaoui, Lhoussaine Masmoudi, Hamza Echarrafi
This article presents Mine 4.0-MineCareerDB, a publicly available dataset of high-resolution images captured by a DJI Phantom 4 RTK drone and designed for analyzing mining careers. The dataset comprises 373 images depicting various mining operations and activities. Each image is georeferenced and offers a detailed view of mining activities, including the use of various equipment, infrastructure, and the overall mining environment. The dataset has the potential to be a valuable resource for computer vision applications in the mining industry, such as developing algorithms for identifying mining equipment, training deep learning models for safety analysis and optimization, and research on automation in mining operations. By making Mine 4.0-MineCareerDB publicly available, we aim to stimulate further advances in computer vision research and its applications in the mining sector. The dataset is available at: https://data.mendeley.com/datasets/c5s76mj4bm/5
Title: Mine 4.0-mineCareerDB: A high-resolution image dataset for mining career segmentation and object detection. Journal: Data in Brief, vol. 57, Article 110976.
Pub Date: 2024-10-01 DOI: 10.1016/j.dib.2024.110969
Manlio Bacco, Alexander Kocian, Stefano Chessa, Antonino Crivello, Paolo Barsocchi
Data spaces, a novel concept promoting data sharing and exchange, are gaining momentum because of recent developments motivated by the increasing need for interoperability and data sovereignty. After an initial phase, dating back approximately twenty years, in which the concept was tentatively explored in different scenarios, it is now going through a consolidation phase in which specifications and implementations are converging towards a common reference for standardisation. In this context, we offer our view on data spaces by presenting a systematic literature survey, a description of the components needed to build them and of how they work, and an overview of existing mature software implementations. We present the architectural vision behind the concept in detail and analyse the Reference Architectural Model by IDS. We provide practical pointers to readers interested in experimenting with software components used in data spaces, and we conclude by highlighting open challenges for their success.
Title: What are data spaces? Systematic survey and future outlook. Journal: Data in Brief, vol. 57, Article 110969.
Pub Date: 2024-09-30 DOI: 10.1016/j.dib.2024.110980
Hamid Behravan, Naga Raju Gudhe, Hidemi Okuma, Mazen Sudah, Arto Mannermaa
A new dataset is presented to propel research in automated breast density estimation, a crucial factor in mammogram interpretation. Mammography, a low-dose X-ray technique for breast cancer screening, can be affected by breast density. Dense tissue appears white on mammograms, potentially obscuring tumors. This dataset, built upon the public VinDr-Mammo dataset, offers 745 mammogram images (including training and test sets) along with expert-radiologist annotations for both the entire breast and dense tissue regions. Researchers can leverage this dataset for multiple purposes: training deep learning models for automated breast density analysis, refining segmentation methods for accurate delineation of breast tissue, and benchmarking existing and novel breast density estimation algorithms. This resource holds promise for improving breast cancer screening through advancements in automated breast density analysis.
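Given annotations for both the whole breast and the dense tissue, one natural summary is the share of breast area that is dense. The sketch below assumes that definition (dense pixels inside the breast divided by breast pixels), which the abstract does not spell out, and uses small invented binary masks in place of real annotation files.

```python
def percent_density(breast_mask, dense_mask):
    """Percentage of breast-mask pixels that are also marked as dense tissue."""
    breast_area = 0
    dense_area = 0
    for breast_row, dense_row in zip(breast_mask, dense_mask):
        for is_breast, is_dense in zip(breast_row, dense_row):
            if is_breast:
                breast_area += 1
                if is_dense:  # only count dense pixels that lie inside the breast
                    dense_area += 1
    if breast_area == 0:
        raise ValueError("breast mask is empty")
    return 100.0 * dense_area / breast_area


# Invented 3x4 masks: 8 breast pixels, 2 of them dense.
BREAST = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
]
DENSE = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(percent_density(BREAST, DENSE))  # 25.0
```

The same ratio can be computed on predicted masks, which gives a simple way to compare a segmentation model's density estimate against the annotated value.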
Title: A dataset of mammography images with area-based breast density values, breast area, and dense tissue segmentation masks. Journal: Data in Brief, vol. 57, Article 110980.
Pub Date: 2024-09-28 DOI: 10.1016/j.dib.2024.110983
Nina Liland, Ivar Rønnestad, Marina Azevedo, Floriana Lai, Frida Oulie, Luís Conceição, Filipe Soares
Atlantic salmon (Salmo salar) cultivated in cages and net-pens are regularly exposed to natural variations in dissolved oxygen levels, occasionally experiencing events of low oxygen availability. Quantifying the impact of low dissolved oxygen levels on fish performance can help fish farmers better manage the risks associated with such events.
This article describes the zootechnical performance of Atlantic salmon reared under experimental conditions at three different dissolved oxygen levels (i.e., low: 50 % saturation; medium: 60 % saturation; high: 95 % saturation). The data was collected in the context of two in vivo trials: (i) Trial A, where fish with an initial average body weight of 312.44 ± 11.53 g were reared in indoor tanks at the different DO levels for 30 days; (ii) Trial B, where fish with an initial average body weight of 735.33 ± 40.42 g were reared in indoor tanks at the different DO levels for 26 days.
The dataset [1] is composed of spreadsheets (.xlsx format) and charts (.png format), and includes daily and hourly resolution data (e.g., dissolved oxygen, water temperature, salinity, number of fish, and feed intake), sampling and laboratory data (e.g., fish weight, fork length, sex, organ weights, whole-body composition, and tail and opercular beat frequency), and zootechnical indicators calculated at the tank level and averaged per treatment (e.g., survival rate, weight gain, cumulative feed intake, feed conversion ratio, and somatic indexes). The differences between treatment means were analyzed using ANOVA, followed by post-hoc testing.
The data presented here can be used in subsequent analyses, for example together with other experimental data or to parameterize mathematical models, with the aim of better understanding and describing the effects of dissolved oxygen on the performance of Atlantic salmon.
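The tank-level indicators named above can be illustrated with a short calculation. The sketch below uses definitions that are common in aquaculture (survival as final over initial fish count, weight gain as final minus initial mean weight, feed conversion ratio as feed intake over biomass gain); these are assumptions, since the exact formulas used in the dataset are not given here, and the numbers are invented.

```python
def survival_rate(n_initial, n_final):
    """Survival in percent of the initial number of fish."""
    return 100.0 * n_final / n_initial


def weight_gain(w_initial, w_final):
    """Mean body weight gain, in the same unit as the inputs (e.g. g)."""
    return w_final - w_initial


def feed_conversion_ratio(feed_intake, biomass_gain):
    """Feed given per unit of biomass gained (both in the same mass unit)."""
    if biomass_gain <= 0:
        raise ValueError("biomass gain must be positive")
    return feed_intake / biomass_gain


if __name__ == "__main__":
    # Invented tank-level values, not taken from the dataset.
    print("survival (%):", survival_rate(50, 48))
    print("weight gain (g):", weight_gain(300.0, 360.0))
    print("FCR:", round(feed_conversion_ratio(66.0, 60.0), 2))
```

Computing the indicators per tank first and averaging per treatment afterwards matches the tank-level structure described above and keeps the tank as the unit of replication for ANOVA.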
Title: Dataset on the performance of Atlantic salmon (Salmo salar) reared at different dissolved oxygen levels under experimental conditions. Journal: Data in Brief, vol. 57, Article 110983.
Pub Date: 2024-09-28 DOI: 10.1016/j.dib.2024.110981
Md Sawkat Ali, Mohammad Rifat Ahmmad Rashid, Tasnim Hossain, Md Ahsan Kabir, Md. Kamrul, Sayam Hossain Bhuiyan Aumy, Mehedi Hasan Mridha, Imam Hossain Sajeeb, Mohammad Manzurul Islam, Taskeed Jabid
In agricultural research, particularly concerning rice cultivation, the presence of weeds within rice fields is acknowledged as a significant contributor to both diminished crop quality and increased production costs. Rice fields, due to their inherently moist environment, offer ideal conditions for weed proliferation. Traditionally, the control of these weeds has been managed through labor-intensive manual methods. However, as the agricultural sector evolves, there is a notable pivot towards leveraging advanced technological solutions, including deep learning and machine learning. The efficacy of these technologies hinges on the availability of high-quality, relevant data. To address this, a comprehensive dataset comprising 3632 high-resolution RGB images has been developed. This dataset is designed to capture a diverse range of weed species, specifically 11 types that are frequently found in rice fields. The diversity of the dataset ensures that machine learning models trained using this data can effectively identify and differentiate between desired and undesired plant species. While the dataset predominantly includes images from Bangladesh, the weed species it documents are commonly found across various global rice-growing regions, enhancing the dataset's applicability in different agricultural settings.
{"title":"A comprehensive dataset of rice field weed detection from Bangladesh","authors":"Md Sawkat Ali, Mohammad Rifat Ahmmad Rashid, Tasnim Hossain, Md Ahsan Kabir, Md. Kamrul, Sayam Hossain Bhuiyan Aumy, Mehedi Hasan Mridha, Imam Hossain Sajeeb, Mohammad Manzurul Islam, Taskeed Jabid","doi":"10.1016/j.dib.2024.110981","journal":{"name":"Data in Brief","volume":"57","pages":"Article 110981"},"publicationDate":"2024-09-28","publicationTypes":"Journal Article"}
Pub Date: 2024-09-28 DOI: 10.1016/j.dib.2024.110972
Rafael Verão Françozo, Afonso Henriques Silva Leite, Leonardo Lopes Honda, Felipe Fernandes de Oliveira, Marcio Teixeira Oliveira, Calvin Rodrigues da Costa
The dataset provided shows the segregation rate between students with and without disabilities in Brazilian cities between 2008 and 2023. Student enrolment data was extracted from the microdata of the Brazilian school census. The segregation rate was calculated using the dissimilarity index, which quantifies how dissimilar or segregated two populations are. The dataset consists of a .csv file with the calculated data and thematic maps of the Brazilian states, highlighting the cities. This data can be useful for researchers in the field of inclusion and for decision-makers, supporting the development of public policies that help eliminate disparities in education and ensure equal access to all levels of education for persons with disabilities.
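The abstract does not spell out the formula behind the dissimilarity index, so the following is only a minimal sketch of the standard two-group form, D = 0.5 * sum(|a_i/A - b_i/B|), where a_i and b_i are the counts of each group in unit i and A and B are the group totals. It assumes that the units are schools within a city; the function name and the enrolment counts are hypothetical and are not taken from the dataset.

```python
from typing import Sequence


def dissimilarity_index(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Two-group dissimilarity index over matching units (assumed here: schools).

    group_a[i] and group_b[i] are the counts of each group in unit i.
    Returns a value between 0 (identical distributions) and 1 (full separation).
    """
    if len(group_a) != len(group_b):
        raise ValueError("both groups must be counted over the same units")
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs at least one member")
    # Half the sum of absolute differences between the groups' unit shares.
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical city with two schools: students with disabilities, then without.
with_disabilities = [8, 2]
without_disabilities = [40, 60]
print(round(dissimilarity_index(with_disabilities, without_disabilities), 3))
```

A value of 0 means both groups are spread across the units in the same proportions, and a value of 1 means the two groups never share a unit. The dataset itself may use a different unit of analysis or a variant of this formula.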
{"title":"A dataset on the segregation of students with disabilities in Brazil","authors":"Rafael Verão Françozo, Afonso Henriques Silva Leite, Leonardo Lopes Honda, Felipe Fernandes de Oliveira, Marcio Teixeira Oliveira, Calvin Rodrigues da Costa","doi":"10.1016/j.dib.2024.110972","journal":{"name":"Data in Brief","volume":"57","pages":"Article 110972"},"publicationDate":"2024-09-28","publicationTypes":"Journal Article"}