BDPapayaLeaf: A dataset of papaya leaf for disease detection, classification, and analysis

IF 1 Q3 MULTIDISCIPLINARY SCIENCES Data in Brief Pub Date : 2024-09-10 DOI:10.1016/j.dib.2024.110910
Sumaya Mustofa, Md Taimur Ahad, Yousuf Rayhan Emon, Arpita Sarker

Abstract

Papaya is a popular vegetable and fruit in both developing and developed countries, and papaya cultivation significantly influences Bangladesh's agricultural landscape. However, disease is a common impediment to papaya productivity, adversely affecting quality and yield and leading to substantial economic losses for farmers. Research suggests that computer-aided disease diagnosis and machine learning (ML) models can improve papaya production by detecting and classifying diseases. To this end, a papaya dataset is required for diagnosing disease. Moreover, as with many other fruits, papaya diseases may vary from country to country, so country-specific papaya disease datasets are needed. In this study, a papaya leaf dataset was collected in Dhaka, Bangladesh. The dataset contains 2159 original images from five classes: a healthy control class and four papaya leaf diseases (Anthracnose, Bacterial Spot, Curl, and Ring Spot). Besides the original images, the dataset contains 210 annotated images for each of the five classes. The dataset therefore contains two types of data: whole images and annotated images. The whole images will interest data scientists who apply disease detection through a convolutional neural network (CNN) and its variants. Furthermore, the annotated images will be helpful for detection and semantic segmentation with models such as You Only Look Once (YOLO), U-Net, Mask R-CNN, and Single Shot Detection (SSD). Since firm-applicable AI devices and mobile and web applications are in demand, the dataset will offer multiple options for integrating ML models into such devices and applications. Data scientists in countries with weather and climate similar to Bangladesh's may also use this dataset in their own context.
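As a rough illustration of how the image counts stated in the abstract could be checked after download, the following Python sketch counts images per class. It assumes a hypothetical layout with one folder per class under a root directory; the folder names, the accepted file extensions, and the helper names `count_images_per_class` and `check_total` are assumptions made for this example and are not taken from the dataset's documentation.

```python
from pathlib import Path

# Class names follow the abstract; using them as folder names is an assumption.
CLASSES = ["Anthracnose", "Bacterial Spot", "Curl", "Healthy", "Ring Spot"]
# Assumed image extensions; adjust to whatever the downloaded files use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {class_name: image_count} for a root/<class_name>/<image> layout."""
    root = Path(root)
    counts = {}
    for name in CLASSES:
        class_dir = root / name
        if not class_dir.is_dir():
            counts[name] = 0  # a missing folder is reported as zero images
            continue
        counts[name] = sum(
            1
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_total(counts, expected_total):
    """Return True when the per-class counts add up to the expected total."""
    return sum(counts.values()) == expected_total
```

Under these assumptions, calling `check_total(count_images_per_class(path_to_original_images), 2159)` would confirm the stated number of original images, and the annotated subset would be expected to hold 210 images in each of the five classes, 1050 in total.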
Source journal: Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that: thoroughly describe your data, facilitating reproducibility; make your data, which is often buried in supplementary material, easier to find; increase traffic towards associated research articles and data, leading to more citations; and open up doors for new collaborations. Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.