BDPapayaLeaf: A dataset of papaya leaf for disease detection, classification, and analysis

Data in Brief | IF 1 | Q3, MULTIDISCIPLINARY SCIENCES | Pub Date: 2024-09-10 | DOI: 10.1016/j.dib.2024.110910

Sumaya Mustofa, Md Taimur Ahad, Yousuf Rayhan Emon, Arpita Sarker


Citations: 0

Abstract

Papaya is popular as both a vegetable and a fruit in developing and developed countries alike, and papaya cultivation plays a significant role in Bangladesh's agricultural landscape. However, disease is a common impediment to papaya productivity: it adversely affects quality and yield and leads to substantial economic losses for farmers. Research suggests that computer-aided disease diagnosis and machine learning (ML) models can improve papaya production by detecting and classifying diseases. Such models require a dataset of papaya images. Moreover, as with many other fruits, papaya diseases may vary from country to country, so country-specific papaya disease datasets are needed. In this study, a papaya dataset was collected in Dhaka, Bangladesh. The dataset contains 2159 original images in five classes: a healthy control class and four papaya leaf diseases (Anthracnose, Bacterial Spot, Curl, and Ring Spot). Besides the original images, the dataset contains 210 annotated images for each of the five classes. It therefore offers two types of data: whole images and annotated images. The whole images will interest data scientists who perform disease detection with convolutional neural networks (CNNs) and their variants. The annotated images will be helpful for detection and segmentation with models such as You Only Look Once (YOLO), U-Net, Mask R-CNN, and Single Shot Detection (SSD). Since firm-applicable AI devices and mobile and web applications are in demand, the dataset collected in this study will offer multiple options for integrating ML models into AI devices. Data scientists in countries with weather and climate similar to Bangladesh's may also use this dataset in their own context.
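As a rough illustration of how the whole images might be prepared for a CNN classifier, the sketch below indexes images by class and makes a per-class (stratified) train/validation split. The abstract does not describe the file layout, so the one-folder-per-class structure, the folder names, the file extensions, and the helper names `index_by_class` and `stratified_split` are all assumptions made for this example, not part of the published dataset.

```python
from pathlib import Path
import random

# Class names follow the abstract; using them as folder names is an assumption.
CLASSES = ["Healthy", "Anthracnose", "Bacterial Spot", "Curl", "Ring Spot"]
# Assumed image extensions; adjust to whatever the downloaded files use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_by_class(root):
    """Map each class name to the sorted image paths found in root/<class name>/."""
    root = Path(root)
    index = {}
    for name in CLASSES:
        folder = root / name
        if folder.is_dir():
            files = sorted(
                p for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            files = []  # a missing class folder yields an empty list
        index[name] = files
    return index


def stratified_split(index, val_fraction=0.2, seed=0):
    """Split each class separately so class proportions are preserved."""
    rng = random.Random(seed)
    train, val = [], []
    for name, files in index.items():
        shuffled = list(files)
        rng.shuffle(shuffled)
        n_val = int(len(shuffled) * val_fraction)
        val.extend((path, name) for path in shuffled[:n_val])
        train.extend((path, name) for path in shuffled[n_val:])
    return train, val
```

Splitting within each class, rather than over the pooled images, keeps any class imbalance among the 2159 originals from distorting the validation set.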


Source journal

Data in Brief (MULTIDISCIPLINARY SCIENCES)

CiteScore: 3.10

Self-citation rate: 0.00%

Articles published: 996

Review time: 70 days

Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that thoroughly describe your data, facilitating reproducibility; make your data, which is often buried in supplementary material, easier to find; increase traffic towards associated research articles and data, leading to more citations; and open up doors for new collaborations. Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.