Towards rigorous dataset quality standards for deep learning tasks in precision agriculture: A case study exploration

Smart agricultural technology (IF 6.3, Q1, Agricultural Engineering) | Pub Date: 2024-12-15 | DOI: 10.1016/j.atech.2024.100721
A. Carraro, G. Saurio, F. Marinello
Smart agricultural technology, Volume 10, Article 100721. Full text: https://www.sciencedirect.com/science/article/pii/S2772375524003253

Abstract

Deep Learning (DL) through Convolutional Neural Networks (CNNs) has emerged as a critical player in classifying plant diseases from images. This prominence has intensified the demand for a substantial volume of annotated training data. However, acquiring such data is costly and intricate, fraught with subtle challenges. In the domain of plants, where data collection can be even more complex, this study scrutinises how one dataset was gathered. Specifically, it delves into the nuances of collecting images of grapevine leaves in an open field for a binary classification task, discerning the presence or absence of Esca disease.
Adherence to rigorous dataset quality standards during image collection is paramount in precision agriculture. Errors made in this phase can have devastating repercussions on all subsequent work. For instance, collections of photos may exhibit a consistent disparity in background characteristics between images belonging to different classes. This persistent difference can lead a deep-learning algorithm to learn undesired correlations, even while its measured performance looks excellent, because the training and test sets share the same kind of disparity.
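
The background-disparity problem described above can be illustrated with a small check: if a classifier that sees only the image border (and never the leaf) still separates the classes, the dataset probably carries a background shortcut. The following is a minimal sketch on synthetic data, not the procedure used in the paper. The function names, the 32x32 image size, the 4-pixel border, and the assumption that the leaf sits in the centre of the frame are all hypothetical choices made for illustration.

```python
import numpy as np

def border_features(images, margin=4):
    """Mean RGB of the outer frame of each image (assumes the leaf is centred)."""
    mask = np.ones(images.shape[1:3], dtype=bool)
    mask[margin:-margin, margin:-margin] = False  # exclude the central region
    return images[:, mask, :].mean(axis=1)        # shape: (n_images, 3)

def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Fit one centroid per class and score held-out samples by nearest centroid."""
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == test_y).mean())

def make_synthetic(n, biased, rng, size=32):
    """Synthetic stand-in for leaf photos; values are illustrative, not real data."""
    labels = rng.integers(0, 2, size=n)
    images = rng.normal(0.5, 0.05, size=(n, size, size, 3))
    if biased:
        # Hypothetical acquisition bias: class 1 shot against a brighter background.
        images[labels == 1] += 0.2
    # The central "leaf" patch is drawn identically for both classes,
    # so only the background can carry class information here.
    images[:, 8:-8, 8:-8, :] = rng.normal(0.4, 0.05, size=(n, size - 16, size - 16, 3))
    return images, labels

def background_only_accuracy(images, labels, split=0.5):
    """Train and test on border statistics only, using a simple holdout split."""
    feats = border_features(images)
    cut = int(len(labels) * split)
    return nearest_centroid_accuracy(feats[:cut], labels[:cut], feats[cut:], labels[cut:])

rng = np.random.default_rng(0)
biased_acc = background_only_accuracy(*make_synthetic(400, True, rng))
clean_acc = background_only_accuracy(*make_synthetic(400, False, rng))
print(f"background-only accuracy, biased collection:   {biased_acc:.2f}")
print(f"background-only accuracy, unbiased collection: {clean_acc:.2f}")
```

In this sketch, a background-only accuracy well above chance signals that train and test sets share the same disparity, so a high score from a full CNN on such a split would say little about whether it has learned the disease symptoms. On real photographs the border region and the feature (mean colour) would need to be adapted to how the images were framed.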