Optimizing Strawberry Disease and Quality Detection with Vision Transformers and Attention-Based Convolutional Neural Networks

Foods · Published: 2024-06-14 · DOI: 10.3390/foods13121869
Kimia Aghamohammadesmaeilketabforoosh, Soodeh Nikan, Giorgio Antonini, Joshua M. Pearce
{"title":"Optimizing Strawberry Disease and Quality Detection with Vision Transformers and Attention-Based Convolutional Neural Networks","authors":"Kimia Aghamohammadesmaeilketabforoosh, Soodeh Nikan, Giorgio Antonini, Joshua M. Pearce","doi":"10.3390/foods13121869","DOIUrl":null,"url":null,"abstract":"Machine learning and computer vision have proven to be valuable tools for farmers to streamline their resource utilization to lead to more sustainable and efficient agricultural production. These techniques have been applied to strawberry cultivation in the past with limited success. To build on this past work, in this study, two separate sets of strawberry images, along with their associated diseases, were collected and subjected to resizing and augmentation. Subsequently, a combined dataset consisting of nine classes was utilized to fine-tune three distinct pretrained models: vision transformer (ViT), MobileNetV2, and ResNet18. To address the imbalanced class distribution in the dataset, each class was assigned weights to ensure nearly equal impact during the training process. To enhance the outcomes, new images were generated by removing backgrounds, reducing noise, and flipping them. The performances of ViT, MobileNetV2, and ResNet18 were compared after being selected. Customization specific to the task was applied to all three algorithms, and their performances were assessed. Throughout this experiment, none of the layers were frozen, ensuring all layers remained active during training. Attention heads were incorporated into the first five and last five layers of MobileNetV2 and ResNet18, while the architecture of ViT was modified. The results indicated accuracy factors of 98.4%, 98.1%, and 97.9% for ViT, MobileNetV2, and ResNet18, respectively. Despite the data being imbalanced, the precision, which indicates the proportion of correctly identified positive instances among all predicted positive instances, approached nearly 99% with the ViT. MobileNetV2 and ResNet18 demonstrated similar results. Overall, the analysis revealed that the vision transformer model exhibited superior performance in strawberry ripeness and disease classification. The inclusion of attention heads in the early layers of ResNet18 and MobileNet18, along with the inherent attention mechanism in ViT, improved the accuracy of image identification. These findings offer the potential for farmers to enhance strawberry cultivation through passive camera monitoring alone, promoting the health and well-being of the population.","PeriodicalId":502667,"journal":{"name":"Foods","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Foods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/foods13121869","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Machine learning and computer vision have proven to be valuable tools for farmers to streamline resource utilization and enable more sustainable and efficient agricultural production. These techniques have been applied to strawberry cultivation in the past with limited success. To build on this past work, in this study two separate sets of strawberry images, along with their associated diseases, were collected and subjected to resizing and augmentation. Subsequently, a combined dataset consisting of nine classes was used to fine-tune three distinct pretrained models: vision transformer (ViT), MobileNetV2, and ResNet18. To address the imbalanced class distribution in the dataset, each class was assigned a weight so that all classes had nearly equal impact during training. To further improve results, additional images were generated by removing backgrounds, reducing noise, and flipping the originals. Task-specific customization was applied to all three models, and their performances were compared. Throughout the experiment, no layers were frozen, so all layers remained trainable. Attention heads were incorporated into the first five and last five layers of MobileNetV2 and ResNet18, while the architecture of ViT was modified. The results showed accuracies of 98.4%, 98.1%, and 97.9% for ViT, MobileNetV2, and ResNet18, respectively. Despite the imbalanced data, precision, which indicates the proportion of correctly identified positive instances among all predicted positive instances, approached 99% with ViT; MobileNetV2 and ResNet18 produced similar results. Overall, the analysis revealed that the vision transformer exhibited superior performance in strawberry ripeness and disease classification. The inclusion of attention heads in the early layers of ResNet18 and MobileNetV2, along with the inherent attention mechanism of ViT, improved image-classification accuracy. These findings offer farmers the potential to enhance strawberry cultivation through passive camera monitoring alone, promoting the health and well-being of the population.
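The training recipe summarized in the abstract (class weighting for the imbalanced nine-class dataset, flip-based augmentation, and fine-tuning a pretrained backbone with no frozen layers) can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration, not the authors' code: it assumes a torchvision ResNet18 backbone, a nine-class ImageFolder dataset at a hypothetical path `strawberry_dataset/train`, and inverse-frequency class weights; the paper's attention-head insertions into the first and last five layers and the ViT modifications are not reproduced here.

```python
# Minimal sketch of class-weighted fine-tuning as described in the abstract.
# Assumptions (not from the paper): torchvision models/transforms, an
# ImageFolder dataset with nine classes, and inverse-frequency class weights.
from collections import Counter

import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Augmentation: resizing and horizontal flipping, as mentioned in the abstract.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Hypothetical dataset path; one subfolder per ripeness/disease class.
train_ds = datasets.ImageFolder("strawberry_dataset/train", transform=train_tf)

# Inverse-frequency class weights so each class has nearly equal impact.
counts = Counter(train_ds.targets)
num_classes = len(train_ds.classes)  # nine classes in the combined dataset
class_weights = torch.tensor(
    [len(train_ds) / (num_classes * counts[c]) for c in range(num_classes)],
    dtype=torch.float,
)

# Pretrained backbone with a new nine-class head; no layers are frozen,
# so every parameter stays trainable (requires torchvision >= 0.13 for the
# weights enum; older versions use pretrained=True instead).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)

criterion = nn.CrossEntropyLoss(weight=class_weights)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

model.train()
for images, labels in loader:  # one illustrative epoch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

The same loop applies to MobileNetV2 or a ViT by swapping the backbone and its classification head; the weighted loss is what keeps the minority disease classes from being overwhelmed by the more frequent ripeness classes.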