Few-shot agricultural pest recognition based on multimodal masked autoencoder
Authors: Yinshuo Zhang, Lei Chen, Yuan Yuan
Journal: Crop Protection, volume 187, Article 106993 (impact factor 2.5; JCR Q1, Agronomy)
Publication type: Journal Article
Publication date: 2024-10-28
DOI: 10.1016/j.cropro.2024.106993
URL: https://www.sciencedirect.com/science/article/pii/S0261219424004216
Citation count: 0
Abstract
Visual recognition methods based on deep convolutional neural networks perform well in pest diagnosis and have become a research hotspot. However, agricultural pest recognition still faces several challenges: few-shot conditions, class imbalance, similar appearance among pests, and small pest targets. Existing deep learning-based pest recognition methods typically rely solely on unimodal image data, so their recognition performance depends heavily on the size and quality of the annotated training dataset. Building large-scale, high-quality pest datasets carries significant economic and technical costs, which limits how well existing methods generalize in practice. To address these challenges, this paper proposes a few-shot pest recognition model called MMAE (multimodal masked autoencoder). First, the masked autoencoder in MMAE incorporates self-supervised learning, which can be applied to few-shot datasets and improves recognition accuracy. Second, MMAE embeds text-modality information on top of image-modality information, improving recognition performance by exploiting the correlation and complementarity between the two modalities. Experimental results show that MMAE outperforms the existing models it was compared with, reaching a recognition accuracy of 98.12%, which is 1.61 percentage points higher than the current state-of-the-art MAE method. The work shows that introducing textual information helps the visual encoder capture agricultural pest features at a finer level of granularity, providing a methodological reference for agricultural pest recognition under few-shot conditions.
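As a rough illustration of the two ideas named in the abstract, the sketch below shows random patch masking as used in masked autoencoders generally, followed by one simple way to place text tokens alongside the visible image tokens. It is not the authors' implementation: the abstract does not give the patch count, the mask ratio, or the fusion mechanism, so `num_patches`, `mask_ratio`, and the concatenation in `fuse_tokens` are all assumptions chosen for illustration.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    num_patches and mask_ratio are illustrative parameters; the abstract
    does not state the values MMAE uses.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])    # hidden from the encoder
    visible = sorted(indices[num_masked:])   # fed to the encoder
    return visible, masked


def fuse_tokens(visible_image_tokens, text_tokens):
    """Hypothetical fusion: prepend text tokens to the visible image tokens.

    Plain concatenation is an assumption made for this sketch; the abstract
    only says that text information is embedded on top of image information.
    """
    return list(text_tokens) + list(visible_image_tokens)


if __name__ == "__main__":
    # 196 patches corresponds to a 14 x 14 grid; 0.75 is an assumed ratio.
    visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
    print(len(visible), "visible patches,", len(masked), "masked patches")
    encoder_input = fuse_tokens([f"patch_{i}" for i in visible], ["text_0", "text_1"])
    print(len(encoder_input), "tokens in the encoder input")
```

In a masked autoencoder, the encoder sees only the visible patches and a decoder is trained to reconstruct the masked ones, which is what lets pre-training proceed without labels.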
About the journal:
The Editors of Crop Protection especially welcome papers describing an interdisciplinary approach showing how different control strategies can be integrated into practical pest management programs, covering high and low input agricultural systems worldwide. Crop Protection particularly emphasizes the practical aspects of control in the field and for protected crops, and includes work which may lead in the near future to more effective control. The journal does not duplicate the many existing excellent biological science journals, which deal mainly with the more fundamental aspects of plant pathology, applied zoology and weed science. Crop Protection covers all practical aspects of pest, disease and weed control, including the following topics:
- Abiotic damage
- Agronomic control methods
- Assessment of pest and disease damage
- Molecular methods for the detection and assessment of pests and diseases
- Biological control
- Biorational pesticides
- Control of animal pests of world crops
- Control of diseases of crop plants caused by microorganisms
- Control of weeds and integrated management
- Economic considerations
- Effects of plant growth regulators
- Environmental benefits of reduced pesticide use
- Environmental effects of pesticides
- Epidemiology of pests and diseases in relation to control
- GM crops and genetic engineering applications
- Importance and control of postharvest crop losses
- Integrated control
- Interrelationships and compatibility among different control strategies
- Invasive species as they relate to implications for crop protection
- Pesticide application methods
- Pest management
- Phytobiomes for pest and disease control
- Resistance management
- Sampling and monitoring schemes for diseases, nematodes, pests and weeds