Weed image augmentation by ControlNet-added stable diffusion for multi-class weed detection

IF 10.3 | Zone 1, Agricultural and Forestry Sciences | JCR Q1, AGRICULTURE, MULTIDISCIPLINARY | Computers and Electronics in Agriculture | Pub Date: 2025-05-01 | Epub Date: 2025-02-16 | DOI: 10.1016/j.compag.2025.110123
Boyang Deng, Yuzhen Lu
Computers and Electronics in Agriculture, Volume 232, Article 110123. Publication type: Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0168169925002297
Citations: 0

Abstract

Robust weed recognition for vision-guided weeding relies on curating large-scale, diverse field datasets, which, however, are difficult to obtain in practice. Text-to-image generative artificial intelligence opens new avenues for synthesizing perceptually realistic images that benefit wide-ranging computer vision tasks in precision agriculture. This study investigates the efficacy of state-of-the-art diffusion models as an image augmentation technique for synthesizing multi-class weed images to enhance weed detection performance. A three-season, 10-weed-class dataset was created as a testbed for image generation and weed detection tasks. ControlNet-added stable diffusion models were trained to generate weed images with broad intra-class variations of the targeted weed species and diverse backgrounds, so as to adapt to changing field conditions. The quality of the generated images was assessed using metrics including the Fréchet Inception Distance (FID) and the Inception Score (IS), resulting in an average FID of 0.98 and IS of 3.63. Generated weed images were then selected to supplement real-world images for weed detection by YOLOv8-large. Combining the manually selected generated images with real images yielded an overall mAP@50:95 of 88.3% and mAP@50 of 95.0%, representing performance gains of 1.4% and 0.8%, respectively, over the baseline model trained using only real images. This approach also performed competitively with models trained on real images combined with images produced by external, traditional data augmentation techniques. The proposed automated post-generation image filtering approach still needs improvement to select high-quality images for enhanced weed detection. Both the weed dataset and the software programs developed in this study have been made publicly available. Considerable research is needed to exploit more controllable diffusion models for generating high-fidelity, diverse weed images that substantially enhance weed detection in changing field conditions.
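To make one of the image-quality metrics named in the abstract more concrete, the following is a minimal sketch of the Inception Score under its commonly used definition, IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for one generated image and p(y) is the average of those distributions over all images. It assumes the per-image class probabilities are already available (in practice they would come from a pretrained Inception network, which is not run here). The function name `inception_score` and the toy inputs are illustrative assumptions and are not taken from the authors' software.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probabilities p(y|x).

    probs: list of rows, each row a probability distribution over classes.
    Returns exp(mean over images of KL(p(y|x) || p(y))),
    where p(y) is the average of the rows.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y): average of the per-image rows.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    total_kl = 0.0
    for row in probs:
        # KL(p(y|x) || p(y)); terms with p = 0 contribute 0 by convention.
        total_kl += sum(
            p * math.log(p / m) for p, m in zip(row, marginal) if p > 0
        )
    return math.exp(total_kl / n)


# Toy inputs, made up for illustration (not data from the study).
confident_and_diverse = [[1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5], [0.5, 0.5]]

print(inception_score(confident_and_diverse))  # close to 2, the number of classes
print(inception_score(uninformative))          # 1.0, the minimum possible score
```

Under this definition the score ranges from 1 (every image gets the same class distribution) up to the number of classes (each image is classified confidently and the classes are evenly covered), so higher values indicate images that are both recognizable and varied.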
Source journal
Computers and Electronics in Agriculture (Engineering & Technology: Computer Science, Interdisciplinary Applications)
CiteScore: 15.30
Self-citation rate: 14.50%
Articles published: 800
Review time: 62 days
About the journal: Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.