Transfer learning with generative models for object detection on limited datasets

IF 6.3 2区 物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Machine Learning Science and Technology Pub Date : 2024-08-11 DOI:10.1088/2632-2153/ad65b5
M Paiano, S Martina, C Giannelli and F Caruso
{"title":"Transfer learning with generative models for object detection on limited datasets","authors":"M Paiano, S Martina, C Giannelli and F Caruso","doi":"10.1088/2632-2153/ad65b5","DOIUrl":null,"url":null,"abstract":"The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. With respect to the state-of-the-art, we find that it is not necessary to fine tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manual labeling the images in object detection tasks. We validate our approach focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"192 1","pages":""},"PeriodicalIF":6.3000,"publicationDate":"2024-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad65b5","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. With respect to the state-of-the-art, we find that it is not necessary to fine tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manual labeling the images in object detection tasks. We validate our approach focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
利用生成模型在有限数据集上进行物体检测的迁移学习
在某些领域,数据的可用性是有限的,特别是在物体检测任务中,需要在每个物体周围有正确标注的边界框。海洋生物学领域就是这种数据匮乏的一个显著例子,在该领域,开发用于环境监测的海底物种自动检测方法非常有用。为了解决这种数据限制,最先进的机器学习策略主要采用两种方法。第一种方法是在现有数据集上对模型进行预训练,然后再将其推广到感兴趣的特定领域。第二种策略是使用复制粘贴技术或临时模拟器等方法创建专门针对目标领域的合成数据集。第一种策略通常会面临重大的领域转变,而第二种策略则需要为特定任务量身定制解决方案。为了应对这些挑战,我们在这里提出了一个适用于通用场景的迁移学习框架。在这个框架中,生成的图像有助于提高物体检测器在少量真实数据环境中的性能。这是通过在大型通用数据集上预训练的基于扩散的生成模型实现的。与最先进的技术相比,我们发现无需对特定兴趣领域的生成模型进行微调。我们认为这是一个重要的进步,因为它减轻了物体检测任务中人工标记图像的劳动密集型任务。我们以水下环境中的鱼类和城市环境中更常见的汽车领域为重点,对我们的方法进行了验证。我们的方法只使用了几百个输入数据,就实现了与在数千张图像上训练的模型相当的检测性能。我们的研究成果为新的基于生成式人工智能的协议铺平了道路,该协议适用于从地球物理学到生物学和医学等各个领域的机器学习应用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Machine Learning Science and Technology
Machine Learning Science and Technology Computer Science-Artificial Intelligence
CiteScore
9.10
自引率
4.40%
发文量
86
审稿时长
5 weeks
期刊介绍: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.
期刊最新文献
Quality assurance for online adaptive radiotherapy: a secondary dose verification model with geometry-encoded U-Net. Optimizing ZX-diagrams with deep reinforcement learning DiffLense: a conditional diffusion model for super-resolution of gravitational lensing data Equivariant tensor network potentials Masked particle modeling on sets: towards self-supervised high energy physics foundation models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1