Investigating the Sim-to-Real Generalizability of Deep Learning Object Detection Models

Journal of Imaging (IF 2.7, JCR Q3, Imaging Science & Photographic Technology) | Pub Date: 2024-10-18 | DOI: 10.3390/jimaging10100259

Joachim Rüter, Umut Durak, Johann C Dauer

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11509078/pdf/

Citations: 0

Abstract




State-of-the-art object detection models need large and diverse datasets for training. Because such datasets are hard to acquire for many practical applications, training images from simulation environments are gaining increasing attention. However, deep learning models trained on simulation images usually generalize poorly to real-world images, as shown by a sharp performance drop. The definite causes of and influences on this performance drop have not yet been identified. While previous work has mostly investigated the influence of the data and the use of domain adaptation, this work offers a novel perspective by investigating the influence of the object detection model itself. To this end, a corresponding measure called sim-to-real generalizability is first defined; it captures the capability of an object detection model to generalize from simulation training images to real-world evaluation images. Second, 12 different deep learning-based object detection models are trained and their sim-to-real generalizability is evaluated. The models are trained with varied hyperparameters, resulting in a total of 144 trained and evaluated versions. The results show a clear influence of the feature extractor and offer further insights and correlations. They open up future research into what influences the sim-to-real generalizability of deep learning-based object detection models and into developing feature extractors with better sim-to-real generalizability.
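The abstract does not give the formula behind the sim-to-real generalizability measure, so the following is only an illustrative sketch under stated assumptions. It uses a hypothetical proxy, the fraction of a model's simulation-domain score that is retained on real-world images, and averages it per feature extractor across trained versions. The function names, the `sim_score`/`real_score` fields, the extractor labels, and all numbers are placeholders chosen for illustration; none of them come from the paper.

```python
from collections import defaultdict
from statistics import mean


def generalizability_ratio(sim_score: float, real_score: float) -> float:
    """Hypothetical proxy (not the paper's definition): the share of the
    simulation-domain detection score that is retained on real images."""
    if sim_score <= 0:
        raise ValueError("sim_score must be positive")
    return real_score / sim_score


def mean_ratio_by_extractor(runs):
    """Average the proxy over all trained versions that share a feature extractor."""
    grouped = defaultdict(list)
    for run in runs:
        ratio = generalizability_ratio(run["sim_score"], run["real_score"])
        grouped[run["extractor"]].append(ratio)
    return {name: mean(ratios) for name, ratios in grouped.items()}


# Placeholder values for illustration only; these are not results from the paper.
runs = [
    {"extractor": "extractor_a", "sim_score": 0.80, "real_score": 0.40},
    {"extractor": "extractor_a", "sim_score": 0.50, "real_score": 0.25},
    {"extractor": "extractor_b", "sim_score": 0.80, "real_score": 0.20},
]

summary = mean_ratio_by_extractor(runs)
for name, ratio in sorted(summary.items()):
    print(f"{name}: retains {ratio:.0%} of its simulation score on real images")
```

Grouping by feature extractor mirrors the abstract's observation that the extractor has a clear influence; any other attribute of a trained version, such as a hyperparameter setting, could be used as the grouping key in the same way.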


Source Journal