Rethinking Open-World Object Detection in Autonomous Driving Scenarios

Zeyu Ma, Yang Yang, Guoqing Wang, Xing Xu, Heng Tao Shen, Mingxing Zhang
DOI: 10.1145/3503161.3548165
Published in: Proceedings of the 30th ACM International Conference on Multimedia
Publication date: 2022-10-10
Citations: 10

Abstract

Existing object detection models can successfully discriminate and localize predefined object categories in seen or similar situations. However, open-world object detection, as required by autonomous driving perception systems, must recognize unseen objects across diverse scenarios. On the one hand, the knowledge gap between seen and unseen object categories poses an extreme challenge for models supervised only by the seen categories. On the other hand, domain differences across scenarios make it necessary to account for the domain gap by aligning sample or label distributions. To resolve these two challenges simultaneously, we first design a pre-training model that formulates mappings between visual images and semantic embeddings derived from extra annotations, which guide the linking of seen and unseen object categories in a self-supervised manner. Within this formulation, domain adaptation is then used to extract domain-agnostic feature representations and to alleviate the misdetection of unseen objects caused by changes in domain appearance. As a result, our formulation addresses the more realistic and practical open-world object detection problem: it detects unseen categories from unseen domains without any bounding-box annotations, with no obvious performance drop on the seen categories. We are the first to formulate a unified model for this open-world task and establish new state-of-the-art performance on this challenge.
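The abstract links seen and unseen categories by mapping visual features and class semantic embeddings into a shared space. The paper's exact architecture is not given here, so the following is only a minimal sketch of the general idea: a region's visual feature is scored against class word embeddings by cosine similarity, which lets an unseen class (one with no box annotations, only an embedding) still receive a score. All names, vectors, and dimensions below are hypothetical toy values.

```python
import numpy as np

def classify_by_semantic_embedding(region_feat, class_embeddings):
    """Assign a region feature to the class whose semantic embedding is
    most similar under cosine similarity. Unseen classes can be scored
    without any box-level supervision, because only their embeddings
    are needed."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    scores = {name: cos(region_feat, emb) for name, emb in class_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy example: 'deer' plays the role of an unseen class -- it has no
# training boxes, only a semantic embedding, yet it can still win.
embs = {
    "car":  np.array([1.0, 0.0, 0.0]),   # seen category
    "deer": np.array([0.0, 1.0, 0.0]),   # unseen category (embedding only)
}
feat = np.array([0.1, 0.9, 0.0])          # projected visual feature of a region
label, scores = classify_by_semantic_embedding(feat, embs)
```

Here `label` comes out as `"deer"` because the region feature lies closest to the unseen class's embedding; in the actual model the projection from image to this shared space is what the self-supervised pre-training learns.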
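The abstract also relies on domain adaptation to obtain domain-agnostic feature representations. One common mechanism for this (the paper may use a different one; this is an illustrative assumption) is a gradient reversal layer: the features pass through unchanged in the forward pass, while gradients flowing back from a domain classifier are negated, pushing the backbone toward features the domain classifier cannot exploit. The sketch below shows only the reversal logic, with hypothetical values, not a full training loop.

```python
import numpy as np

class GradReverse:
    """Identity in the forward pass; multiplies incoming gradients by
    -lam in the backward pass, so the feature extractor learns to
    *confuse* a domain classifier rather than help it."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        # Features are passed through unchanged.
        return x

    def backward(self, grad_from_domain_head):
        # Gradient is reversed (and scaled), which turns the domain
        # classifier's objective into an adversarial signal.
        return -self.lam * grad_from_domain_head

grl = GradReverse(lam=0.5)
x = np.array([1.0, -2.0])
y = grl.forward(x)                      # identical to x
g = grl.backward(np.array([0.4, 0.4]))  # reversed, scaled gradient
```

With `lam=0.5`, a gradient of `[0.4, 0.4]` from the domain head reaches the backbone as `[-0.2, -0.2]`: the backbone is updated away from domain-discriminative directions, which is one way to realize the domain-agnostic representations the abstract describes.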