Deformable Capsules for Object Detection

Advanced Intelligent Systems (Weinheim an der Bergstrasse, Germany) | IF 6.8 | Q1, Automation & Control Systems | Pub Date: 2024-08-22 | DOI: 10.1002/aisy.202400044
Rodney LaLonde, Naji Khosravan, Ulas Bagci

Abstract

Capsule networks promise significant benefits over convolutional neural networks (CNNs) by storing stronger internal representations and routing information based on the agreement between the projections of intermediate representations. Despite this, their success has been limited to small-scale classification datasets because they are computationally expensive. Though memory-efficient, convolutional capsules impose geometric constraints that fundamentally limit the ability of capsules to model the pose and deformation of objects. Further, they do not address the larger memory cost of class capsules when scaling up to bigger tasks such as detection or large-scale classification. Herein, a new family of capsule networks, deformable capsules (DeformCaps), is introduced to address the object detection problem in computer vision. Two new algorithms associated with DeformCaps are proposed: a novel capsule structure (SplitCaps) and a novel dynamic routing algorithm (SE-Routing), which balance computational efficiency with the need to model a large number of objects and classes. This has never been achieved with capsule networks before. The proposed methods scale up efficiently to create the first capsule network for object detection in the literature. The proposed architecture is a one-stage detection framework that obtains results on Microsoft Common Objects in Context (MS COCO) on par with state-of-the-art one-stage CNN-based methods, while producing fewer false-positive detections and generalizing to unusual poses and viewpoints of objects.
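To make "routing information based on agreement" concrete, the following is a minimal sketch of generic routing-by-agreement as commonly described for capsule networks. It is not the SE-Routing algorithm or the SplitCaps structure proposed here, whose details the abstract does not give. The function names (`squash`, `route_by_agreement`), the array layout, and the choice of three iterations are assumptions made only for illustration.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Shrink each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def route_by_agreement(u_hat, num_iters=3):
    """Generic dynamic routing sketch (illustrative, not SE-Routing).

    u_hat: array of shape (num_child, num_parent, dim) holding each child
        capsule's projected prediction ("vote") for each parent capsule.
    Returns the parent capsule vectors, shape (num_parent, dim), and the
    coupling coefficients used to build them, shape (num_child, num_parent).
    """
    num_child, num_parent, _ = u_hat.shape
    logits = np.zeros((num_child, num_parent))
    for _ in range(num_iters):
        # Each child distributes one unit of weight over all parents.
        coupling = softmax(logits, axis=1)
        # Each parent is the squashed, coupling-weighted sum of its votes.
        parents = squash(np.einsum("ij,ijd->jd", coupling, u_hat))
        # Votes that align with the resulting parent get a larger logit.
        logits = logits + np.einsum("ijd,jd->ij", u_hat, parents)
    return parents, coupling
```

In this sketch, a parent whose incoming votes point the same way ends up with a longer output vector and attracts more coupling weight on later iterations, while a parent receiving conflicting votes is down-weighted. The coupling array holds one coefficient per child-parent pair, which gives a rough sense of why the number of parent (class) capsules matters for memory.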
