Empowering lightweight video transformer via the kernel learning

IF 0.7 · Zone 4 (Engineering & Technology) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Electronics Letters, vol. 60, no. 9 · Pub Date: 2024-05-10 · DOI: 10.1049/ell2.13215
Xiaoxi Liu, Ju Liu, Lingchen Gu
Full text: https://onlinelibrary.wiley.com/doi/10.1049/ell2.13215
Citations: 0

Abstract

Video transformers achieve superior performance in video recognition. Despite recent advances, they still require substantial computation and memory resources. To improve computational efficiency, a kernel-based video transformer is proposed, with three contributions: (1) a new formulation of the video transformer via kernel learning, presented to better understand its individual components; (2) a lightweight kernel-based spatial–temporal multi-head self-attention block, explored to learn compact joint spatial–temporal video features; (3) an adaptive-score position embedding method, introduced to improve the flexibility of the video transformer. Experimental results on several action recognition datasets demonstrate the effectiveness of the proposed method. Pretrained only on ImageNet-1K, the method achieves a favourable balance between computation and accuracy, while requiring 7× fewer parameters and 13× fewer floating point operations than other comparable methods.
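As background for contribution (1), one common way to read self-attention through a kernel lens is as a kernel smoother: each output token is a kernel-weighted average of the value vectors, and standard softmax attention corresponds to an exponential kernel on scaled dot products. The sketch below illustrates only this generic view and is not the authors' block. The names `kernel_attention`, `exp_kernel` and `softmax_attention`, the flattened token layout (frames × height × width) and the omission of learned projections, multiple heads and position embedding are all assumptions made for illustration.

```python
import numpy as np

def kernel_attention(q, k, v, kernel):
    """Generic kernel-smoother attention (illustrative, not the paper's block).

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v); kernel returns non-negative
    similarities of shape (n_q, n_k).
    """
    scores = kernel(q, k)
    weights = scores / scores.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ v

def exp_kernel(q, k):
    """Exponential kernel on scaled dot products; recovers softmax attention."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # Subtracting the row maximum is for numerical stability only;
    # the factor cancels in the row normalisation above.
    logits = logits - logits.max(axis=-1, keepdims=True)
    return np.exp(logits)

def softmax_attention(q, k, v):
    """Reference scaled dot-product attention, used here for comparison."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Assumed toy clip: 2 frames of 2x2 patches, flattened into 8 joint
# spatial-temporal tokens with 4 channels each.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2 * 2 * 2, 4))
out = kernel_attention(tokens, tokens, tokens, exp_kernel)
```

Under this view, passing a different non-negative `kernel` function changes the attention mechanism while the normalisation and value averaging stay the same, which is one reason the kernel formulation is useful for reasoning about individual components.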
