
International Journal of Computer Vision — Latest Publications

Exocentric-to-Egocentric Adaptation for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-06 | DOI: 10.1007/s11263-025-02675-1
Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro, Mario Valerio Giuffrida, Giovanni Maria Farinella
Citations: 0
A Survey on Human Interaction Motion Generation
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-06 | DOI: 10.1007/s11263-025-02582-5
Kewei Sui, Anindita Ghosh, Inwoo Hwang, Bing Zhou, Jian Wang, Chuan Guo
Citations: 0
Homography Decomposition Revisited
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-05 | DOI: 10.1007/s11263-025-02680-4
Yaqing Ding, Jian Yang, Zuzana Kukelova
Homography refers to a specific type of transformation that relates two images of the same planar surface taken from different perspectives. Recovering motion parameters from a homography matrix is a classic problem in computer vision. It is important to derive a fast and stable solution to homography decomposition, since it forms a critical component of many vision systems, e.g., in Structure-from-Motion and visual localization. The current state-of-the-art solvers fall into two categories: numerical procedures based on singular value decomposition (SVD), and closed-form solutions. The SVD-based methods are stable but time-consuming, while the existing closed-form solution is faster but less stable. In this paper, we discuss the homography decomposition problem from a different viewpoint. In contrast to existing methods, which focus on the properties of the homography matrix, we propose a new method that uses three random point correspondences to obtain the motion parameters in closed form. The proposed method is conceptually simple, easy to understand and implement, and has a good geometrical interpretation. This solution can be seen as an alternative to the existing closed-form solution. We also discuss the configurations in which closed-form solutions might be unstable, and present a framework for homography decomposition that takes both efficiency and stability into account.
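As background for the decomposition discussed in this abstract: a plane-induced homography between two calibrated views has the form H = R + t nᵀ/d, and the scale ambiguity is conventionally fixed by normalizing so the middle singular value equals one. A minimal NumPy sketch (the rotation, translation, and plane values below are made-up illustration values, not from the paper):

```python
import numpy as np

# Plane-induced homography between two views of the same plane:
#   H = R + t n^T / d
# (R, t: relative rotation/translation; n: plane normal; d: distance).
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([[0.1], [0.0], [0.05]])
n = np.array([[0.0], [0.0], [1.0]])   # unit plane normal
d = 2.0                               # camera-to-plane distance
H = R + (t @ n.T) / d

# A homography is only defined up to scale; the usual normalization
# divides by the middle singular value, after which a valid
# plane-induced homography satisfies sigma_2 = 1.
sigma = np.linalg.svd(H, compute_uv=False)
H_norm = H / sigma[1]
```

For the full SVD-route decomposition that this paper contrasts against, OpenCV users can call `cv2.decomposeHomographyMat(H, K)`, which returns up to four candidate (R, t, n) triples that must then be disambiguated, e.g. with the positive-depth constraint.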
Citations: 0
Relaxed Knowledge Distillation
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-04 | DOI: 10.1007/s11263-025-02705-y
Zheng Qu, Xiwen Yao, Xuguang Yang, Jie Tang, Lang Li, Gong Cheng, Junwei Han
Citations: 0
Exploring Scale Shift in Crowd Localization under the Context of Domain Generalization
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-02 | DOI: 10.1007/s11263-025-02637-7
Juncheng Wang, Lei Shang, Ziqi Liu, Wang Lu, Xixu Hu, Zhe Hu, Jindong Wang, Shujun Wang
Citations: 0
On Combining Animal Re-Identification Models to Address Small Datasets
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-30 | DOI: 10.1007/s11263-025-02708-9
Aleksandr Algasov, Ekaterina Nepovinnykh, Fedor Zolotarev, Tuomas Eerola, Heikki Kälviäinen, Charles V. Stewart, Lasha Otarashvili, Jason A. Holmberg
Recent advancements in the automatic re-identification of animal individuals from images have opened up new possibilities for studying wildlife through camera traps and citizen science projects. Existing methods leverage distinct and permanent visual body markings, such as fur patterns or scars, and typically employ one of two approaches: local features or end-to-end learning. The end-to-end learning-based methods outperform local feature-based methods given a sufficient amount of good-quality training data, but the challenge of gathering such datasets for wild animals means that local feature-based methods remain a more practical approach for many species. In this study, we aim to achieve two goals: (1) to obtain a better understanding of the impact of training-set size on animal re-identification, and (2) to explore ways to combine various methods to leverage the advantages of their approaches for re-identification. We conduct comprehensive experiments across six different methods and six animal species with various training-set sizes. Furthermore, we propose a simple yet effective combination strategy and show that properly selected method combinations outperform the individual methods, with both small and large training sets, by up to 30%. Additionally, the proposed combination strategy offers a generalizable framework to improve accuracy across species and address the challenges posed by small datasets, which are common in ecological research. This work lays the foundation for more robust and accessible tools to support wildlife conservation, population monitoring, and behavioral studies.
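The paper's own combination strategy is not spelled out in this listing; a common baseline for combining, say, a local-feature matcher with an end-to-end embedding model is per-query score normalization followed by a weighted sum. A hypothetical sketch (`fuse_similarity`, the weight `w`, and the toy matrices are all illustrative assumptions, not the authors' method):

```python
import numpy as np

def fuse_similarity(sim_a, sim_b, w=0.5):
    """Score-level fusion of two re-identification matchers.

    sim_a, sim_b: (num_queries, num_gallery) similarity matrices from
    two different methods; each is min-max normalized per query so the
    scores are on a comparable scale before mixing.
    """
    def norm(s):
        lo = s.min(axis=1, keepdims=True)
        hi = s.max(axis=1, keepdims=True)
        return (s - lo) / (hi - lo + 1e-12)
    return w * norm(sim_a) + (1 - w) * norm(sim_b)

# Toy example: two 2-query x 3-gallery score matrices.
a = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.4]])
b = np.array([[10.0, 9.0, 1.0], [2.0, 5.0, 9.0]])
fused = fuse_similarity(a, b)
pred = fused.argmax(axis=1)   # top-1 gallery match per query
```

In practice the fusion weight would be selected on a validation split, since the paper reports that the best-performing combination depends on species and training-set size.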
Citations: 0
Practical Video Object Detection via Feature Selection and Aggregation
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-30 | DOI: 10.1007/s11263-025-02700-3
Yuheng Shi, Tong Zhang, Xiaojie Guo
Citations: 0
Wild Animal Tracking with High-Quality Segment Anything Model and Domain Adaptation
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-29 | DOI: 10.1007/s11263-025-02710-1
Ganggang Huang, Fasheng Wang, Binbin Wang, Hanwei Li, Mingshu Zhang, Mengyin Wang, Fuming Sun, Haojie Li
Citations: 0
Bilateral Transformation of Biased Pseudo-Labels under Distribution Inconsistency
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-29 | DOI: 10.1007/s11263-025-02701-2
Ruibing Hou, Hong Chang, MinYang Hu, BingPeng Ma, Shiguang Shan, Xilin Chen
Citations: 0
Designing Extremely Memory-Efficient CNNs for On-device Vision and Audio Tasks
IF 19.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-29 | DOI: 10.1007/s11263-025-02688-w
Yoel Park, Jaewook Lee, Seulki Lee
In this paper, we introduce a memory-efficient CNN (convolutional neural network), which enables resource-constrained low-end embedded and IoT devices to perform on-device vision and audio tasks, such as image classification, object detection, and audio classification, using extremely low memory, i.e., only 63 KB on ImageNet classification. Based on the bottleneck block of MobileNet, we propose three design principles that significantly curtail the peak memory usage of a CNN so that it can fit the limited KB-scale memory of a low-end device. First, ‘input segmentation’ divides an input image into a set of patches, including a central patch overlapping the others, reducing the size (and memory requirement) of a large input image. Second, ‘patch tunneling’ builds independent tunnel-like paths consisting of multiple bottleneck blocks per patch, penetrating through the entire model from an input patch to the last layer of the network, maintaining lightweight memory usage throughout the whole network. Lastly, ‘bottleneck reordering’ rearranges the execution order of convolution operations inside the bottleneck block such that memory usage remains constant regardless of the size of the convolution output channels. We also present ‘peak memory aware quantization’, enabling the desired peak memory reduction in actual deployment of the quantized network. The experimental results show that the proposed network classifies ImageNet with extremely low memory (i.e., 63 KB) while achieving competitive top-1 accuracy (i.e., 61.58%). To the best of our knowledge, the memory usage of the proposed network is far smaller than that of state-of-the-art memory-efficient networks, i.e., up to 89x and 3.1x smaller than MobileNet (i.e., 5.6 MB) and MCUNet (i.e., 196 KB), respectively.
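The ‘input segmentation’ principle can be sketched in a few lines; the 2×2 corner layout plus one overlapping central patch below is an assumed arrangement for illustration, not necessarily the paper's exact configuration:

```python
import numpy as np

def segment_input(img, patch=112):
    """Split a square image into four corner patches plus one central
    patch that overlaps all of them. Each patch can then be processed
    by its own lightweight path (a 'tunnel'), so peak activation
    memory is bounded by a single patch rather than the full input."""
    h, w = img.shape[:2]
    c = (h - patch) // 2
    return [
        img[:patch, :patch],              # top-left
        img[:patch, w - patch:],          # top-right
        img[h - patch:, :patch],          # bottom-left
        img[h - patch:, w - patch:],      # bottom-right
        img[c:c + patch, c:c + patch],    # central, overlapping all four
    ]

x = np.zeros((224, 224, 3), dtype=np.float32)
parts = segment_input(x)
```

Because the slices are views rather than copies, the split itself adds no memory; the saving comes from running the network on 112×112 activations instead of 224×224 ones.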
Citations: 0