
International Journal of Computer Vision: Latest Articles

Exploiting Class-agnostic Visual Prior for Few-shot Keypoint Detection
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-16 | DOI: 10.1007/s11263-025-02671-5
Changsheng Lu, Hao Zhu, Piotr Koniusz
Deep learning based keypoint detectors can localize specific object (or body) parts well, but still fall short of general keypoint detection. Instead, few-shot keypoint detection (FSKD) is an underexplored yet more general task of localizing either base or novel keypoints, depending on the prompted support samples. In FSKD, how to build robust keypoint representations is the key to success. To this end, we propose an FSKD approach that models relations between keypoints. As keypoints are located on objects, we exploit a class-agnostic visual prior, i.e., the unsupervised saliency map or DINO attentiveness map, to obtain the region of focus within which we perform relation learning between object patches. The class-agnostic visual prior also helps suppress the background noise largely irrelevant to keypoint locations. Then, we propose a novel Visual Prior guided Vision Transformer (VPViT). The visual prior maps are refined by a bespoke morphology learner to include relevant context of objects. The masked self-attention of VPViT takes the adapted prior map as a soft mask to constrain the self-attention to foregrounds. As robust FSKD must also deal with the low number of support samples and occlusions, based on VPViT, we further investigate i) transductive FSKD to enhance keypoint representations with unlabeled data and ii) FSKD with masking and alignment (MAA) to improve robustness. We show that our model performs well on seven public datasets, and also significantly improves the accuracy in transductive inference and under occlusions. Source codes are available at https://github.com/AlanLuSun/VPViT.
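The abstract describes masked self-attention that uses the adapted prior map as a soft mask so attention concentrates on foreground patches. The sketch below, in PyTorch, shows one plausible way such a soft constraint can be wired in, by adding the log of the prior map as a bias on the attention logits; this bias formulation and the function name soft_masked_attention are assumptions for illustration, not the VPViT implementation.

```python
# Minimal sketch of self-attention constrained by a soft foreground prior,
# in the spirit of the masked self-attention described above. The log-prior
# bias is an illustrative assumption, not the paper's exact design.
import torch
import torch.nn.functional as F

def soft_masked_attention(q, k, v, prior, eps=1e-6):
    """q, k, v: (B, N, D) patch tokens; prior: (B, N) soft foreground map in [0, 1]."""
    d = q.shape[-1]
    logits = q @ k.transpose(-2, -1) / d ** 0.5           # (B, N, N) patch-to-patch similarity
    bias = torch.log(prior.clamp_min(eps)).unsqueeze(1)   # low prior -> large negative bias on keys
    attn = F.softmax(logits + bias, dim=-1)               # attention pushed toward foreground patches
    return attn @ v

# Toy usage: 4 patch tokens, the last two lying on background.
q = k = v = torch.randn(1, 4, 8)
prior = torch.tensor([[1.0, 0.9, 0.1, 0.05]])
out = soft_masked_attention(q, k, v, prior)
print(out.shape)  # torch.Size([1, 4, 8])
```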
Citations: 0
It’s Just Another Day: Unique Video Captioning by Discriminative Prompting
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1007/s11263-025-02664-4
Toby Perrett, Tengda Han, Dima Damen, Andrew Zisserman
Long videos contain many repeating actions, events and shots. These repetitions are frequently given identical captions, which makes it difficult to retrieve the exact desired clip using a text search. In this paper, we formulate the problem of unique captioning: given multiple clips with the same caption, we generate a new caption for each clip that uniquely identifies it. We propose Captioning by Discriminative Prompting (CDP), which predicts a property that can separate identically captioned clips, and uses it to generate unique captions. We introduce two benchmarks for unique captioning, based on egocentric footage and timeloop movies, where repeating actions are common. We demonstrate that captions generated by CDP improve text-to-video R@1 by 15% for egocentric videos and by 10% for timeloop movies. https://tobyperrett.github.io/its-just-another-day
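The abstract's core idea is to find a property that separates clips sharing an identical caption and then prompt the captioner with it. The toy sketch below isolates just the selection step: given hypothetical per-clip scores for candidate properties, it picks the property with the largest spread. The property names, scores, and the spread heuristic are placeholders introduced for illustration; the paper's actual CDP scoring and prompting pipeline may differ.

```python
# Toy illustration of picking the most discriminative property among clips
# that share an identical caption. Scores and property names are hypothetical.
from typing import Dict, List

def most_discriminative_property(scores: Dict[str, List[float]]) -> str:
    """scores[property] = one score per clip; larger spread = more discriminative."""
    def spread(vals: List[float]) -> float:
        return max(vals) - min(vals)
    return max(scores, key=lambda prop: spread(scores[prop]))

# Three clips that all received the caption "person chops vegetables".
candidate_scores = {
    "holds a knife": [0.90, 0.88, 0.91],  # true for every clip -> not discriminative
    "wears gloves":  [0.05, 0.92, 0.07],  # separates clip 2 from the others
}
print(most_discriminative_property(candidate_scores))  # wears gloves
```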
Citations: 0
Automatic Solver Generator for Systems of Laurent Polynomial Equations
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1007/s11263-025-02635-9
Evgeniy Martyushev, Snehal Bhayani, Tomas Pajdla
{"title":"Automatic Solver Generator for Systems of Laurent Polynomial Equations","authors":"Evgeniy Martyushev, Snehal Bhayani, Tomas Pajdla","doi":"10.1007/s11263-025-02635-9","DOIUrl":"https://doi.org/10.1007/s11263-025-02635-9","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"15 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1007/s11263-025-02608-y
Yuping He, Yifei Huang, Guo Chen, Lidong Lu, Baoqi Pei, Jilan Xu, Tong Lu, Yoichi Sato
{"title":"Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision","authors":"Yuping He, Yifei Huang, Guo Chen, Lidong Lu, Baoqi Pei, Jilan Xu, Tong Lu, Yoichi Sato","doi":"10.1007/s11263-025-02608-y","DOIUrl":"https://doi.org/10.1007/s11263-025-02608-y","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"29 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145961759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1007/s11263-025-02611-3
Yuqin Dai, Wanlu Zhu, Ronghui Li, Xiu Li, Zhenyu Zhang, Jun Li, Jian Yang
{"title":"TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography","authors":"Yuqin Dai, Wanlu Zhu, Ronghui Li, Xiu Li, Zhenyu Zhang, Jun Li, Jian Yang","doi":"10.1007/s11263-025-02611-3","DOIUrl":"https://doi.org/10.1007/s11263-025-02611-3","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"9 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1007/s11263-025-02651-9
Ivan Pereira-Sánchez, Eloi Sans, Julia Navarro, Joan Duran
{"title":"Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening","authors":"Ivan Pereira-Sánchez, Eloi Sans, Julia Navarro, Joan Duran","doi":"10.1007/s11263-025-02651-9","DOIUrl":"https://doi.org/10.1007/s11263-025-02651-9","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"27 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dynamic MAsk-Pruning Strategy for Source-Free Model Intellectual Property Protection
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1007/s11263-025-02619-9
Boyang Peng, Sanqing Qu, Yong Wu, Tianpei Zou, Lianghua He, Alois Knoll, Guang Chen, Changjun Jiang
{"title":"Dynamic MAsk-Pruning Strategy for Source-Free Model Intellectual Property Protection","authors":"Boyang Peng, Sanqing Qu, Yong Wu, Tianpei Zou, Lianghua He, Alois Knoll, Guang Chen, Changjun Jiang","doi":"10.1007/s11263-025-02619-9","DOIUrl":"https://doi.org/10.1007/s11263-025-02619-9","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"27 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Brain3D: Generating 3D Objects from fMRI
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1007/s11263-025-02609-x
Yuankun Yang, Li Zhang, Ziyang Xie, Zhiyuan Yuan, Jianfeng Feng, Xiatian Zhu, Yu-Gang Jiang
{"title":"Brain3D: Generating 3D Objects from fMRI","authors":"Yuankun Yang, Li Zhang, Ziyang Xie, Zhiyuan Yuan, Jianfeng Feng, Xiatian Zhu, Yu-Gang Jiang","doi":"10.1007/s11263-025-02609-x","DOIUrl":"https://doi.org/10.1007/s11263-025-02609-x","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"146 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Guest Editorial: Special Issue for the British Machine Vision Conference (BMVC), 2024 (Glasgow, Scotland, UK)
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1007/s11263-025-02721-y
Carlos Francisco Moreno-García, Gerardo Aragon Camarasa, Edmond S. L. Ho, Paul Henderson, Nicolas Pugeault, Jungong Han, Sergio Escalera
{"title":"Guest Editorial: Special Issue for the British Machine Vision Conference (BMVC), 2024 (Glasgow, Scotland, UK)","authors":"Carlos Francisco Moreno-García, Gerardo Aragon Camarasa, Edmond S. L. Ho, Paul Henderson, Nicolas Pugeault, Jungong Han, Sergio Escalera","doi":"10.1007/s11263-025-02721-y","DOIUrl":"https://doi.org/10.1007/s11263-025-02721-y","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"6 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cross-domain Few-shot Classification via Invariant-content Feature Reconstruction
IF 19.5 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-11 | DOI: 10.1007/s11263-025-02601-5
Hongduan Tian, Feng Liu, Ka Chun Cheung, Zhen Fang, Simon See, Tongliang Liu, Bo Han
In cross-domain few-shot classification (CFC), mainstream studies aim to train a simple module (e.g. a linear transformation head) to select or transform features (a.k.a., the high-level semantic features) for previously unseen domains with a few labeled training data available on top of a powerful pre-trained model. These studies usually assume that high-level semantic features are shared across these domains, and just simple feature selection or transformations are enough to adapt features to previously unseen domains. However, in this paper, we find that the simply transformed features are too general to fully cover the key content features regarding each class. Thus, we propose an effective method, invariant-content feature reconstruction (IFR), to train a simple module that simultaneously considers both high-level and fine-grained invariant-content features for the previously unseen domains. Specifically, the fine-grained invariant-content features are considered as a set of informative and discriminative features learned from a few labeled training data of tasks sampled from unseen domains and are extracted by retrieving features that are invariant to style modifications from a set of content-preserving augmented data in pixel level with an attention module. Extensive experiments on the Meta-Dataset benchmark show that IFR achieves good generalization performance on unseen domains, which demonstrates the effectiveness of the fusion of the high-level features and the fine-grained invariant-content features. Specifically, IFR improves the average accuracy on unseen domains by 1.6% and 6.5% respectively under two different cross-domain few-shot classification settings.
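The abstract describes extracting fine-grained invariant-content features by retrieving, with an attention module, features that stay stable across content-preserving pixel-level augmentations. The sketch below gives a minimal cross-attention reading of that description: original-image features attend over each augmented view and the retrieved features are averaged across views. The shapes, the averaging step, and the function name retrieve_invariant_features are assumptions for illustration, not the published IFR module.

```python
# Minimal sketch of retrieving augmentation-invariant content features with
# cross-attention, loosely following the description above. Shapes and the
# averaging over augmented views are illustrative assumptions.
import torch
import torch.nn.functional as F

def retrieve_invariant_features(query_feats, aug_feats):
    """query_feats: (N, D) features of the original image;
    aug_feats: (A, N, D) features of A content-preserving augmented views."""
    d = query_feats.shape[-1]
    retrieved = []
    for view in aug_feats:                                         # attend over each augmented view
        attn = F.softmax(query_feats @ view.T / d ** 0.5, dim=-1)  # (N, N) query-to-view attention
        retrieved.append(attn @ view)                              # features re-expressed via the view
    # Averaging across views keeps what survives every style modification.
    return torch.stack(retrieved).mean(dim=0)                      # (N, D)

query = torch.randn(16, 64)
augmented = torch.randn(4, 16, 64)
print(retrieve_invariant_features(query, augmented).shape)  # torch.Size([16, 64])
```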
Citations: 0