International Journal of Computer Vision: Latest Articles

A Survey of Multimodal Hallucination Evaluation and Detection
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-21; DOI: 10.1007/s11263-026-02756-9
Zhiyuan Chen, Yuecong Min, Jie Zhang, Bei Yan, Jiahao Wang, Xiaozhen Wang, Shiguang Shan
Citations: 0
B$^{3}$CT: Three-Branch Learning with Unlabeled Target Signals for Domain-Robust Semantic Segmentation
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-18; DOI: 10.1007/s11263-026-02782-7
Chen Liang, Xin Zhao, Jian Jia, Junyan Wang, Lijun Cao, Jianguo Zhang, Weihua Chen
Citations: 0
Semantic-Centric Alignment for Zero-shot Panoptic Segmentation with Limited Data
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-16; DOI: 10.1007/s11263-025-02648-4
Jialei Chen, Daisuke Deguchi, Dongyue Li, Xu Zheng, Seigo Ito, Hiroshi Murase, Qi Fan
Citations: 0
GLAD: Generative Language-Assisted Visual Tracking for Low-Semantic Templates
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-15; DOI: 10.1007/s11263-026-02774-7
Xingyu Luo, Yidong Cai, Jie Liu, Jie Tang, Gangshan Wu, Limin Wang
Citations: 0
SACG++: Complex Sketch Generation via Representation-Enhanced Scale-Adaptive Classifier Guidance
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-14; DOI: 10.1007/s11263-026-02768-5
Ke Li, Jijin Hu, Zhipeng Chen, Lan Yang, Yonggang Qi, Yi-Zhe Song
Citations: 0
Self-Balancing Multimodal Models via Multi-Loss Gradient Modulation
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-12; DOI: 10.1007/s11263-025-02696-w
Konstantinos Kontras, Christos Chatzichristos, Matthew Blaschko, Maarten De Vos
Citations: 0
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-12; DOI: 10.1007/s11263-025-02676-0
Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu
The pursuit of artificial general intelligence (AGI) has been accelerated by Multimodal Large Language Models (MLLMs), which exhibit superior reasoning, generalization capabilities, and proficiency in processing multimodal inputs. A crucial milestone in the evolution of AGI is the attainment of human-level planning, a fundamental ability for making informed decisions in complex environments, and solving a wide range of real-world problems. Despite the impressive advancements in MLLMs, a question remains: How far are current MLLMs from achieving human-level planning? To shed light on this question, we introduce EgoPlan-Bench, a comprehensive benchmark to evaluate the planning abilities of MLLMs in real-world scenarios from an egocentric perspective, mirroring human perception. EgoPlan-Bench emphasizes the evaluation of planning capabilities of MLLMs, featuring realistic tasks, diverse action plans, and intricate visual observations. Our rigorous evaluation of a wide range of MLLMs reveals that EgoPlan-Bench poses significant challenges, highlighting a substantial scope for improvement in MLLMs to achieve human-level task planning. To facilitate this advancement, we further present EgoPlan-IT, a specialized instruction-tuning dataset that effectively enhances model performance on EgoPlan-Bench. We have made all the codes, data, and a maintained benchmark leaderboard available at https://chenyi99.github.io/ego_plan/ to advance future research.
Citations: 0
Robust Partial-to-Partial Point Cloud Registration with Overlapping Mask Learning
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-12; DOI: 10.1007/s11263-026-02772-9
Hao Xu, Guanghui Liu, Bing Zeng, Shuaicheng Liu
Citations: 0
Robust Shape from Focus via Multiscale Directional Dilated Laplacian and Recurrent Network
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-11; DOI: 10.1007/s11263-025-02596-z
Khurram Ashfaq, Muhammad Tariq Mahmood
Citations: 0
Haze Hue and Haze Saturation Priors for Single Image Dehazing
IF 19.5; Zone 2 (Computer Science); Q1 Computer Science, Artificial Intelligence; Pub Date: 2026-02-11; DOI: 10.1007/s11263-025-02655-5
Sobhan K. Dhara, Mayukh Roy, Debashis Sen
Citations: 0