
International Journal of Computer Vision: Latest Publications

End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-09 · DOI: 10.1007/s11263-025-02654-6
Antonio Ríos-Vila, Jorge Calvo-Zaragoza, David Rizo, Thierry Paquet
Optical Music Recognition (OMR) has made significant progress since its inception, with various approaches now capable of accurately transcribing music scores into digital formats. Despite these advancements, most so-called end-to-end OMR approaches still rely on multi-stage processing pipelines for transcribing full-page score images, which entails challenges such as the need for dedicated layout analysis and specific annotated data, thereby limiting the general applicability of such methods. In this paper, we present the first truly end-to-end approach for page-level OMR in complex layouts. Our system, which combines convolutional layers with autoregressive Transformers, processes an entire music score page and outputs a complete transcription in a music encoding format. This is made possible by both the architecture and the training procedure, which utilizes curriculum learning through incremental synthetic data generation. We evaluate the proposed system on pianoform corpora, which are among the most complex sources in the OMR literature. This evaluation is conducted first in a controlled scenario with synthetic data, and subsequently against two real-world corpora of varying conditions. Our approach is compared with leading commercial OMR software. The results demonstrate that our system not only successfully transcribes full-page music scores but also outperforms the commercial tool both in zero-shot settings and after fine-tuning on the target domain, representing a significant contribution to the field of OMR.
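The abstract describes the architecture only at a high level: a convolutional feature extractor feeding an autoregressive Transformer decoder that emits a token sequence for the whole page. A minimal sketch of that image-to-sequence pattern in PyTorch is below; all dimensions, layer counts, and the vocabulary size are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

class PageToSequenceOMR(nn.Module):
    """Illustrative CNN encoder + autoregressive Transformer decoder.

    All sizes (d_model, vocab_size, layer counts) are made up for the
    sketch; the abstract states only the general architecture.
    """

    def __init__(self, vocab_size=512, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # Convolutional feature extractor over the full score page.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.embed = nn.Embedding(vocab_size, d_model)
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, page, tokens):
        # page: (B, 1, H, W) grayscale score image; tokens: (B, T) prefix.
        feats = self.backbone(page)                      # (B, C, H', W')
        memory = feats.flatten(2).transpose(1, 2)        # (B, H'*W', C)
        tgt = self.embed(tokens)
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        out = self.decoder(tgt, memory, tgt_mask=mask)
        return self.head(out)                            # (B, T, vocab_size)

model = PageToSequenceOMR()
logits = model(torch.randn(1, 1, 128, 96), torch.zeros(1, 5, dtype=torch.long))
print(logits.shape)  # torch.Size([1, 5, 512])
```

At inference time such a model would be run token by token, feeding each prediction back as the next prefix, which is what makes the decoding autoregressive.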
Citations: 0
Delving into Pre-training for Domain Transfer: A Broad Study of Pre-training for Domain Generalization and Domain Adaptation
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-09 · DOI: 10.1007/s11263-025-02590-5
Jungmyung Wi, Youngkyun Jang, Dujin Lee, Myeongseok Nam, Donghyun Kim
Citations: 0
Are Minimal Radial Distortion Solvers Really Necessary for Relative Pose Estimation?
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-09 · DOI: 10.1007/s11263-025-02657-3
Viktor Kocur, Charalambos Tzamos, Yaqing Ding, Zuzana Berger Haladova, Torsten Sattler, Zuzana Kukelova
Estimating the relative pose between two cameras is a fundamental step in many applications such as Structure-from-Motion. The common approach to relative pose estimation is to apply a minimal solver inside a RANSAC loop. Highly efficient solvers exist for pinhole cameras. Yet, (nearly) all cameras exhibit radial distortion. Not modeling radial distortion leads to (significantly) worse results. However, minimal radial distortion solvers are significantly more complex than pinhole solvers, both in terms of run-time and implementation effort. This paper compares radial distortion solvers with two simple-to-implement approaches that do not use minimal radial distortion solvers: The first approach combines an efficient pinhole solver with sampled radial undistortion parameters, where the sampled parameters are used for undistortion prior to applying the pinhole solver. The second approach uses a state-of-the-art neural network to estimate the distortion parameters rather than sampling them from a set of potential values. Extensive experiments on multiple datasets and different camera setups show that complex minimal radial distortion solvers are not necessary in practice. We discuss under which conditions a simple sampling of radial undistortion parameters is preferable over calibrating cameras using a learning-based prior approach. Code and a newly created benchmark for relative pose estimation under radial distortion are available at https://github.com/kocurvik/rdnet.
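The first approach the abstract compares can be sketched concretely: for each candidate undistortion parameter, undistort the correspondences and run an ordinary pinhole solver, keeping the hypothesis with the most inliers. The sketch below uses the one-parameter division model for undistortion as an illustrative choice (the paper's exact parameterization and solver are not specified in the abstract), and `pinhole_solver` is a hypothetical stand-in for something like a 5-point RANSAC routine.

```python
import numpy as np

def undistort_division(points, lam):
    """One-parameter division model: x_u = x_d / (1 + lam * ||x_d||^2).

    `points` is an (N, 2) array of normalized image coordinates. This is
    a common radial-distortion model, used here only as an example.
    """
    r2 = np.sum(points**2, axis=1, keepdims=True)
    return points / (1.0 + lam * r2)

def pose_with_sampled_distortion(pts1, pts2, lambdas, pinhole_solver):
    """Try each sampled undistortion parameter, undistort both point
    sets, run a standard pinhole solver, and keep the hypothesis with
    the most inliers. `pinhole_solver(u1, u2)` is assumed to return a
    (pose, inlier_count) pair.
    """
    best = (None, -1, None)
    for lam in lambdas:
        u1 = undistort_division(pts1, lam)
        u2 = undistort_division(pts2, lam)
        pose, inliers = pinhole_solver(u1, u2)
        if inliers > best[1]:
            best = (pose, inliers, lam)
    return best  # (pose, inlier count, chosen lambda)
```

The appeal of this scheme is that the inner solver stays the fast, well-tested pinhole one; the distortion search is pushed into an outer loop over a small set of sampled values.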
Citations: 0
OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02629-7
Weiqi Li, Shijie Zhao, Chong Mou, Xuhan Sheng, Zhenyu Zhang, Qian Wang, Junlin Li, Li Zhang, Jian Zhang
Citations: 0
Breaking Redundancy via 3D Sparse Geometry: 3D-aware Neural Compression for Multi-View Videos
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02604-2
Shiwei Wang, Liquan Shen, Jimin Xiao, Zhaoyi Tian, Feifeng Wang, Xiangyu Hu, Yao Zhu, Guorui Feng
Citations: 0
Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02653-7
Cheng Da, Peng Wang, Cong Yao
Citations: 0
Evidence Conflict Sampling for Open-set Active Learning
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02600-6
Kun-Peng Ning, Hai-Jian Ke, Jia-Yu Yao, Yu-Yang Liu, Yong-Hong Tian, Li Yuan
Citations: 0
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02649-3
Yiming Zhang, Yicheng Gu, Yanhong Zeng, Zhening Xing, Yuancheng Wang, Zhizheng Wu, Bin Liu, Kai Chen
Citations: 0
CAS-AIR-3D: A Large-scale Low-quality Multi-modal Face Database
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-07 · DOI: 10.1007/s11263-025-02674-2
Qi Li, Xiaoxiao Dong, Weining Wang, Zhenan Sun, Tieniu Tan, Caifeng Shan
Citations: 0
A Traditional Approach for Color Constancy and Color Assimilation Illusions with Its Applications to Low-Light Image Enhancement
IF 19.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-06 · DOI: 10.1007/s11263-025-02595-0
Oguzhan Ulucan, Diclehan Ulucan, Marc Ebner
The human visual system achieves color constancy, allowing consistent color perception under varying environmental contexts, while also being deceived by color illusions, where contextual information affects our perception. Despite the close relationship between color constancy and color illusions, and their potential benefits to the field, the two phenomena are rarely studied together in computer vision. In this study, we present the benefits of considering color illusions in the field of computer vision. In particular, we introduce a learning-free method, namely multiresolution color constancy, which combines insights from computational neuroscience and computer vision to address both phenomena within a single framework. Our approach performs color constancy in both multi- and single-illuminant scenarios, while it is also deceived by assimilation illusions. Additionally, we extend our method to low-light image enhancement, thus demonstrating its usability across different computer vision tasks. Through comprehensive experiments on color constancy, we show the effectiveness of our method in multi-illuminant and single-illuminant scenarios. Furthermore, we compare our method with state-of-the-art learning-based models on low-light image enhancement, where it shows competitive performance. This work presents the first method that integrates color constancy, color illusions, and low-light image enhancement in a single and explainable framework.
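The abstract does not give the algorithm's equations, but the core idea of a multiresolution, learning-free color-constancy method can be illustrated with a generic multiscale gray-world sketch: estimate the illuminant at several spatial scales, average the estimates, and divide the estimate out per channel. This is NOT the paper's exact method, only a minimal example of the multiresolution pattern it names.

```python
import numpy as np

def multiscale_gray_world(img, scales=(1, 4, 16)):
    """Illustrative multiresolution white balance (hypothetical, not the
    paper's algorithm): form a gray-world illuminant estimate at several
    block-averaging scales, average them, and correct each channel."""
    img = img.astype(np.float64)
    h, w, _ = img.shape
    estimates = []
    for s in scales:
        if h < s or w < s:
            continue  # skip scales coarser than the image itself
        hs, ws = h // s, w // s
        # Block-average down to (hs, ws), then take the per-channel
        # mean as this scale's illuminant estimate.
        small = img[: hs * s, : ws * s].reshape(hs, s, ws, s, 3).mean(axis=(1, 3))
        estimates.append(small.mean(axis=(0, 1)))
    illum = np.mean(estimates, axis=0)
    illum /= illum.mean()               # preserve overall brightness
    return np.clip(img / illum, 0, 255)
```

Under the gray-world assumption the scene average is achromatic, so dividing by the (normalized) estimated illuminant removes a global color cast while the multiscale averaging makes the estimate less sensitive to any single large colored region.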
Citations: 0