Pub Date: 2026-02-06 DOI: 10.1007/s11263-025-02667-1
Songtao Li, Hao Tang
{"title":"Multimodal Alignment and Fusion: A Survey","authors":"Songtao Li, Hao Tang","doi":"10.1007/s11263-025-02667-1","DOIUrl":"https://doi.org/10.1007/s11263-025-02667-1","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"46 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-06 DOI: 10.1007/s11263-025-02689-9
Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu
{"title":"High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling","authors":"Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu","doi":"10.1007/s11263-025-02689-9","DOIUrl":"https://doi.org/10.1007/s11263-025-02689-9","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"1 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-06 DOI: 10.1007/s11263-025-02663-5
Thang-Anh-Quan Nguyen, Amine Bourki, Mátyás Macudzinski, Anthony Brunel, Mohammed Bennamoun
{"title":"Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review","authors":"Thang-Anh-Quan Nguyen, Amine Bourki, Mátyás Macudzinski, Anthony Brunel, Mohammed Bennamoun","doi":"10.1007/s11263-025-02663-5","DOIUrl":"https://doi.org/10.1007/s11263-025-02663-5","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"92 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads","authors":"Federico Nocentini, Thomas Besnier, Claudio Ferrari, Sylvain Arguillere, Mohamed Daoudi, Stefano Berretti","doi":"10.1007/s11263-025-02726-7","DOIUrl":"https://doi.org/10.1007/s11263-025-02726-7","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"59 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-06 DOI: 10.1007/s11263-025-02711-0
Anne Marthe Sophie Ngo Bibinbe, Chiron Bang, Patrick Gagnon, Jamie Ahloy-Dallaire, Eric R. Paquet
{"title":"An HMM-Based Framework for Identity-Aware Long-Term Multi-Object Tracking From Sparse and Uncertain Identification: Use Case on Long-Term Tracking in Livestock","authors":"Anne Marthe Sophie Ngo Bibinbe, Chiron Bang, Patrick Gagnon, Jamie Ahloy-Dallaire, Eric R. Paquet","doi":"10.1007/s11263-025-02711-0","DOIUrl":"https://doi.org/10.1007/s11263-025-02711-0","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"91 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-06 DOI: 10.1007/s11263-025-02675-1
Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro, Mario Valerio Giuffrida, Giovanni Maria Farinella
{"title":"Exocentric-to-Egocentric Adaptation for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs","authors":"Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro, Mario Valerio Giuffrida, Giovanni Maria Farinella","doi":"10.1007/s11263-025-02675-1","DOIUrl":"https://doi.org/10.1007/s11263-025-02675-1","url":null,"abstract":"","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"9 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-05 DOI: 10.1007/s11263-025-02680-4
Yaqing Ding, Jian Yang, Zuzana Kukelova
Homography refers to a specific type of transformation that relates two images of the same planar surface taken from different perspectives. Recovering motion parameters from a homography matrix is a classic problem in computer vision. It is important to derive a fast and stable solution to homography decomposition, since it forms a critical component of many vision systems, e.g., in Structure-from-Motion and visual localization. The current state-of-the-art solvers can be categorized into two types of methods, the numerical procedures based on singular value decomposition (SVD), and the closed-form solution. The SVD-based methods are stable but time-consuming, while the existing closed-form solution is faster but less stable. In this paper, we discuss the homography decomposition problem from a different viewpoint. In contrast to the existing methods which focus on the properties of the homography matrix, we propose a new method that uses three random point correspondences to obtain the motion parameters in closed form. The proposed method is conceptually simple, easy to understand and implement, and has a good geometrical interpretation. This solution can be seen as an alternative to the existing closed-form solution. We also discuss the configurations where the closed-form solutions might be unstable and present a framework for homography decomposition taking into account both the efficiency and stability.
{"title":"Homography Decomposition Revisited","authors":"Yaqing Ding, Jian Yang, Zuzana Kukelova","doi":"10.1007/s11263-025-02680-4","DOIUrl":"https://doi.org/10.1007/s11263-025-02680-4","url":null,"abstract":"Homography refers to a specific type of transformation that relates two images of the same planar surface taken from different perspectives. Recovering motion parameters from a homography matrix is a classic problem in computer vision. It is important to derive a fast and stable solution to homography decomposition, since it forms a critical component of many vision systems, e.g., in Structure-from-Motion and visual localization. The current state-of-the-art solvers can be categorized into two types of methods, the numerical procedures based on singular value decomposition (SVD), and the closed-form solution. The SVD-based methods are stable but time-consuming, while the existing closed-form solution is faster but less stable. In this paper, we discuss the homography decomposition problem from a different viewpoint. In contrast to the existing methods which focus on the properties of the homography matrix, we propose a new method that uses three random point correspondences to obtain the motion parameters in closed form. The proposed method is conceptually simple, easy to understand and implement, and has a good geometrical interpretation. This solution can be seen as an alternative to the existing closed-form solution. We also discuss the configurations where the closed-form solutions might be unstable and present a framework for homography decomposition taking into account both the efficiency and stability.","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"12 1","pages":""},"PeriodicalIF":19.5,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146138691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
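As background for the abstract above, the following is a minimal sketch of the standard plane-induced homography that decomposition methods try to invert. It is not the three-point method proposed in the paper. It assumes calibrated (normalized) image coordinates, a plane satisfying n^T X = d in the first camera frame, and a relative motion X2 = R X1 + t; all function and variable names are illustrative.

```python
import math

def mat_vec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def plane_induced_homography(R, t, n, d):
    """Build H = R + t n^T / d for the plane n^T X = d (camera-1 frame)."""
    return [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]

def dehomogenize(v):
    """Scale a homogeneous 3-vector so that its last entry is 1."""
    return [v[0] / v[2], v[1] / v[2], 1.0]

# Assumed example motion: small rotation about the y-axis plus a translation.
angle = 0.1
c, s = math.cos(angle), math.sin(angle)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [0.2, 0.0, 0.05]

# Assumed example plane: Z = 2 in the first camera frame.
n = [0.0, 0.0, 1.0]
d = 2.0

H = plane_induced_homography(R, t, n, d)

# A 3D point on the plane, seen by both cameras.
X1 = [0.3, -0.4, 2.0]
X2 = [r + ti for r, ti in zip(mat_vec(R, X1), t)]

x1 = dehomogenize(X1)                      # normalized image point, view 1
x2 = dehomogenize(X2)                      # normalized image point, view 2
x2_from_H = dehomogenize(mat_vec(H, x1))   # view-1 point mapped by H

# The two should agree up to floating-point error.
max_error = max(abs(a - b) for a, b in zip(x2_from_H, x2))
print(max_error < 1e-12)
```

Homography decomposition is the inverse task: given H (known only up to scale), recover R, t/d, and n.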