IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

Editorial Introduction to the ICCV 2021 Special Section

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-04-08 DOI: 10.1109/TPAMI.2025.3546068

Dima Damen;Tal Hassner;Chris Pal;Yoichi Sato

Citations: 0

Revisiting One-stage Deep Uncalibrated Photometric Stereo via Fourier Embedding.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-04-02 DOI: 10.1109/TPAMI.2025.3557245

Yakun Ju, Boxin Shi, Bihan Wen, Kin-Man Lam, Xudong Jiang, Alex C Kot

This paper introduces a one-stage deep uncalibrated photometric stereo (UPS) network, the Fourier Uncalibrated Photometric Stereo Network (FUPS-Net), for non-Lambertian objects under unknown light directions. It departs from traditional two-stage methods that first explicitly learn lighting information and then estimate surface normals. Two-stage methods were adopted because the interplay of lighting with shading cues makes it difficult to estimate surface normals directly without explicit lighting information. However, the two stages are disjoint and trained separately, so errors in explicit light calibration propagate to the second stage and cannot be eliminated. In contrast, the proposed FUPS-Net utilizes an embedded Fourier transform network to implicitly learn lighting features by decomposing inputs, rather than employing a separate light estimation network. Our approach is motivated by observations in the Fourier domain of photometric stereo images: lighting information is mainly encoded in amplitudes, while geometry information is mainly associated with phases. Leveraging this property, our method "decomposes" geometry and lighting in the Fourier domain as guidance, via the proposed Fourier Embedding Extraction (FEE) block and Fourier Embedding Aggregation (FEA) block, which generate lighting and geometry features for the FUPS-Net to implicitly resolve the geometry-lighting ambiguity. Furthermore, we propose a Frequency-Spatial Weighted (FSW) block that assigns weights to combine features extracted from the frequency domain with those from the spatial domain to enhance surface reconstruction. FUPS-Net overcomes the limitations of two-stage UPS methods: it offers better training stability and a concise end-to-end structure, and it avoids the accumulated errors of disjoint networks. Experimental results on synthetic and real datasets demonstrate the superior performance of our approach and its simpler training setup, potentially paving the way for a new strategy in deep learning-based UPS methods.
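
The amplitude/phase observation above can be illustrated with a plain 2-D Fourier transform. The sketch below is not FUPS-Net or its FEE/FEA blocks; it only shows, under the assumption of single-channel images stored as NumPy arrays, how an image splits into an amplitude and a phase spectrum and how the two can be recombined or swapped between images. The function names are hypothetical.

```python
import numpy as np

def split_amplitude_phase(image):
    """Return the Fourier amplitude and phase spectra of a 2-D image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)

def merge_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum).real

# Random arrays stand in for two photometric stereo observations.
rng = np.random.default_rng(0)
image_a = rng.random((32, 32))
image_b = rng.random((32, 32))

amp_a, phase_a = split_amplitude_phase(image_a)
amp_b, phase_b = split_amplitude_phase(image_b)

# Recombining an image's own amplitude and phase recovers the image.
reconstructed = merge_amplitude_phase(amp_a, phase_a)

# Swapping amplitudes keeps the phase of image_a but borrows the
# amplitude of image_b, a common way to probe what each part encodes.
swapped = merge_amplitude_phase(amp_b, phase_a)
```

On real photometric stereo images, comparing `swapped` against the two inputs is one way to check by eye how much of the lighting follows the amplitude and how much of the shape follows the phase.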

Citations: 0

Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-04-02 DOI: 10.1109/TPAMI.2025.3557047

Linshan Wu, Zhun Zhong, Jiayi Ma, Yunchao Wei, Hao Chen, Leyuan Fang, Shutao Li

Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models from weak labels and is receiving significant attention due to its low annotation cost. Existing approaches focus on generating pseudo labels for supervision while largely neglecting the inherent semantic correlation among different pseudo labels. We observe that pseudo-labeled pixels that are close to each other in the feature space are more likely to share the same class, and those closer to the distribution centers tend to have higher confidence. Motivated by this, we propose to model the underlying label distributions and employ cross-label constraints to generate more accurate pseudo labels. In this paper, we develop a unified WSSS framework named Adaptive Gaussian Mixtures Model, which leverages a GMM to model the label distributions. Specifically, we calculate the feature distribution centers of pseudo-labeled pixels and build the GMM by measuring the distance between the centers and each pseudo-labeled pixel. Then, we introduce an Online Expectation-Maximization (OEM) algorithm and a novel maximization loss to optimize the GMM adaptively, aiming to learn more discriminative decision boundaries between different class-wise Gaussian mixtures. Based on the label distributions, we leverage the GMM to generate high-quality pseudo labels for more reliable supervision. Our framework handles different forms of weak labels: image-level labels, points, scribbles, blocks, and bounding boxes. Extensive experiments on the PASCAL, COCO, Cityscapes, and ADE20K datasets demonstrate that our framework can effectively provide more reliable supervision and outperform the state-of-the-art methods under all settings. Code will be available at https://github.com/Luffy03/AGMM-SASS.
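
The step of computing class-wise feature centers and scoring each pixel by its distance to them can be sketched as follows. This is a simplified illustration, not the authors' framework: it assumes one isotropic Gaussian per class with a shared, fixed `sigma` and uniform mixing weights, and it omits the OEM algorithm and the maximization loss. All function names are hypothetical.

```python
import numpy as np

def class_centers(features, pseudo_labels, num_classes):
    """Mean feature vector of the pixels currently assigned to each class."""
    return np.stack([features[pseudo_labels == c].mean(axis=0)
                     for c in range(num_classes)])

def gaussian_responsibilities(features, centers, sigma=1.0):
    """Soft class assignment from squared distances to the class centers.

    Assumes isotropic Gaussians with a shared sigma and equal mixing weights.
    """
    sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Toy pixel features: two well-separated clusters in a 2-D feature space.
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1], 50)
features = rng.normal(scale=0.5, size=(100, 2)) + 5.0 * true_labels[:, None]

# Corrupt a few pseudo labels to mimic noisy weak supervision.
noisy_labels = true_labels.copy()
flipped = [0, 1, 2, 50, 51, 52]
noisy_labels[flipped] = 1 - noisy_labels[flipped]

centers = class_centers(features, noisy_labels, num_classes=2)
resp = gaussian_responsibilities(features, centers)
refined_labels = resp.argmax(axis=1)
```

In this toy setting the centers are only slightly shifted by the flipped labels, so taking the most responsible component for each pixel corrects them; the maximum responsibility per pixel can also serve as a confidence score.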

Citations: 0

Interpreting Low-level Vision Models with Causal Effect Maps.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-04-02 DOI: 10.1109/TPAMI.2025.3557149

Jinfan Hu, Jinjin Gu, Shiyao Yu, Fanghua Yu, Zheyuan Li, Zhiyuan You, Chaochao Lu, Chao Dong

Deep neural networks have significantly improved the performance of low-level vision tasks but have also become harder to interpret. A deep understanding of deep models is beneficial for both network design and practical reliability. To take up this challenge, we introduce causality theory to interpret low-level vision models and propose a model-/task-agnostic method called Causal Effect Map (CEM). With CEM, we can visualize and quantify input-output relationships in terms of positive or negative effects. After analyzing various low-level vision tasks with CEM, we have reached several interesting insights, such as: (1) Using more information from input images (e.g., larger receptive field) does NOT always yield positive outcomes. (2) Attempting to incorporate mechanisms with a global receptive field (e.g., channel attention) into image denoising may prove futile. (3) Integrating multiple tasks to train a general model could encourage the network to prioritize local information over global context. Based on the causal effect theory, the proposed diagnostic tool can refresh our common knowledge and bring a deeper understanding of low-level vision models. Code is available at https://github.com/J-FHu/CEM.
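
The idea of quantifying positive or negative input-output effects can be illustrated with a crude patch-intervention experiment. The sketch below is not the CEM method from the paper; it assumes a toy 3x3 mean filter as the "model", replaces one input patch at a time with the image mean, and records how the error inside a region of interest changes. All names and the choice of intervention are assumptions made for illustration.

```python
import numpy as np

def box_blur(image):
    """Toy stand-in for a low-level vision model: a 3x3 mean filter."""
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def roi_error(model, image, target, roi):
    """Mean squared error of the model output inside the region of interest."""
    rows, cols = roi
    return float(np.mean((model(image)[rows, cols] - target[rows, cols]) ** 2))

def intervention_effect_map(model, image, target, roi, patch=4):
    """Change in ROI error when each input patch is replaced by the image mean.

    A positive entry means the error rose without the patch (the patch helped);
    a negative entry means the error fell (the patch hurt).
    """
    h, w = image.shape
    base = roi_error(model, image, target, roi)
    effects = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            intervened = image.copy()
            intervened[i * patch:(i + 1) * patch,
                       j * patch:(j + 1) * patch] = image.mean()
            effects[i, j] = roi_error(model, intervened, target, roi) - base
    return effects

rng = np.random.default_rng(0)
clean = rng.random((16, 16))
noisy = clean + 0.1 * rng.standard_normal((16, 16))
roi = (slice(6, 10), slice(6, 10))
effects = intervention_effect_map(box_blur, noisy, clean, roi)
```

Because the toy model has a 3x3 receptive field, patches far from the region of interest have exactly zero effect, which is the kind of locality a map like this makes visible.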

Citations: 0

Calibration-Free Raw Image Denoising via Fine-Grained Noise Estimation.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-24 DOI: 10.1109/TPAMI.2025.3550264

Yunhao Zou, Ying Fu, Yulun Zhang, Tao Zhang, Chenggang Yan, Radu Timofte

Image denoising has progressed significantly due to the development of effective deep denoisers. To improve performance in real-world scenarios, recent work tends either to formulate better noise models for generating realistic training data or to estimate noise levels to steer non-blind denoisers. In this paper, we bridge both strategies by presenting an innovative noise estimation and realistic noise synthesis pipeline. Specifically, we integrate a fine-grained statistical noise model with a contrastive learning strategy, using a unique data augmentation to enhance learning ability. Then, we use this model to estimate noise parameters on the evaluation dataset, which are subsequently used to craft a camera-specific noise distribution and synthesize realistic noise. One distinguishing feature of our methodology is its adaptability: our pre-trained model can directly estimate noise parameters for unknown cameras, making it possible to model the noise of unfamiliar sensors using only testing images, without calibration frames or paired training data. Another highlight is our attempt to estimate parameters for fine-grained noise models, which extends the applicability to even more challenging low-light conditions. Through empirical testing, our calibration-free pipeline demonstrates effectiveness in both normal and low-light scenarios, further solidifying its utility in real-world noise synthesis and denoising tasks.
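
Synthesizing noise from estimated parameters can be sketched with the common Poisson-Gaussian (shot plus read noise) model. This is an assumption made for illustration: the abstract does not specify the paper's fine-grained noise model, which is likely richer. The parameters `gain` and `read_std` and the function name are hypothetical.

```python
import numpy as np

def synthesize_raw_noise(clean, gain, read_std, rng):
    """Add signal-dependent shot noise and signal-independent read noise.

    clean    : noise-free raw intensities (non-negative)
    gain     : overall system gain (intensity units per photoelectron)
    read_std : standard deviation of the Gaussian read noise
    """
    photoelectrons = rng.poisson(clean / gain)
    read_noise = rng.normal(0.0, read_std, size=clean.shape)
    return gain * photoelectrons + read_noise

rng = np.random.default_rng(0)
clean = np.full(100_000, 100.0)
noisy = synthesize_raw_noise(clean, gain=2.0, read_std=3.0, rng=rng)

# Under this model the mean is preserved and the variance is
# gain * clean + read_std**2, i.e. 2 * 100 + 9 = 209 here.
sample_mean = noisy.mean()
sample_var = noisy.var()
```

The linear relation between variance and intensity is what makes the two parameters identifiable from data in this simplified model: a line fitted to variance versus mean has slope `gain` and intercept `read_std**2`.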

Citations: 0

Addressing Information Asymmetry: Deep Temporal Causality Discovery for Mixed Time Series.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-24 DOI: 10.1109/TPAMI.2025.3553957

Jiawei Chen, Chunhui Zhao

While existing causal discovery methods mostly focus on continuous time series, causal discovery for mixed time series encompassing both continuous variables (CVs) and discrete variables (DVs) is a fundamental yet underexplored problem. Together with nonlinearity and high dimensionality, mixed time series pose significant challenges for causal discovery. This study addresses the aforementioned challenges based on the following recognitions: (1) DVs may originate from latent continuous variables (LCVs) and undergo discretization processes due to measurement limitations, storage requirements, and other reasons. (2) LCVs contain fine-grained information and interact with CVs. By leveraging these interactions, the intrinsic continuity of DVs can be recovered. Thereupon, we propose a generic deep mixed time series temporal causal discovery framework. Our key idea is to adaptively recover LCVs from DVs with the guidance of CVs and perform causal discovery in a unified continuous-valued space. Technically, a new contextual adaptive Gaussian kernel embedding technique is developed for latent continuity recovery by adaptively aggregating temporal contextual information of DVs. Accordingly, two interdependent model training stages are devised for learning the latent continuity recovery with self-supervision and causal structure learning with sparsity-induced optimization. Experimentally, extensive empirical evaluations and in-depth investigations validate the superior performance of our framework. Our code and data are available at https://github.com/chunhuiz/MiTCD.
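
The intuition that a discretized variable can be pulled back toward a continuous signal by aggregating its temporal context can be sketched with a fixed Gaussian kernel. This is not the paper's contextual adaptive embedding, which is learned and guided by the continuous variables; the sketch assumes a fixed `bandwidth`, a fixed window, and a single discrete series. The function name is hypothetical.

```python
import numpy as np

def gaussian_context_recovery(discrete, bandwidth=3.0, half_window=9):
    """Kernel-weighted average of a discrete series over its temporal context."""
    discrete = np.asarray(discrete, dtype=float)
    offsets = np.arange(-half_window, half_window + 1)
    kernel = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    weighted = np.convolve(discrete, kernel, mode="same")
    # Normalize by the kernel mass actually inside the series (handles edges).
    norm = np.convolve(np.ones_like(discrete), kernel, mode="same")
    return weighted / norm

# A latent continuous signal observed only after coarse discretization.
t = np.arange(400)
latent = np.sin(2.0 * np.pi * t / 100.0)
observed = np.round(latent * 2.0) / 2.0  # levels spaced 0.5 apart

recovered = gaussian_context_recovery(observed)
quantized_rmse = float(np.sqrt(np.mean((observed - latent) ** 2)))
recovered_rmse = float(np.sqrt(np.mean((recovered - latent) ** 2)))
```

On this toy signal the smoothed series is closer to the latent one than the raw discrete observations are, which is the property a causal discovery method working in a continuous-valued space would rely on.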

Citations: 0

Learning-aided Neighborhood Search for Vehicle Routing Problems.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-24 DOI: 10.1109/TPAMI.2025.3554669

Tong Guo, Yi Mei, Mengjie Zhang, Haoran Zhao, Kaiquan Cai, Wenbo Du

The Vehicle Routing Problem (VRP) is a classic optimization problem with diverse real-world applications. The neighborhood search has emerged as an effective approach, yielding high-quality solutions across different VRPs. However, most existing studies exhaustively explore all considered neighborhoods with a pre-fixed order, leading to an inefficient search process. To address this issue, this paper proposes a Learning-aided Neighborhood Search algorithm (LaNS) that employs a cutting-edge multi-agent reinforcement learning-driven adaptive operator/neighborhood selection mechanism to achieve efficient routing for VRP. Within this framework, two agents serve as high-level instructors, collaboratively guiding the search direction by selecting perturbation/improvement operators from a pool of low-level heuristics. Furthermore, to equip the agents with comprehensive information for learning guidance knowledge, we have developed a new informative state representation. This representation transforms the spatial route structures into an image-like tensor, allowing us to extract spatial features using a convolutional neural network. Comprehensive evaluations on diverse VRP benchmarks, including the capacitated VRP (CVRP), multi-depot VRP (MDVRP) and cumulative multi-depot VRP with energy constraints, demonstrate LaNS's superiority over the state-of-the-art neighborhood search methods as well as the existing learning-guided neighborhood search algorithms.
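
Turning spatial route structures into an image-like tensor can be sketched as a simple rasterization. The layout below is an assumption, not the state representation used by LaNS: coordinates are taken to lie in the unit square, node 0 is taken to be the depot, channel 0 counts nodes per grid cell, and channel 1 marks cells crossed by any route edge. The function name and grid size are hypothetical.

```python
import numpy as np

def rasterize_routes(coords, routes, grid=16, samples_per_edge=32):
    """Encode node positions and route edges as a (2, grid, grid) tensor.

    coords : (n, 2) array of x, y positions in [0, 1]; node 0 is the depot
    routes : list of routes, each a list of customer indices (depot implied)
    """
    tensor = np.zeros((2, grid, grid), dtype=np.float32)

    # Channel 0: number of nodes falling in each cell.
    cells = np.clip((coords * grid).astype(int), 0, grid - 1)
    np.add.at(tensor[0], (cells[:, 1], cells[:, 0]), 1.0)

    # Channel 1: cells touched by route edges, sampled along each segment.
    t = np.linspace(0.0, 1.0, samples_per_edge)[:, None]
    for route in routes:
        closed = [0, *route, 0]  # start and end at the depot
        for a, b in zip(closed[:-1], closed[1:]):
            points = (1.0 - t) * coords[a] + t * coords[b]
            pc = np.clip((points * grid).astype(int), 0, grid - 1)
            tensor[1, pc[:, 1], pc[:, 0]] = 1.0
    return tensor

rng = np.random.default_rng(0)
coords = rng.random((9, 2))          # depot plus eight customers
routes = [[1, 2, 3, 4], [5, 6, 7, 8]]
state = rasterize_routes(coords, routes)
```

A tensor of this shape can be fed to a convolutional network directly, so local spatial patterns such as crossing or overlapping routes become visible to the convolutional filters.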

Citations: 0

Graph Prompt Clustering.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-20 DOI: 10.1109/TPAMI.2025.3553129

Man-Sheng Chen, Pei-Yuan Lai, De-Zhang Liao, Chang-Dong Wang, Jian-Huang Lai

Due to the wide availability of unlabeled graph-structured data (e.g., molecular structures), graph-level clustering, whose goal is to divide the input graphs into several disjoint groups, has recently attracted increasing attention. However, existing methods habitually focus on learning graph embeddings with different graph regularizations and seldom address the obvious differences in data distributions across distinct graph-level datasets. How to account for the characteristics of multiple graph-level datasets in a general, well-designed model without prior knowledge is still challenging. In view of this, we propose a novel Graph Prompt Clustering (GPC) method. This model has two main modules: graph model pretraining, and prompt and finetuning. In the graph model pretraining module, the graph model is pretrained on a selected source graph-level dataset with mutual information maximization and self-supervised clustering regularization. In the prompt and finetuning module, the network parameters of the pretrained graph model are frozen, and a group of learnable prompt vectors assigned to each graph-level representation is trained to adapt to different target graph-level datasets with various data distributions. Experimental results across six benchmark datasets demonstrate the impressive generalization capability and effectiveness of GPC compared with the state-of-the-art methods.

Citations: 0

CLRNetV2: A Faster and Stronger Lane Detector.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3551935

Tu Zheng, Yifei Huang, Yang Liu, Binbin Lin, Zheng Yang, Deng Cai, Xiaofei He

Lanes are critical to the vision navigation systems of intelligent vehicles. A lane is a traffic sign with high-level semantics, yet it also has a specific local pattern that requires detailed low-level features to localize accurately. Using different feature levels is of great importance for accurate lane detection, but it is still under-explored. In addition, current lane detection methods still struggle to detect complex dense lanes, such as Y-shaped or fork-shaped lanes. In this work, we present the Cross Layer Refinement Network, which aims to fully utilize both high-level and low-level features in lane detection. In particular, it first detects lanes with high-level semantic features and then performs refinement based on low-level features. In this way, we can exploit more contextual information to detect lanes while leveraging detailed local features to improve localization accuracy. We present Fast-ROIGather to gather global context, which further enhances the representation of lane features. To detect dense lanes accurately, we propose the Correlation Discrimination Module (CDM) to discriminate the correlation of dense lanes, enabling nearly cost-free high-quality dense lane prediction. In addition to our novel network design, we introduce the LineIoU loss, which regresses lanes as a whole unit to improve localization accuracy. Experiments demonstrate that our approach significantly outperforms the state-of-the-art lane detection methods.
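
Regressing a lane as a whole unit can be sketched with a segment-based line IoU. The formulation below is an assumption, since the abstract does not give the formula: each lane is represented by its x-coordinates at a shared set of image rows, every point is widened into a horizontal segment of half-width `radius`, and the overlaps and unions are summed over all rows before taking the ratio. The function name and the default radius are hypothetical.

```python
import numpy as np

def line_iou_loss(pred_x, target_x, radius=15.0):
    """1 - LineIoU for two lanes sampled at the same image rows.

    pred_x, target_x : x-coordinates of the lane at each sampled row
    radius           : half-width used to extend each point into a segment
    """
    pred_x = np.asarray(pred_x, dtype=float)
    target_x = np.asarray(target_x, dtype=float)

    # Signed overlap: negative when the two segments do not touch, so the
    # loss keeps growing as the lanes move further apart.
    overlap = (np.minimum(pred_x + radius, target_x + radius)
               - np.maximum(pred_x - radius, target_x - radius))
    union = (np.maximum(pred_x + radius, target_x + radius)
             - np.minimum(pred_x - radius, target_x - radius))

    # Sum over rows first so the lane is scored as a single unit.
    return 1.0 - overlap.sum() / union.sum()

target = np.linspace(100.0, 200.0, 72)
loss_perfect = line_iou_loss(target, target)          # identical lanes
loss_shifted = line_iou_loss(target + 10.0, target)   # 10 px offset
loss_far = line_iou_loss(target + 60.0, target)       # no overlap at all
```

With a radius of 15, a uniform 10-pixel shift leaves an overlap of 20 against a union of 40 at every row, so the loss is 0.5; a shift larger than twice the radius gives a loss above 1, which still provides a useful signal for predictions that miss the lane entirely.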


Citations: 0

Systematic Bias of Machine Learning Regression Models and Correction.

IEEE transactions on pattern analysis and machine intelligence

Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552368

Hwiyoung Lee, Shuo Chen

Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values that deviate substantially from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased (underestimating actual values), while those for small-valued outcomes are positively biased (overestimating actual values). We refer to this linear bias, warped toward the central tendency, as the "systematic bias of machine learning regression". In this paper, we first demonstrate that this systematic prediction bias persists across various machine learning regression models, and then delve into its theoretical underpinnings. To address this issue, we propose a general constrained optimization approach designed to correct this bias and develop computationally efficient implementation algorithms. Simulation results indicate that our correction method effectively eliminates the bias from the predicted outcomes. We apply the proposed approach to the prediction of brain age using neuroimaging data. In comparison to competing machine learning regression models, our method effectively addresses the longstanding issue of "systematic bias of machine learning regression" in neuroimaging-based brain age calculation, yielding unbiased predictions of brain age.
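The bias pattern described in this abstract can be reproduced with a small simulation. The sketch below is not the authors' correction method; it only illustrates the phenomenon using ordinary least squares on synthetic data. The data-generating process, the sample size, the seed, and the helper names (`fit_ols`, `simulate`, `mean_error_by_tail`) are all assumptions made for illustration.

```python
import random
import statistics


def fit_ols(x, y):
    """Return (slope, intercept) of the least-squares line y ~ x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def simulate(n=2000, noise=1.0, seed=0):
    """Synthetic outcome y = x + noise; return true y and prediction errors."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [xi + rng.gauss(0.0, noise) for xi in x]
    slope, intercept = fit_ols(x, y)
    pred = [slope * xi + intercept for xi in x]
    err = [p - t for p, t in zip(pred, y)]  # positive means overestimation
    return y, err


def mean_error_by_tail(y, err, frac=0.1):
    """Mean prediction error in the lowest and highest `frac` of true values."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    k = max(1, int(len(y) * frac))
    low = statistics.fmean([err[i] for i in order[:k]])
    high = statistics.fmean([err[i] for i in order[-k:]])
    return low, high


y, err = simulate()
low_bias, high_bias = mean_error_by_tail(y, err)
bias_slope, _ = fit_ols(y, err)  # slope of error regressed on the true value
print(f"mean error, smallest outcomes: {low_bias:+.3f}")
print(f"mean error, largest outcomes:  {high_bias:+.3f}")
print(f"slope of error vs. true value: {bias_slope:+.3f}")
```

For an in-sample least-squares fit, the slope of the prediction error against the true outcome equals -(1 - R²), so it is negative whenever the fit is imperfect. This is the linear, mean-directed pattern the abstract describes: small outcomes are overestimated and large outcomes are underestimated.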


Citations: 0
