首页 > 最新文献

IEEE transactions on pattern analysis and machine intelligence最新文献

英文 中文
RankFeat&RankWeight: Rank-1 Feature/Weight Removal for Out-of-Distribution Detection rankfeature & rankweight:用于非分布检测的Rank-1特征/权重去除
Pub Date : 2024-12-23 DOI: 10.1109/TPAMI.2024.3520899
Yue Song;Wei Wang;Nicu Sebe
The task of out-of-distribution (OOD) detection is crucial for deploying machine learning models in real-world settings. In this paper, we observe that the singular value distributions of the in-distribution (ID) and OOD features are quite different: the OOD feature matrix tends to have a larger dominant singular value than the ID feature, and the class predictions of OOD samples are largely determined by it. This observation motivates us to propose RankFeat, a simple yet effective post hoc approach for OOD detection by removing the rank-1 matrix composed of the largest singular value and the associated singular vectors from the high-level feature. RankFeat achieves state-of-the-art performance and reduces the average false positive rate (FPR95) by 17.90% compared with the previous best method. The success of RankFeat motivates us to investigate whether a similar phenomenon would exist in the parameter matrices of neural networks. We thus propose RankWeight which removes the rank-1 weight from the parameter matrices of a single deep layer. Our RankWeight is also post hoc and only requires computing the rank-1 matrix once. As a standalone approach, RankWeight has very competitive performance against other methods across various backbones. Moreover, RankWeight enjoys flexible compatibility with a wide range of OOD detection methods. The combination of RankWeight and RankFeat refreshes the new state-of-the-art performance, achieving the FPR95 as low as 16.13% on the ImageNet-1k benchmark. Extensive ablation studies and comprehensive theoretical analyses are presented to support the empirical results.
{"title":"RankFeat&RankWeight: Rank-1 Feature/Weight Removal for Out-of-Distribution Detection","authors":"Yue Song;Wei Wang;Nicu Sebe","doi":"10.1109/TPAMI.2024.3520899","DOIUrl":"10.1109/TPAMI.2024.3520899","url":null,"abstract":"The task of out-of-distribution (OOD) detection is crucial for deploying machine learning models in real-world settings. In this paper, we observe that the singular value distributions of the in-distribution (ID) and OOD features are quite different: the OOD feature matrix tends to have a larger dominant singular value than the ID feature, and the class predictions of OOD samples are largely determined by it. This observation motivates us to propose <monospace>RankFeat</monospace>, a simple yet effective <italic>post hoc</i> approach for OOD detection by removing the rank-1 matrix composed of the largest singular value and the associated singular vectors from the high-level feature. <monospace>RankFeat</monospace> achieves <italic>state-of-the-art</i> performance and reduces the average false positive rate (FPR95) by 17.90% compared with the previous best method. The success of <monospace>RankFeat</monospace> motivates us to investigate whether a similar phenomenon would exist in the parameter matrices of neural networks. We thus propose <monospace>RankWeight</monospace> which removes the rank-1 weight from the parameter matrices of a single deep layer. Our <monospace>RankWeight</monospace> is also <italic>post hoc</i> and only requires computing the rank-1 matrix once. As a standalone approach, <monospace>RankWeight</monospace> has very competitive performance against other methods across various backbones. Moreover, <monospace>RankWeight</monospace> enjoys flexible compatibility with a wide range of OOD detection methods. The combination of <monospace>RankWeight</monospace> and <monospace>RankFeat</monospace> refreshes the new <italic>state-of-the-art</i> performance, achieving the FPR95 as low as 16.13% on the ImageNet-1k benchmark. Extensive ablation studies and comprehensive theoretical analyses are presented to support the empirical results.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2505-2519"},"PeriodicalIF":0.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10812065","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142879951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Manifold Based Multi-View K-Means 基于流形的多视图k均值
Pub Date : 2024-12-23 DOI: 10.1109/TPAMI.2024.3521022
Quanxue Gao;Fangfang Li;Qianqian Wang;Xinbo Gao;Dacheng Tao
Although numerous clustering algorithms have been developed, many existing methods still rely on the K-means technique to identify clusters of data points. However, the performance of K-means is highly dependent on the accurate estimation of cluster centers, which is challenging to achieve optimally. Furthermore, it struggles to handle linearly non-separable data. To address these limitations, from the perspective of manifold learning, we reformulate multi-view K-means into a manifold-based multi-view clustering formulation that eliminates the need for computing centroid matrix. This reformulation ensures consistency between the manifold structure and the data labels. Building on this, we propose a novel multi-view K-means model incorporating the tensor rank constraint. Our model employs the indicator matrices from different views to construct a third-order tensor, whose rank is minimized via the tensor Schatten p-norm. This approach effectively leverages the complementary information across views. By utilizing different distance functions, our proposed model can effectively handle linearly non-separable data. Extensive experimental results on multiple databases demonstrate the superiority of our proposed model.
{"title":"Manifold Based Multi-View K-Means","authors":"Quanxue Gao;Fangfang Li;Qianqian Wang;Xinbo Gao;Dacheng Tao","doi":"10.1109/TPAMI.2024.3521022","DOIUrl":"10.1109/TPAMI.2024.3521022","url":null,"abstract":"Although numerous clustering algorithms have been developed, many existing methods still rely on the <italic>K</i>-means technique to identify clusters of data points. However, the performance of <italic>K</i>-means is highly dependent on the accurate estimation of cluster centers, which is challenging to achieve optimally. Furthermore, it struggles to handle linearly non-separable data. To address these limitations, from the perspective of manifold learning, we reformulate multi-view <italic>K</i>-means into a manifold-based multi-view clustering formulation that eliminates the need for computing centroid matrix. This reformulation ensures consistency between the manifold structure and the data labels. Building on this, we propose a novel multi-view <italic>K</i>-means model incorporating the tensor rank constraint. Our model employs the indicator matrices from different views to construct a third-order tensor, whose rank is minimized via the tensor Schatten <italic>p</i>-norm. This approach effectively leverages the complementary information across views. By utilizing different distance functions, our proposed model can effectively handle linearly non-separable data. Extensive experimental results on multiple databases demonstrate the superiority of our proposed model.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"3175-3182"},"PeriodicalIF":0.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142879948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption 红外与可见光图像融合:从数据兼容到任务适应
Pub Date : 2024-12-23 DOI: 10.1109/TPAMI.2024.3521416
Jinyuan Liu;Guanyao Wu;Zhu Liu;Di Wang;Zhiying Jiang;Long Ma;Wei Zhong;Xin Fan;Risheng Liu
Infrared-visible image fusion (IVIF) is a fundamental and critical task in the field of computer vision. Its aim is to integrate the unique characteristics of both infrared and visible spectra into a holistic representation. Since 2018, growing amount and diversity IVIF approaches step into a deep-learning era, encompassing introduced a broad spectrum of networks or loss functions for improving visual enhancement. As research deepens and practical demands grow, several intricate issues like data compatibility, perception accuracy, and efficiency cannot be ignored. Regrettably, there is a lack of recent surveys that comprehensively introduce and organize this expanding domain of knowledge. Given the current rapid development, this paper aims to fill the existing gap by providing a comprehensive survey that covers a wide array of aspects. Initially, we introduce a multi-dimensional framework to elucidate the prevalent learning-based IVIF methodologies, spanning topics from basic visual enhancement strategies to data compatibility, task adaptability, and further extensions. Subsequently, we delve into a profound analysis of these new approaches, offering a detailed lookup table to clarify their core ideas. Last but not the least, We also summarize performance comparisons quantitatively and qualitatively, covering registration, fusion and follow-up high-level tasks. Beyond delving into the technical nuances of these learning-based fusion approaches, we also explore potential future directions and open issues that warrant further exploration by the community.
{"title":"Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption","authors":"Jinyuan Liu;Guanyao Wu;Zhu Liu;Di Wang;Zhiying Jiang;Long Ma;Wei Zhong;Xin Fan;Risheng Liu","doi":"10.1109/TPAMI.2024.3521416","DOIUrl":"10.1109/TPAMI.2024.3521416","url":null,"abstract":"Infrared-visible image fusion (IVIF) is a fundamental and critical task in the field of computer vision. Its aim is to integrate the unique characteristics of both infrared and visible spectra into a holistic representation. Since 2018, growing amount and diversity IVIF approaches step into a deep-learning era, encompassing introduced a broad spectrum of networks or loss functions for improving visual enhancement. As research deepens and practical demands grow, several intricate issues like data compatibility, perception accuracy, and efficiency cannot be ignored. Regrettably, there is a lack of recent surveys that comprehensively introduce and organize this expanding domain of knowledge. Given the current rapid development, this paper aims to fill the existing gap by providing a comprehensive survey that covers a wide array of aspects. Initially, we introduce a multi-dimensional framework to elucidate the prevalent learning-based IVIF methodologies, spanning topics from basic visual enhancement strategies to data compatibility, task adaptability, and further extensions. Subsequently, we delve into a profound analysis of these new approaches, offering a detailed lookup table to clarify their core ideas. Last but not the least, We also summarize performance comparisons quantitatively and qualitatively, covering registration, fusion and follow-up high-level tasks. Beyond delving into the technical nuances of these learning-based fusion approaches, we also explore potential future directions and open issues that warrant further exploration by the community.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2349-2369"},"PeriodicalIF":0.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142879950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Multi-Objective Convex Quantization for Efficient Model Compression 高效模型压缩的多目标凸量化
Pub Date : 2024-12-23 DOI: 10.1109/TPAMI.2024.3521589
Chunxiao Fan;Dan Guo;Ziqi Wang;Meng Wang
Quantization is one of the efficient model compression methods, which represents the network with fixed-point or low-bit numbers. Existing quantization methods address the network quantization by treating it as a single-objective optimization that pursues high accuracy (performance optimization) while keeping the quantization constraint. However, owing to the non-differentiability of the quantization operation, it is challenging to integrate the quantization operation into the network training and achieve optimal parameters. In this paper, a novel multi-objective convex quantization for efficient model compression is proposed. Specifically, the network training is modeled as a multi-objective optimization to find the network with both high precision and low quantization error (actually, these two goals are somewhat contradictory and affect each other). To achieve effective multi-objective optimization, this paper designs a quantization error function that is differentiable and ensures the computation convexity in each period, so as to avoid the non-differentiable back-propagation of the quantization operation. Then, we perform a time-series self-distillation training scheme on the multi-objective optimization framework, which distills its past softened labels and combines the hard targets to guarantee controllable and stable performance convergence during training. At last and more importantly, a new dynamic Lagrangian coefficient adaption is designed to adjust the gradient magnitude of quantization loss and performance loss and balance the two losses during training processing. The proposed method is evaluated on well-known benchmarks: MNIST, CIFAR-10/100, ImageNet, Penn Treebank and Microsoft COCO, and experimental results show that the proposed method achieves outstanding performance compared to existing methods.
{"title":"Multi-Objective Convex Quantization for Efficient Model Compression","authors":"Chunxiao Fan;Dan Guo;Ziqi Wang;Meng Wang","doi":"10.1109/TPAMI.2024.3521589","DOIUrl":"10.1109/TPAMI.2024.3521589","url":null,"abstract":"Quantization is one of the efficient model compression methods, which represents the network with fixed-point or low-bit numbers. Existing quantization methods address the network quantization by treating it as a single-objective optimization that pursues high accuracy (performance optimization) while keeping the quantization constraint. However, owing to the non-differentiability of the quantization operation, it is challenging to integrate the quantization operation into the network training and achieve optimal parameters. In this paper, a novel multi-objective convex quantization for efficient model compression is proposed. Specifically, the network training is modeled as a multi-objective optimization to find the network with both high precision and low quantization error (actually, these two goals are somewhat contradictory and affect each other). To achieve effective multi-objective optimization, this paper designs a quantization error function that is differentiable and ensures the computation convexity in each period, so as to avoid the non-differentiable back-propagation of the quantization operation. Then, we perform a time-series self-distillation training scheme on the multi-objective optimization framework, which distills its past softened labels and combines the hard targets to guarantee controllable and stable performance convergence during training. At last and more importantly, a new dynamic Lagrangian coefficient adaption is designed to adjust the gradient magnitude of quantization loss and performance loss and balance the two losses during training processing. The proposed method is evaluated on well-known benchmarks: MNIST, CIFAR-10/100, ImageNet, Penn Treebank and Microsoft COCO, and experimental results show that the proposed method achieves outstanding performance compared to existing methods.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2313-2329"},"PeriodicalIF":0.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142879595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Federated Multi-View K-Means Clustering 联邦多视图k -均值聚类
Pub Date : 2024-12-20 DOI: 10.1109/TPAMI.2024.3520708
Miin-Shen Yang;Kristina P. Sinaga
The increasing effect of Internet of Things (IoT) unlocks the massive volume of the availability of Big Data in many fields. Generally, these Big Data may be in a non-independently and identically distributed fashion (non-IID). In this paper, we have contributions in such a way enable multi-view k-means (MVKM) clustering to maintain the privacy of each database by allowing MVKM to be operated on the local principle of clients’ multi-view data. This work integrates the exponential distance to transform the weighted Euclidean distance on MVKM so that it can make full use of development in federated learning via the MVKM clustering algorithm. The proposed algorithm, called a federated MVKM (Fed-MVKM), can provide a whole new level adding a lot of new ideas to produce a much better output. The proposed Fed-MVKM is highly suitable for clustering large data sets. To demonstrate its efficient and applicable, we implement a synthetic and six real multi-view data sets and then perform Federated Peter-Clark in Huang et al. 2023 for causal inference setting to split the data instances over multiple clients, efficiently. The results show that shared-models based local cluster centers with data-driven in the federated environment can generate a satisfying final pattern of one multi-view data that simultaneously improve the clustering performance of (non-federated) MVKM clustering algorithms.
{"title":"Federated Multi-View K-Means Clustering","authors":"Miin-Shen Yang;Kristina P. Sinaga","doi":"10.1109/TPAMI.2024.3520708","DOIUrl":"10.1109/TPAMI.2024.3520708","url":null,"abstract":"The increasing effect of Internet of Things (IoT) unlocks the massive volume of the availability of Big Data in many fields. Generally, these Big Data may be in a non-independently and identically distributed fashion (non-IID). In this paper, we have contributions in such a way enable multi-view k-means (MVKM) clustering to maintain the privacy of each database by allowing MVKM to be operated on the local principle of clients’ multi-view data. This work integrates the exponential distance to transform the weighted Euclidean distance on MVKM so that it can make full use of development in federated learning via the MVKM clustering algorithm. The proposed algorithm, called a federated MVKM (Fed-MVKM), can provide a whole new level adding a lot of new ideas to produce a much better output. The proposed Fed-MVKM is highly suitable for clustering large data sets. To demonstrate its efficient and applicable, we implement a synthetic and six real multi-view data sets and then perform Federated Peter-Clark in Huang et al. 2023 for causal inference setting to split the data instances over multiple clients, efficiently. The results show that shared-models based local cluster centers with data-driven in the federated environment can generate a satisfying final pattern of one multi-view data that simultaneously improve the clustering performance of (non-federated) MVKM clustering algorithms.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2446-2459"},"PeriodicalIF":0.0,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142867130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Demystify Transformers & Convolutions in Modern Image Deep Networks 揭示现代图像深度网络中的变形和卷积
Pub Date : 2024-12-20 DOI: 10.1109/TPAMI.2024.3520508
Xiaowei Hu;Min Shi;Weiyun Wang;Sitong Wu;Linjie Xing;Wenhai Wang;Xizhou Zhou;Lewei Lu;Jie Zhou;Xiaogang Wang;Yu Qiao;Jifeng Dai
Vision transformers have gained popularity recently, leading to the development of new vision backbones with improved features and consistent performance gains. However, these advancements are not solely attributable to novel feature transformation designs; certain benefits also arise from advanced network-level and block-level architectures. This paper aims to identify the real gains of popular convolution and attention operators through a detailed study. We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach, known as the “spatial token mixer” (STM). To facilitate an impartial comparison, we introduce a unified architecture to neutralize the impact of divergent network-level and block-level designs. Subsequently, various STMs are integrated into this unified framework for comprehensive comparative analysis. Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs, but performance differences persist among different STMs. Our detailed analysis also reveals various findings about different STMs, including effective receptive fields, invariance, and adversarial robustness tests.
{"title":"Demystify Transformers & Convolutions in Modern Image Deep Networks","authors":"Xiaowei Hu;Min Shi;Weiyun Wang;Sitong Wu;Linjie Xing;Wenhai Wang;Xizhou Zhou;Lewei Lu;Jie Zhou;Xiaogang Wang;Yu Qiao;Jifeng Dai","doi":"10.1109/TPAMI.2024.3520508","DOIUrl":"10.1109/TPAMI.2024.3520508","url":null,"abstract":"Vision transformers have gained popularity recently, leading to the development of new vision backbones with improved features and consistent performance gains. However, these advancements are not solely attributable to novel feature transformation designs; certain benefits also arise from advanced network-level and block-level architectures. This paper aims to identify the real gains of popular convolution and attention operators through a detailed study. We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach, known as the “spatial token mixer” (STM). To facilitate an impartial comparison, we introduce a unified architecture to neutralize the impact of divergent network-level and block-level designs. Subsequently, various STMs are integrated into this unified framework for comprehensive comparative analysis. Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs, but performance differences persist among different STMs. Our detailed analysis also reveals various findings about different STMs, including effective receptive fields, invariance, and adversarial robustness tests.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2416-2428"},"PeriodicalIF":0.0,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142867131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Practically Unbiased Pairwise Loss for Recommendation With Implicit Feedback 隐式反馈推荐的实际无偏成对损失
Pub Date : 2024-12-19 DOI: 10.1109/TPAMI.2024.3519711
Tianwei Cao;Qianqian Xu;Zhiyong Yang;Zhanyu Ma;Qingming Huang
Recommender systems have been widely employed on various online platforms to improve user experience. In these systems, recommendation models are often learned from the users’ historical behaviors that are automatically collected. Notably, recommender systems differ slightly from ordinary supervised learning tasks. In recommender systems, there is an exposure mechanism that decides which items could be presented to each specific user, which breaks the i.i.d assumption of supervised learning and brings biases into the recommendation models. In this paper, we focus on unbiased ranking loss weighted by inversed propensity scores (IPS), which are widely used in recommendations with implicit feedback labels. More specifically, we first highlight the fact that there is a gap between theory and practice in IPS-weighted unbiased loss. The existing pairwise loss could be theoretically unbiased by adopting an IPS weighting scheme. Unfortunately, the propensity scores are hard to estimate due to the inaccessibility of each user-item pair's true exposure status. In practical scenarios, we can only approximate the propensity scores. In this way, the theoretically unbiased loss would be still practically biased. To solve this problem, we first construct a theoretical framework to obtain a generalization upper bound of the current theoretically unbiased loss. The bound illustrates that we can ensure the theoretically unbiased loss's generalization ability if we lower its implementation loss and practical bias at the same time. To that aim, we suggest treating feedback label $Y_{ui}$ as a noisy proxy for exposure result $O_{ui}$ for each user-item pair $(u, i)$. Here we assume the noise rate meets the condition that $hat{P}(O_{ui}=1, Y_{ui}ne O_{ui}) < 1/2$. According to our analysis, this is a mild assumption that can be satisfied by many real-world applications. Based on this, we could train an accurate propensity model directly by leveraging a noise-resistant loss function. Then we could construct a practically unbiased recommendation model weighted by precise propensity scores. Lastly, experimental findings on public datasets demonstrate our suggested method's effectiveness.
{"title":"Practically Unbiased Pairwise Loss for Recommendation With Implicit Feedback","authors":"Tianwei Cao;Qianqian Xu;Zhiyong Yang;Zhanyu Ma;Qingming Huang","doi":"10.1109/TPAMI.2024.3519711","DOIUrl":"10.1109/TPAMI.2024.3519711","url":null,"abstract":"Recommender systems have been widely employed on various online platforms to improve user experience. In these systems, recommendation models are often learned from the users’ historical behaviors that are automatically collected. Notably, recommender systems differ slightly from ordinary supervised learning tasks. In recommender systems, there is an exposure mechanism that decides which items could be presented to each specific user, which breaks the i.i.d assumption of supervised learning and brings biases into the recommendation models. In this paper, we focus on unbiased ranking loss weighted by inversed propensity scores (IPS), which are widely used in recommendations with implicit feedback labels. More specifically, we first highlight the fact that there is a gap between theory and practice in IPS-weighted unbiased loss. The existing pairwise loss could be theoretically unbiased by adopting an IPS weighting scheme. Unfortunately, the propensity scores are hard to estimate due to the inaccessibility of each user-item pair's true exposure status. In practical scenarios, we can only approximate the propensity scores. In this way, the theoretically unbiased loss would be still practically biased. To solve this problem, we first construct a theoretical framework to obtain a generalization upper bound of the current theoretically unbiased loss. The bound illustrates that we can ensure the theoretically unbiased loss's generalization ability if we lower its implementation loss and practical bias at the same time. To that aim, we suggest treating feedback label <inline-formula><tex-math>$Y_{ui}$</tex-math></inline-formula> as a noisy proxy for exposure result <inline-formula><tex-math>$O_{ui}$</tex-math></inline-formula> for each user-item pair <inline-formula><tex-math>$(u, i)$</tex-math></inline-formula>. Here we assume the noise rate meets the condition that <inline-formula><tex-math>$hat{P}(O_{ui}=1, Y_{ui}ne O_{ui}) &lt; 1/2$</tex-math></inline-formula>. According to our analysis, this is a mild assumption that can be satisfied by many real-world applications. Based on this, we could train an accurate propensity model directly by leveraging a noise-resistant loss function. Then we could construct a practically unbiased recommendation model weighted by precise propensity scores. Lastly, experimental findings on public datasets demonstrate our suggested method's effectiveness.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2460-2474"},"PeriodicalIF":0.0,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142858472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Quantum Gated Recurrent Neural Networks 量子门控循环神经网络
Pub Date : 2024-12-18 DOI: 10.1109/TPAMI.2024.3519605
Yanan Li;Zhimin Wang;Ruipeng Xing;Changheng Shao;Shangshang Shi;Jiaxin Li;Guoqiang Zhong;Yongjian Gu
The exploration of quantum advantages with Quantum Neural Networks (QNNs) is an exciting endeavor. Recurrent neural networks, the widely used framework in deep learning, suffer from the gradient vanishing and exploding problem, which limits their ability to learn long-term dependencies. To address this challenge, in this work, we develop the sequential model of Quantum Gated Recurrent Neural Networks (QGRNNs). This model naturally integrates the gating mechanism into the framework of the variational ansatz circuit of QNNs, enabling efficient execution on near-term quantum devices. We present rigorous proof that QGRNNs can preserve the gradient norm of long-term interactions throughout the recurrent network, enabling efficient learning of long-term dependencies. Meanwhile, the architectural features of QGRNNs can effectively mitigate the barren plateau phenomenon. The effectiveness of QGRNNs in sequential learning is convincingly demonstrated through various typical tasks, including solving the adding problem, learning gene regulatory networks, and predicting stock prices. The hardware-efficient architecture and superior performance of our QGRNNs indicate their promising potential for finding quantum advantageous applications in the near term.
{"title":"Quantum Gated Recurrent Neural Networks","authors":"Yanan Li;Zhimin Wang;Ruipeng Xing;Changheng Shao;Shangshang Shi;Jiaxin Li;Guoqiang Zhong;Yongjian Gu","doi":"10.1109/TPAMI.2024.3519605","DOIUrl":"10.1109/TPAMI.2024.3519605","url":null,"abstract":"The exploration of quantum advantages with Quantum Neural Networks (QNNs) is an exciting endeavor. Recurrent neural networks, the widely used framework in deep learning, suffer from the gradient vanishing and exploding problem, which limits their ability to learn long-term dependencies. To address this challenge, in this work, we develop the sequential model of Quantum Gated Recurrent Neural Networks (QGRNNs). This model naturally integrates the gating mechanism into the framework of the variational ansatz circuit of QNNs, enabling efficient execution on near-term quantum devices. We present rigorous proof that QGRNNs can preserve the gradient norm of long-term interactions throughout the recurrent network, enabling efficient learning of long-term dependencies. Meanwhile, the architectural features of QGRNNs can effectively mitigate the barren plateau phenomenon. The effectiveness of QGRNNs in sequential learning is convincingly demonstrated through various typical tasks, including solving the adding problem, learning gene regulatory networks, and predicting stock prices. The hardware-efficient architecture and superior performance of our QGRNNs indicate their promising potential for finding quantum advantageous applications in the near term.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2493-2504"},"PeriodicalIF":0.0,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142848871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
STDatav2: Accessing Efficient Black-Box Stealing for Adversarial Attacks STDatav2:为对抗性攻击获取高效黑盒窃取技术
Pub Date : 2024-12-18 DOI: 10.1109/TPAMI.2024.3519803
Xuxiang Sun;Gong Cheng;Hongda Li;Chunbo Lang;Junwei Han
On account of the extreme settings, stealing the black-box model without its training data is difficult in practice. On this topic, along the lines of data diversity, this paper substantially makes the following improvements based on our conference version (dubbed STDatav1, short for Surrogate Training Data). First, to mitigate the undesirable impacts of the potential mode collapse while training the generator, we propose the joint-data optimization scheme, which utilizes both the synthesized data and the proxy data to optimize the surrogate model. Second, we propose the self-conditional data synthesis framework, an interesting effort that builds the pseudo-class mapping framework via grouping class information extraction to hold the class-specific constraints while holding the diversity. Within this new framework, we inherit and integrate the class-specific constraints of STDatav1 and design a dual cross-entropy loss to fit this new framework. Finally, to facilitate comprehensive evaluations, we perform experiments on four commonly adopted datasets, and a total of eight kinds of models are employed. These assessments witness the considerable performance gains compared to our early work and demonstrate the competitive ability and promising potential of our approach.
{"title":"STDatav2: Accessing Efficient Black-Box Stealing for Adversarial Attacks","authors":"Xuxiang Sun;Gong Cheng;Hongda Li;Chunbo Lang;Junwei Han","doi":"10.1109/TPAMI.2024.3519803","DOIUrl":"10.1109/TPAMI.2024.3519803","url":null,"abstract":"On account of the extreme settings, stealing the black-box model without its training data is difficult in practice. On this topic, along the lines of data diversity, this paper substantially makes the following improvements based on our conference version (dubbed STDatav1, short for Surrogate Training Data). First, to mitigate the undesirable impacts of the potential mode collapse while training the generator, we propose the joint-data optimization scheme, which utilizes both the synthesized data and the proxy data to optimize the surrogate model. Second, we propose the self-conditional data synthesis framework, an interesting effort that builds the pseudo-class mapping framework via grouping class information extraction to hold the class-specific constraints while holding the diversity. Within this new framework, we inherit and integrate the class-specific constraints of STDatav1 and design a dual cross-entropy loss to fit this new framework. Finally, to facilitate comprehensive evaluations, we perform experiments on four commonly adopted datasets, and a total of eight kinds of models are employed. These assessments witness the considerable performance gains compared to our early work and demonstrate the competitive ability and promising potential of our approach.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2429-2445"},"PeriodicalIF":0.0,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142849396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Self-Supervised Anomaly Detection With Neural Transformations 基于神经变换的自监督异常检测
Pub Date : 2024-12-18 DOI: 10.1109/TPAMI.2024.3519543
Chen Qiu;Marius Kloft;Stephan Mandt;Maja Rudolph
Data augmentation plays a critical role in self-supervised learning, including anomaly detection. While hand-crafted transformations such as image rotations can achieve impressive performance on image data, effective transformations of non-image data are lacking. In this work, we study learning such transformations for end-to-end anomaly detection on arbitrary data. We find that a contrastive loss–which encourages learning diverse data transformations while preserving the relevant semantic content of the data–is more suitable than previously proposed losses for transformation learning, a fact that we prove theoretically and empirically. We demonstrate that anomaly detection using neural transformation learning can achieve state-of-the-art results for time series data, tabular data, text data and graph data. Furthermore, our approach can make image anomaly detection more interpretable by learning transformations at different levels of abstraction.
{"title":"Self-Supervised Anomaly Detection With Neural Transformations","authors":"Chen Qiu;Marius Kloft;Stephan Mandt;Maja Rudolph","doi":"10.1109/TPAMI.2024.3519543","DOIUrl":"10.1109/TPAMI.2024.3519543","url":null,"abstract":"Data augmentation plays a critical role in self-supervised learning, including anomaly detection. While hand-crafted transformations such as image rotations can achieve impressive performance on image data, effective transformations of non-image data are lacking. In this work, we study <italic>learning</i> such transformations for end-to-end anomaly detection on arbitrary data. We find that a contrastive loss–which encourages learning diverse data transformations while preserving the relevant semantic content of the data–is more suitable than previously proposed losses for transformation learning, a fact that we prove theoretically and empirically. We demonstrate that anomaly detection using neural transformation learning can achieve state-of-the-art results for time series data, tabular data, text data and graph data. Furthermore, our approach can make image anomaly detection more interpretable by learning transformations at different levels of abstraction.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 3","pages":"2170-2185"},"PeriodicalIF":0.0,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10806806","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142848870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
期刊
IEEE transactions on pattern analysis and machine intelligence
全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1