Condensed-gradient boosting
Pub Date: 2024-07-23 | DOI: 10.1007/s13042-024-02279-0
Seyedsaman Emami, Gonzalo Martínez-Muñoz
This paper presents a computationally efficient variant of Gradient Boosting (GB) for multi-class classification and multi-output regression tasks. Standard GB uses a 1-vs-all strategy for classification tasks with more than two classes, which requires training one tree per class at each iteration. In this work, we propose the use of multi-output regressors as base models to handle the multi-class problem as a single task. In addition, the proposed modification allows the model to learn multi-output regression problems. An extensive comparison with other multi-output-based Gradient Boosting methods is carried out in terms of generalization and computational efficiency. The proposed method showed the best trade-off between generalization ability and training and prediction speeds. Furthermore, an analysis of space and time complexity was undertaken.
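The central idea, fitting a single multi-output regression tree per boosting iteration on the residuals of all K classes at once instead of K separate trees, can be sketched as follows. This is an illustrative reimplementation, not the authors' code: the softmax cross-entropy residual update, learning rate, and tree depth are common defaults assumed here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def softmax(F):
    e = np.exp(F - F.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fit_condensed_gb(X, y, n_classes, n_iter=100, lr=0.1, max_depth=3):
    """Gradient boosting for multi-class classification with ONE
    multi-output tree per iteration (instead of one tree per class)."""
    Y = np.eye(n_classes)[y]               # one-hot targets, shape (n, K)
    F = np.zeros((X.shape[0], n_classes))  # raw scores
    trees = []
    for _ in range(n_iter):
        residuals = Y - softmax(F)         # negative gradient of the cross-entropy
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)             # a single tree with K outputs per leaf
        F += lr * tree.predict(X)
        trees.append(tree)
    return trees

def predict_condensed_gb(trees, X, n_classes, lr=0.1):
    F = np.zeros((X.shape[0], n_classes))
    for tree in trees:
        F += lr * tree.predict(X)
    return F.argmax(axis=1)
```

Compared with the 1-vs-all scheme, the number of trees grown per iteration drops from K to one, which is where the reported training and prediction speed-ups come from.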
{"title":"Condensed-gradient boosting","authors":"Seyedsaman Emami, Gonzalo Martínez-Muñoz","doi":"10.1007/s13042-024-02279-0","DOIUrl":"https://doi.org/10.1007/s13042-024-02279-0","url":null,"abstract":"<p>This paper presents a computationally efficient variant of Gradient Boosting (GB) for multi-class classification and multi-output regression tasks. Standard GB uses a 1-vs-all strategy for classification tasks with more than two classes. This strategy entails that one tree per class and iteration has to be trained. In this work, we propose the use of multi-output regressors as base models to handle the multi-class problem as a single task. In addition, the proposed modification allows the model to learn multi-output regression problems. An extensive comparison with other multi-output based Gradient Boosting methods is carried out in terms of generalization and computational efficiency. The proposed method showed the best trade-off between generalization ability and training and prediction speeds. Furthermore, an analysis of space and time complexity was undertaken.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"8 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A dual stream attention network for facial expression recognition in the wild
Pub Date: 2024-07-23 | DOI: 10.1007/s13042-024-02287-0
Hui Tang, Yichang Li, Zhong Jin
Facial Expression Recognition (FER) is crucial for human-computer interaction and has achieved satisfactory results on lab-collected datasets. However, occlusion and head pose variation in the real world make FER extremely challenging due to facial information deficiency. This paper proposes a novel Dual Stream Attention Network (DSAN) for occlusion- and head-pose-robust FER. Specifically, DSAN consists of a Global Feature Element-based Attention Network (GFE-AN) and a Multi-Feature Fusion-based Attention Network (MFF-AN). A sparse attention block and a feature recalibration loss designed in GFE-AN selectively emphasize feature elements meaningful for facial expression and suppress those unrelated to it. In addition, a lightweight local feature attention block is customized in MFF-AN to extract rich semantic information from different representation sub-spaces. DSAN is also designed to minimize computational overhead. Extensive experiments on public benchmarks demonstrate that the proposed DSAN outperforms state-of-the-art methods, reaching 89.70% on RAF-DB, 89.93% on FERPlus, 65.77% on AffectNet-7, and 62.13% on AffectNet-8. Moreover, DSAN has only 11.33M parameters, which is lightweight compared to most recent in-the-wild FER algorithms.
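The element-wise selection performed by GFE-AN can be illustrated with a generic gating block: an MLP predicts a weight per feature element, an L1 penalty keeps the weights sparse, and the feature is recalibrated by those weights. This is a sketch in the spirit of the description above, not the paper's architecture; the layer sizes and the L1 coefficient are assumptions.

```python
import torch
import torch.nn as nn

class SparseElementGate(nn.Module):
    """Element-wise attention: learn a weight per feature element so that
    expression-relevant elements are emphasized and the rest suppressed."""
    def __init__(self, dim, hidden=128, l1_weight=1e-4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim), nn.Sigmoid())
        self.l1_weight = l1_weight

    def forward(self, feat):               # feat: (batch, dim) global feature
        w = self.mlp(feat)                 # per-element weights in (0, 1)
        sparsity_loss = self.l1_weight * w.abs().mean()  # encourages sparse selection
        return feat * w, sparsity_loss     # recalibrated feature + auxiliary loss
```

In training, the auxiliary sparsity term would simply be added to the classification loss.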
{"title":"A dual stream attention network for facial expression recognition in the wild","authors":"Hui Tang, Yichang Li, Zhong Jin","doi":"10.1007/s13042-024-02287-0","DOIUrl":"https://doi.org/10.1007/s13042-024-02287-0","url":null,"abstract":"<p>Facial Expression Recognition (FER) is crucial for human-computer interaction and has achieved satisfactory results on lab-collected datasets. However, occlusion and head pose variation in the real world make FER extremely challenging due to facial information deficiency. This paper proposes a novel Dual Stream Attention Network (DSAN) for occlusion and head pose robust FER. Specifically, DSAN consists of a Global Feature Element-based Attention Network (GFE-AN) and a Multi-Feature Fusion-based Attention Network (MFF-AN). A sparse attention block and a feature recalibration loss designed in GFE-AN selectively emphasize feature elements meaningful for facial expression and suppress those unrelated to facial expression. And a lightweight local feature attention block is customized in MFF-AN to extract rich semantic information from different representation sub-spaces. In addition, DSAN takes into account computation overhead minimization when designing model architecture. Extensive experiments on public benchmarks demonstrate that the proposed DSAN outperforms the state-of-the-art methods with 89.70% on RAF-DB, 89.93% on FERPlus, 65.77% on AffectNet-7, 62.13% on AffectNet-8. Moreover, the parameter size of DSAN is only 11.33M, which is lightweight compared to most of the recent in-the-wild FER algorithms.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"37 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141770211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adversarial attack method based on enhanced spatial momentum
Pub Date: 2024-07-22 | DOI: 10.1007/s13042-024-02290-5
Jun Hu, Guanghao Wei, Shuyin Xia, Guoyin Wang
Deep neural networks (DNNs) have been widely applied in many fields, but they are vulnerable to adversarial examples, which can mislead DNN-based models with imperceptible perturbations. Many adversarial attack methods achieve high success rates against white-box models but usually exhibit poor transferability to black-box models. Momentum iterative gradient-based methods can effectively improve the transferability of adversarial examples. However, the momentum update mechanism of existing methods can produce unstable gradient update directions and lead to poor local optima. In this paper, we propose an enhanced spatial momentum iterative gradient-based adversarial attack method. Specifically, we introduce a spatial-domain momentum accumulation mechanism: instead of only accumulating the gradients of data points on the optimization path, we additionally accumulate the average gradients of multiple sampling points within the neighborhood of each data point. This mechanism fully utilizes the contextual gradient information of different regions within the image to smooth the accumulated gradients and find a more stable gradient update direction, thus escaping from poor local optima. Empirical results on the standard ImageNet dataset demonstrate that our method significantly improves the attack success rate of momentum iterative gradient-based methods and shows excellent attack performance not only against normally trained models but also against adversarially trained and defense models, outperforming state-of-the-art methods.
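A minimal sketch of the spatial-momentum idea: at each iteration, average the gradients of several points sampled in a neighborhood of the current adversarial example before feeding them into the usual momentum accumulation and sign step. The sampling radius, number of samples, and decay factor below are illustrative defaults, not the paper's settings.

```python
import torch

def spatial_momentum_attack(model, loss_fn, x, y, eps=8/255, steps=10,
                            mu=1.0, n_samples=8, radius=8/255):
    """I-FGSM with momentum, where each step's gradient is the average over
    points sampled uniformly in a neighborhood of the current iterate."""
    alpha = eps / steps
    g = torch.zeros_like(x)                # accumulated momentum
    x_adv = x.clone()
    for _ in range(steps):
        grad_sum = torch.zeros_like(x)
        for _ in range(n_samples):
            noise = torch.empty_like(x).uniform_(-radius, radius)
            x_nb = (x_adv + noise).detach().requires_grad_(True)
            loss = loss_fn(model(x_nb), y)
            grad_sum += torch.autograd.grad(loss, x_nb)[0]
        grad = grad_sum / n_samples        # spatially averaged gradient
        g = mu * g + grad / (grad.abs().mean() + 1e-12)  # normalize by mean |grad|, then accumulate
        x_adv = x_adv + alpha * g.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv.detach()
```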
{"title":"Adversarial attack method based on enhanced spatial momentum","authors":"Jun Hu, Guanghao Wei, Shuyin Xia, Guoyin Wang","doi":"10.1007/s13042-024-02290-5","DOIUrl":"https://doi.org/10.1007/s13042-024-02290-5","url":null,"abstract":"<p>Deep neural networks have been widely applied in many fields, but it is found that they are vulnerable to adversarial examples, which can mislead the DNN-based models with imperceptible perturbations. Many adversarial attack methods can achieve great success rates when attacking white-box models, but they usually exhibit poor transferability when attacking black-box models. Momentum iterative gradient-based methods can effectively improve the transferability of adversarial examples. Still, the momentum update mechanism of existing methods may lead to a problem of unstable gradient update direction and result in poor local optima. In this paper, we propose an enhanced spatial momentum iterative gradient-based adversarial attack method. Specifically, we introduce the spatial domain momentum accumulation mechanism. Instead of only accumulating the gradients of data points on the optimization path in the gradient update process, we additionally accumulate the average gradients of multiple sampling points within the neighborhood of data points. This mechanism fully utilizes the contextual gradient information of different regions within the image to smooth the accumulated gradients and find a more stable gradient update direction, thus escaping from poor local optima. Empirical results on the standard ImageNet dataset demonstrate that our method can significantly improve the attack success rate of momentum iterative gradient-based methods and shows excellent attack performance not only against normally trained models but also against adversarial training and defense models, outperforming the state-of-the-art methods.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"81 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep feature dendrite with weak mapping for small-sample hyperspectral image classification
Pub Date: 2024-07-22 | DOI: 10.1007/s13042-024-02272-7
Gang Liu, Jiaying Xu, Shanshan Zhao, Rui Zhang, Xiaoyuan Li, Shanshan Guo, Yajing Pang
Hyperspectral image (HSI) classification faces the challenges of large, complex data and costly training labels. Existing methods for small-sample HSI classification may not generalize well because they pursue powerful feature extraction and nonlinear mapping abilities. We argue that small samples need deep feature extraction but weak nonlinear mapping to achieve generalization. Based on this, we propose a Deep Feature Dendrite (DFD) method, which consists of two parts: a deep feature extraction part that uses a convolution-tokenization-attention module to effectively extract spatial-spectral features, and a controllable mapping part that uses a residual dendrite network to perform weak mapping and enhance generalization ability. Experiments on four standard datasets show that our method achieves higher classification accuracy than existing methods. Significance: this paper pioneers and verifies the idea of weak mapping for generalization in HSI classification. DFD code is available at https://github.com/liugang1234567/DFD
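As an illustration of a "weak mapping" head, the sketch below assumes the dendrite layers follow the Hadamard-product form of the authors' earlier Dendrite Net work, with a residual connection added; the layer count and output size are hypothetical, and this is not the DFD implementation.

```python
import torch
import torch.nn as nn

class ResidualDendrite(nn.Module):
    """A low-capacity mapping head: each layer is a linear map followed by a
    Hadamard product with the input feature, plus a residual connection.
    There are no conventional nonlinear activations, so the mapping stays
    deliberately weak."""
    def __init__(self, dim, n_layers=2, n_classes=16):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim, bias=False)
                                     for _ in range(n_layers)])
        self.out = nn.Linear(dim, n_classes)

    def forward(self, x):                  # x: (batch, dim) deep spatial-spectral feature
        a = x
        for layer in self.layers:
            a = a + layer(a) * x           # residual dendrite update: A + (W A) ⊙ X
        return self.out(a)
```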
{"title":"Deep feature dendrite with weak mapping for small-sample hyperspectral image classification","authors":"Gang Liu, Jiaying Xu, Shanshan Zhao, Rui Zhang, Xiaoyuan Li, Shanshan Guo, Yajing Pang","doi":"10.1007/s13042-024-02272-7","DOIUrl":"https://doi.org/10.1007/s13042-024-02272-7","url":null,"abstract":"<p>Hyperspectral image (HSI) classification faces the challenges of large and complex data and costly training labels. Existing methods for small-sample HSI classification may not achieve good generalization because they pursue powerful feature extraction and nonlinear mapping abilities. We argue that small samples need deep feature extraction but weak nonlinear mapping to achieve generalization. Based on this, we propose a Deep Feature Dendrite (DFD) method, which consists of two parts: a deep feature extraction part that uses a convolution-tokenization-attention module to effectively extract spatial-spectral features, and a controllable mapping part that uses a residual dendrite network to perform weak mapping and enhance generalization ability. We conducted experiments on four standard datasets, and the results show that our method has higher classification accuracy than other existing methods. Significance: This paper pioneers and verifies weak mapping and generalization for HSI classification (new ideas). DFD code is available at https://github.com/liugang1234567/DFD</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"6 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SPERM: sequential pairwise embedding recommendation with MI-FGSM
Pub Date: 2024-07-19 | DOI: 10.1007/s13042-024-02288-z
Agyemang Paul, Yuxuan Wan, Boyu Chen, Zhefu Wu
Visual recommendation systems have shown remarkable performance by leveraging consumer feedback and the visual attributes of products. However, recent concerns have arisen regarding the decline in recommendation quality when these systems are subjected to attacks that compromise the model parameters. While the fast gradient sign method (FGSM) and iterative FGSM (I-FGSM) are well-studied attack strategies, the momentum iterative FGSM (MI-FGSM), known for its superiority in the computer vision (CV) domain, has been overlooked, leaving visual recommender systems potentially exposed to significant vulnerabilities. Adversarial training, a regularization technique designed to withstand such attacks, is a promising way to bolster model resilience. In this research, we introduce MI-FGSM for visual recommendation. We propose the Sequential Pairwise Embedding Recommender with MI-FGSM (SPERM), a model that incorporates visual, temporal, and sequential information for visual recommendation through adversarial training. Specifically, we employ higher-order Markov chains to capture consumers’ sequential behaviors and utilize visual pairwise ranking to discern their visual preferences. To optimize the SPERM model, we employ a learning method based on AdaGrad. Moreover, we fortify the SPERM approach with adversarial training, whose primary objective is to train the model to withstand adversarial inputs introduced by MI-FGSM. Finally, we evaluate the effectiveness of our approach on three Amazon datasets, comparing it with existing visual and adversarial recommendation algorithms. Our results demonstrate the efficacy of the proposed SPERM model in addressing adversarial attacks while enhancing visual recommendation performance.
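A hedged sketch of how MI-FGSM-based adversarial training can be combined with a pairwise ranking loss: the perturbation on the positive item's visual feature is built with momentum-accumulated gradient steps, and the perturbed loss is added as a regularizer. The names score_fn, user, pos_feat, neg_feat and all hyperparameters are placeholders, not SPERM's actual interfaces.

```python
import torch
import torch.nn.functional as F

def adv_pairwise_loss(score_fn, user, pos_feat, neg_feat,
                      eps=0.05, steps=3, mu=1.0, reg=1.0):
    """Pairwise ranking loss plus an adversarial term, where the perturbation
    on the positive item's visual feature is built with MI-FGSM."""
    def bpr(p_feat):
        return -F.logsigmoid(score_fn(user, p_feat) - score_fn(user, neg_feat)).mean()

    clean_loss = bpr(pos_feat)

    # MI-FGSM perturbation of the positive item's visual feature
    delta = torch.zeros_like(pos_feat)
    g = torch.zeros_like(pos_feat)
    for _ in range(steps):
        d = delta.clone().requires_grad_(True)
        grad = torch.autograd.grad(bpr(pos_feat + d), d)[0]
        g = mu * g + grad / (grad.abs().mean() + 1e-12)  # momentum accumulation
        delta = (delta + (eps / steps) * g.sign()).clamp(-eps, eps)

    adv_loss = bpr(pos_feat + delta.detach())
    return clean_loss + reg * adv_loss
```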
{"title":"SPERM: sequential pairwise embedding recommendation with MI-FGSM","authors":"Agyemang Paul, Yuxuan Wan, Boyu Chen, Zhefu Wu","doi":"10.1007/s13042-024-02288-z","DOIUrl":"https://doi.org/10.1007/s13042-024-02288-z","url":null,"abstract":"<p>Visual recommendation systems have shown remarkable performance by leveraging consumer feedback and the visual attributes of products. However, recent concerns have arisen regarding the decline in recommendation quality when these systems are subjected to attacks that compromise the model parameters. While the fast gradient sign method (FGSM) and iterative FGSM (I-FGSM) are well-studied attack strategies, the momentum iterative FGSM (MI-FGSM), known for its superiority in the computer vision (CV) domain, has been overlooked. This oversight raises the possibility that visual recommender systems may be vulnerable to MI-FGSM, leading to significant vulnerabilities. Adversarial training, a regularization technique designed to withstand MI-FGSM attacks, could be a promising solution to bolster model resilience. In this research, we introduce MI-FGSM for visual recommendation. We propose the Sequential Pairwise Embedding Recommender with MI-FGSM (SPERM), a model that incorporates visual, temporal, and sequential information for visual recommendations through adversarial training. Specifically, we employ higher-order Markov chains to capture consumers’ sequential behaviors and utilize visual pairwise ranking to discern their visual preferences. To optimize the SPERM model, we employ a learning method based on AdaGrad. Moreover, we fortify the SPERM approach with adversarial training, where the primary objective is to train the model to withstand adversarial inputs introduced by MI-FGSM. Finally, we evaluate the effectiveness of our approach by conducting experiments on three Amazon datasets, comparing it with existing visual and adversarial recommendation algorithms. Our results demonstrate the efficacy of the proposed SPERM model in addressing adversarial attacks while enhancing visual recommendation performance.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"146 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One-step graph-based multi-view clustering via specific and unified nonnegative embeddings
Pub Date: 2024-07-17 | DOI: 10.1007/s13042-024-02280-7
Sally El Hajjar, Fahed Abdallah, Hichem Omrani, Alain Khaled Chaaban, Muhammad Arif, Ryan Alturki, Mohammed J. AlGhamdi
Multi-view clustering techniques, especially spectral clustering methods, are quite popular today in machine learning and data science owing to the ever-growing diversity of data types and information sources. As the landscape of data continues to evolve, the need for advanced clustering approaches becomes increasingly crucial. In this context, this study addresses the challenges posed by traditional multi-view spectral clustering techniques, offering a novel approach that simultaneously learns nonnegative embedding matrices and spectral embeddings. Moreover, the cluster label matrix, also known as the nonnegative embedding matrix, is split into two types: (1) the shared nonnegative embedding matrix, which reflects the common cluster structure, and (2) the individual nonnegative embedding matrices, which represent the unique cluster structure of each view. The proposed strategy allows us to effectively deal with noise and outliers in multiple views. The simultaneous optimization of the proposed model is solved efficiently with an alternating minimization scheme. The proposed method exhibits significant improvements, with an average accuracy enhancement of 4% over existing models, as demonstrated through extensive experiments on various real datasets. This highlights the efficacy of the approach in achieving superior clustering results.
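The split of the cluster-label matrix can be written schematically as follows. This is an illustrative objective consistent with the description above, not the paper's exact formulation; the graph terms, constraints, and weighting are assumptions.

```latex
\min_{\substack{H \ge 0,\; H^{(v)} \ge 0 \\ F^{(v)\top} F^{(v)} = I}}
\sum_{v=1}^{V} \Big[ \operatorname{tr}\!\big(F^{(v)\top} L^{(v)} F^{(v)}\big)
  + \lambda \big\| \big(H + H^{(v)}\big) - F^{(v)} R^{(v)} \big\|_F^2 \Big]
```

Here L^(v) is the graph Laplacian of view v, F^(v) its spectral embedding, H the shared nonnegative embedding capturing the common cluster structure, H^(v) the view-specific nonnegative embedding absorbing view-level deviations (and hence noise and outliers), R^(v) an alignment matrix, and λ a trade-off weight; each block would be updated in turn by the alternating minimization scheme mentioned above.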
{"title":"One-step graph-based multi-view clustering via specific and unified nonnegative embeddings","authors":"Sally El Hajjar, Fahed Abdallah, Hichem Omrani, Alain Khaled Chaaban, Muhammad Arif, Ryan Alturki, Mohammed J. AlGhamdi","doi":"10.1007/s13042-024-02280-7","DOIUrl":"https://doi.org/10.1007/s13042-024-02280-7","url":null,"abstract":"<p>Multi-view clustering techniques, especially spectral clustering methods, are quite popular today in the fields of machine learning and data science owing to the ever-growing diversity in data types and information sources. As the landscape of data continues to evolve, the need for advanced clustering approaches becomes increasingly crucial. In this context, the research in this study addresses the challenges posed by traditional multi-view spectral clustering techniques, offering a novel approach that simultaneously learns nonnegative embedding matrices and spectral embeddings. Moreover, the cluster label matrix, also known as the nonnegative embedding matrix, is split into two different types of matrices: (1) the shared nonnegative embedding matrix, which reflects the common cluster structure, (2) the individual nonnegative embedding matrices, which represent the unique cluster structure of each view. The proposed strategy allows us to effectively deal with noise and outliers in multiple views. The simultaneous optimization of the proposed model is solved efficiently with an alternating minimization scheme. The proposed method exhibits significant improvements, with an average accuracy enhancement of 4% over existing models, as demonstrated through extensive experiments on various real datasets. This highlights the efficacy of the approach in achieving superior clustering results.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"21 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141738484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Industrial product surface defect detection via the fast denoising diffusion implicit model
Pub Date: 2024-07-11 | DOI: 10.1007/s13042-024-02213-4
Yue Wang, Yong Yang, Mingsheng Liu, Xianghong Tang, Haibin Wang, Zhifeng Hao, Ze Shi, Gang Wang, Botao Jiang, Chunyang Liu
In the age of intelligent manufacturing, surface defect detection plays a pivotal role in the automated quality control of industrial products, constituting a fundamental aspect of smart factory evolution. Given the diverse sizes and feature scales of surface defects on industrial products and the difficulty of procuring high-quality training samples, achieving real-time, high-quality surface defect detection with artificial intelligence remains a formidable challenge. To address this, we introduce a defect detection approach grounded in the fast denoising diffusion implicit model. Firstly, we propose a noise predictor influenced by the spectral-radius feature tensor of images. This enhancement augments the ability of the generative model to capture nuanced details in non-defective areas, thus overcoming limitations in model versatility and detail portrayal. Furthermore, we present a loss-function constraint based on the Perron root, designed to act within the representational space and ensure the denoising model consistently produces high-quality samples. Lastly, comprehensive experiments on the Magnetic Tile and Market-PCB datasets, benchmarked against nine representative models, underscore the detection efficacy of our proposed approach.
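For context, a generic reconstruction-based use of a denoising diffusion model for defect localization looks like the sketch below: partially noise the test image, denoise it with a model trained on defect-free samples, and take the residual as the anomaly map. This illustrates the paradigm only; it does not implement the paper's noise predictor or Perron-root constraint, and ddim_denoise and alpha_bar are assumed placeholders.

```python
import torch

@torch.no_grad()
def defect_map(ddim_denoise, x, t_start=250):
    """Reconstruction-based defect localization with a denoising diffusion model
    trained on defect-free images.

    ddim_denoise(x_t, t) is assumed to run the reverse DDIM process from step t
    back to a clean image estimate.
    """
    noise = torch.randn_like(x)
    alpha_bar = 0.5                        # placeholder cumulative alpha at t_start
    x_t = (alpha_bar ** 0.5) * x + ((1 - alpha_bar) ** 0.5) * noise  # forward diffusion
    x_rec = ddim_denoise(x_t, t_start)     # model reconstructs a defect-free version
    return (x - x_rec).abs().mean(dim=1)   # per-pixel anomaly map
```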
{"title":"Industrial product surface defect detection via the fast denoising diffusion implicit model","authors":"Yue Wang, Yong Yang, Mingsheng Liu, Xianghong Tang, Haibin Wang, Zhifeng Hao, Ze Shi, Gang Wang, Botao Jiang, Chunyang Liu","doi":"10.1007/s13042-024-02213-4","DOIUrl":"https://doi.org/10.1007/s13042-024-02213-4","url":null,"abstract":"<p>In the age of intelligent manufacturing, surface defect detection plays a pivotal role in the automated quality control of industrial products, constituting a fundamental aspect of smart factory evolution. Considering the diverse sizes and feature scales of surface defects on industrial products and the difficulty in procuring high-quality training samples, the achievement of real-time and high-quality surface defect detection through artificial intelligence technologies remains a formidable challenge. To address this, we introduce a defect detection approach grounded in the Fast Denoising Probabilistic Implicit Models. Firstly, we propose a noise predictor influenced by the spectral radius feature tensor of images. This enhancement augments the ability of generative model to capture nuanced details in non-defective areas, thus overcoming limitations in model versatility and detail portrayal. Furthermore, we present a loss function constraint based on the Perron-root. This is designed to incorporate the constraint within the representational space, ensuring the denoising model consistently produces high-quality samples. Lastly, comprehensive experiments on both the Magnetic Tile and Market-PCB datasets, benchmarked against nine most representative models, underscore the exemplary detection efficacy of our proposed approach.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"25 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141587467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint features-guided linear transformer and CNN for efficient image super-resolution
Pub Date: 2024-07-09 | DOI: 10.1007/s13042-024-02277-2
Bufan Wang, Yongjun Zhang, Wei Long, Zhongwei Cui
Integrating convolutional neural networks (CNNs) and transformers has notably improved lightweight single image super-resolution (SISR). However, existing methods lack the capability to exploit multi-level contextual information, and transformer computations inherently add quadratic complexity. To address these issues, we propose a Joint features-Guided Linear Transformer and CNN Network (JGLTN) for efficient SISR, constructed by cascading modules composed of CNN layers and linear transformer layers. Specifically, in the CNN layer, our approach employs an inter-scale feature integration module (IFIM) to extract critical latent information across scales. Then, in the linear transformer layer, we design a joint feature-guided linear attention (JGLA) that jointly considers adjacent and extended regional features, dynamically assigning weights to convolutional kernels for contextual feature selection. This process gathers multi-level contextual information, which is used to guide linear attention for effective information interaction. Moreover, we redesign the computation of feature similarity within the self-attention, reducing its complexity to linear. Extensive experiments show that our method outperforms state-of-the-art models while balancing performance and computational cost.
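The complexity reduction can be seen in a standard kernelized linear attention, where the softmax similarity is replaced by a feature map so the N×N attention matrix is never formed. This is the generic formulation (following Katharopoulos et al.), not necessarily the exact JGLA similarity.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention in O(N*d^2) instead of O(N^2*d): the key-value
    summary phi(K)^T V is computed once, so no N x N matrix is formed."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))   # elu(x) + 1 feature map
    Qf, Kf = phi(Q), phi(K)                # (N, d) each
    KV = Kf.T @ V                          # (d, d) summary of keys and values
    Z = Qf @ Kf.sum(axis=0) + eps          # (N,) normalization term
    return (Qf @ KV) / Z[:, None]

# Usage: N tokens of dimension d
N, d = 4096, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))
out = linear_attention(Q, K, V)            # shape (N, d)
```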
{"title":"Joint features-guided linear transformer and CNN for efficient image super-resolution","authors":"Bufan Wang, Yongjun Zhang, Wei Long, Zhongwei Cui","doi":"10.1007/s13042-024-02277-2","DOIUrl":"https://doi.org/10.1007/s13042-024-02277-2","url":null,"abstract":"<p>Integrating convolutional neural networks (CNNs) and transformers has notably improved lightweight single image super-resolution (SISR) tasks. However, existing methods lack the capability to exploit multi-level contextual information, and transformer computations inherently add quadratic complexity. To address these issues, we propose a <b>J</b>oint features-<b>G</b>uided <b>L</b>inear <b>T</b>ransformer and CNN <b>N</b>etwork (JGLTN) for efficient SISR, which is constructed by cascading modules composed of CNN layers and linear transformer layers. Specifically, in the CNN layer, our approach employs an inter-scale feature integration module (IFIM) to extract critical latent information across scales. Then, in the linear transformer layer, we design a joint feature-guided linear attention (JGLA). It jointly considers adjacent and extended regional features, dynamically assigning weights to convolutional kernels for contextual feature selection. This process garners multi-level contextual information, which is used to guide linear attention for effective information interaction. Moreover, we redesign the method of computing feature similarity within the self-attention, reducing its computational complexity to linear. Extensive experiments shows that our proposal outperforms state-of-the-art models while balancing performance and computational costs.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"15 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141577378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Inherit or discard: learning better domain-specific child networks from the general domain for multi-domain NMT
Pub Date: 2024-07-08 | DOI: 10.1007/s13042-024-02253-w
Jinlei Xu, Yonghua Wen, Yan Xiang, Shuting Jiang, Yuxin Huang, Zhengtao Yu
Multi-domain NMT aims to develop a parameter-sharing model for translating both general and specific domains (e.g., biology, legal), but such models often struggle with the parameter interference problem. Existing approaches typically tackle this issue by learning a domain-specific sub-network for each domain equally, but they ignore the significant data imbalance across domains. For instance, the training data for the general domain often outweighs that of the biological domain tenfold. In this paper, we observe a natural similarity between the general and specific domains, including shared vocabulary and similar sentence structure. We propose a novel parameter inheritance strategy to adaptively learn domain-specific child networks from the general domain. Our approach employs gradient similarity as the criterion for determining which parameters should be inherited or discarded between the general and specific domains. Extensive experiments on several multi-domain NMT corpora demonstrate that our method significantly outperforms several strong baselines. In addition, our method exhibits remarkable generalization performance in adapting to few-shot multi-domain NMT scenarios. Further investigations reveal that our method achieves good interpretability because the parameters learned by the child network from the general domain depend on the interconnectedness between the specific and general domains.
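A sketch of the inherit-or-discard decision: compare, per parameter tensor, the gradient directions produced by a general-domain batch and an in-domain batch, and inherit only those parameters whose gradients agree. The cosine-similarity thresholding rule below is an assumption used for illustration, not the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def inheritance_mask(model, loss_general, loss_domain, threshold=0.0):
    """Decide, per parameter tensor, whether the domain-specific child network
    should inherit the general-domain value (gradients agree) or discard it
    and re-learn it (gradients conflict)."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_gen = torch.autograd.grad(loss_general, params, retain_graph=True)
    g_dom = torch.autograd.grad(loss_domain, params)
    mask = {}
    for p, gg, gd in zip(params, g_gen, g_dom):
        cos = F.cosine_similarity(gg.flatten(), gd.flatten(), dim=0)
        mask[p] = bool(cos > threshold)    # True: inherit; False: discard
    return mask
```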
{"title":"Inherit or discard: learning better domain-specific child networks from the general domain for multi-domain NMT","authors":"Jinlei Xu, Yonghua Wen, Yan Xiang, Shuting Jiang, Yuxin Huang, Zhengtao Yu","doi":"10.1007/s13042-024-02253-w","DOIUrl":"https://doi.org/10.1007/s13042-024-02253-w","url":null,"abstract":"<p>Multi-domain NMT aims to develop a parameter-sharing model for translating general and specific domains, such as biology, legal, etc., which often struggle with the parameter interference problem. Existing approaches typically tackle this issue by learning a domain-specific sub-network for each domain equally, but they ignore the significant data imbalance problem across domains. For instance, the training data for the general domain often outweighs the biological domain tenfold. In this paper, we observe a natural similarity between the general and specific domains, including shared vocabulary or similar sentence structure. We propose a novel parameter inheritance strategy to adaptively learn domain-specific child networks from the general domain. Our approach employs gradient similarity as the criterion for determining which parameters should be inherited or discarded between the general and specific domains. Extensive experiments on several multi-domain NMT corpora demonstrate that our method significantly outperforms several strong baselines. In addition, our method exhibits remarkable generalization performance in adapting to few-shot multi-domain NMT scenarios. Further investigations reveal that our method achieves good interpretability because the parameters learned by the child network from the general domain depend on the interconnectedness between the specific domain and the general domain.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"1 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection
Pub Date: 2024-07-06 | DOI: 10.1007/s13042-024-02275-4
Xiangfa Song
Unsupervised feature selection (UFS), which involves selecting representative features from unlabeled high-dimensional data, has attracted much attention. Numerous self-representation-based models have recently been developed for UFS. However, these models have two main problems. First, existing self-representation-based UFS models cannot effectively handle noise and outliers. Second, many graph-regularized self-representation-based UFS models construct a fixed graph to maintain the local structure of the data. To overcome these shortcomings, we propose a novel robust UFS model called self-representation with adaptive loss minimization via doubly stochastic graph regularization (SRALDS). Specifically, SRALDS uses an adaptive loss function to minimize the representation residual term, which may enhance the robustness of the model and diminish the effect of noise and outliers. In addition, rather than utilizing a fixed graph, SRALDS learns a high-quality doubly stochastic graph that more accurately captures the local structure of the data. Finally, an efficient optimization algorithm is designed to obtain the optimal solution of SRALDS. Extensive experiments demonstrate the superior performance of SRALDS over several well-known UFS methods.
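The self-representation backbone shared by such methods can be sketched as follows: reconstruct each feature from the other features and rank features by the row norms of the representation matrix. The closed-form ridge solution below omits SRALDS's adaptive loss and doubly stochastic graph regularizer and is only an illustration of the scoring idea.

```python
import numpy as np

def self_representation_scores(X, lam=1.0):
    """Score features by how much they contribute to reconstructing all
    features: X ~ X W with W = (X^T X + lam I)^{-1} X^T X, score_i = ||W_i||_2."""
    d = X.shape[1]
    G = X.T @ X                             # (d, d) Gram matrix of features
    W = np.linalg.solve(G + lam * np.eye(d), G)
    return np.linalg.norm(W, axis=1)        # row norms act as feature importance

# Usage: select the k most representative features of an unlabeled matrix X
X = np.random.rand(200, 50)
k = 10
top_k = np.argsort(-self_representation_scores(X))[:k]
```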
{"title":"Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection","authors":"Xiangfa Song","doi":"10.1007/s13042-024-02275-4","DOIUrl":"https://doi.org/10.1007/s13042-024-02275-4","url":null,"abstract":"<p>Unsupervised feature selection (UFS), which involves selecting representative features from unlabeled high-dimensional data, has attracted much attention. Numerous self-representation-based models have been recently developed successfully for UFS. However, these models have two main problems. First, existing self-representation-based UFS models cannot effectively handle noise and outliers. Second, many graph-regularized self-representation-based UFS models typically construct a fixed graph to maintain the local structure of data. To overcome the above shortcomings, we propose a novel robust UFS model called self-representation with adaptive loss minimization via doubly stochastic graph regularization (SRALDS). Specifically, SRALDS uses an adaptive loss function to minimize the representation residual term, which may enhance the robustness of the model and diminish the effect of noise and outliers. Besides, rather than utilizing a fixed graph, SRALDS learns a high-quality doubly stochastic graph that more accurately captures the local structure of data. Finally, an efficient optimization algorithm is designed to obtain the optimal solution for SRALDS. Extensive experiments demonstrate the superior performance of SRALDS over several well-known UFS methods.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"6 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}