
Latest publications from the International Journal of Machine Learning and Cybernetics

ASR-Fed: agnostic straggler-resilient semi-asynchronous federated learning technique for secured drone network
IF 3.1 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-15 · DOI: 10.1007/s13042-024-02238-9
Vivian Ukamaka Ihekoronye, C. I. Nwakanma, Dong‐Seong Kim, Jae Min Lee
{"title":"ASR-Fed: agnostic straggler-resilient semi-asynchronous federated learning technique for secured drone network","authors":"Vivian Ukamaka Ihekoronye, C. I. Nwakanma, Dong‐Seong Kim, Jae Min Lee","doi":"10.1007/s13042-024-02238-9","DOIUrl":"https://doi.org/10.1007/s13042-024-02238-9","url":null,"abstract":"","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141648194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Privacy-preserving matrix factorization for recommendation systems using Gaussian mechanism and functional mechanism
IF 3.1 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-14 · DOI: 10.1007/s13042-024-02276-3
Sohan Salahuddin Mugdho, Hafiz Imtiaz
{"title":"Privacy-preserving matrix factorization for recommendation systems using Gaussian mechanism and functional mechanism","authors":"Sohan Salahuddin Mugdho, Hafiz Imtiaz","doi":"10.1007/s13042-024-02276-3","DOIUrl":"https://doi.org/10.1007/s13042-024-02276-3","url":null,"abstract":"","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141649484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Industrial product surface defect detection via the fast denoising diffusion implicit model
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-11 · DOI: 10.1007/s13042-024-02213-4
Yue Wang, Yong Yang, Mingsheng Liu, Xianghong Tang, Haibin Wang, Zhifeng Hao, Ze Shi, Gang Wang, Botao Jiang, Chunyang Liu

In the age of intelligent manufacturing, surface defect detection plays a pivotal role in the automated quality control of industrial products, constituting a fundamental aspect of smart factory evolution. Given the diverse sizes and feature scales of surface defects on industrial products and the difficulty of procuring high-quality training samples, achieving real-time, high-quality surface defect detection with artificial intelligence remains a formidable challenge. To address this, we introduce a defect detection approach grounded in the fast denoising diffusion implicit model. First, we propose a noise predictor informed by the spectral-radius feature tensor of images. This enhancement strengthens the generative model's ability to capture nuanced details in non-defective areas, overcoming limitations in model versatility and detail portrayal. Furthermore, we present a loss-function constraint based on the Perron root, designed to act within the representation space so that the denoising model consistently produces high-quality samples. Lastly, comprehensive experiments on the Magnetic Tile and Market-PCB datasets, benchmarked against nine representative models, underscore the detection efficacy of the proposed approach.
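
The reconstruct-and-compare idea behind diffusion-based defect detectors can be sketched compactly: train the denoising model on defect-free surfaces only, then flag pixels it fails to reconstruct. Below is a minimal, hypothetical sketch; `denoise_model` and the single-step corruption are stand-ins for the paper's fast implicit sampler and spectral-radius noise predictor, which are not reproduced here.

```python
# Hedged sketch of reconstruction-based defect detection; `denoise_model`
# is an assumed pre-trained denoising network, not the authors' release.
import torch

@torch.no_grad()
def defect_map(image, denoise_model, noise_level=0.3):
    """image: (1, C, H, W) tensor in [0, 1]. Returns a per-pixel anomaly map.

    Defects score high because the generative model was trained only on
    defect-free surfaces and cannot reconstruct them.
    """
    noise = torch.randn_like(image)
    noisy = (1 - noise_level) * image + noise_level * noise  # simple corruption
    recon = denoise_model(noisy)            # deterministic reconstruction pass
    return (image - recon).abs().mean(dim=1, keepdim=True)  # channel-averaged residual

# usage: mask = defect_map(img, model) > threshold
```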

Citations: 0
Joint features-guided linear transformer and CNN for efficient image super-resolution
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s13042-024-02277-2
Bufan Wang, Yongjun Zhang, Wei Long, Zhongwei Cui

Integrating convolutional neural networks (CNNs) and transformers has notably improved lightweight single image super-resolution (SISR). However, existing methods lack the capability to exploit multi-level contextual information, and transformer computations inherently add quadratic complexity. To address these issues, we propose a Joint features-Guided Linear Transformer and CNN Network (JGLTN) for efficient SISR, constructed by cascading modules composed of CNN layers and linear transformer layers. Specifically, in the CNN layer, our approach employs an inter-scale feature integration module (IFIM) to extract critical latent information across scales. Then, in the linear transformer layer, we design a joint feature-guided linear attention (JGLA). It jointly considers adjacent and extended regional features, dynamically assigning weights to convolutional kernels for contextual feature selection. This process gathers multi-level contextual information, which is used to guide the linear attention for effective information interaction. Moreover, we redesign the computation of feature similarity within the self-attention, reducing its computational complexity to linear. Extensive experiments show that our proposal outperforms state-of-the-art models while balancing performance and computational cost.
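
The linear-complexity claim is easiest to see with the standard kernel feature-map trick, in which the softmax is replaced by a positive feature map so attention costs O(N) in sequence length instead of O(N^2). The sketch below shows only that generic trick; the joint feature-guided weighting of JGLA is the paper's own design and is not reproduced.

```python
# Generic kernelized linear attention (feature map phi(x) = elu(x) + 1),
# a common O(N) surrogate for softmax attention; an illustrative sketch only.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """q, k, v: (batch, seq_len, dim). Cost is O(seq_len * dim^2)."""
    q = F.elu(q) + 1                                # positive feature map phi(q)
    k = F.elu(k) + 1                                # positive feature map phi(k)
    kv = torch.einsum('bnd,bne->bde', k, v)         # sum_n phi(k_n) v_n^T
    z = 1.0 / (torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + eps)  # normalizer
    return torch.einsum('bnd,bde,bn->bne', q, kv, z)

out = linear_attention(torch.randn(2, 128, 64), torch.randn(2, 128, 64),
                       torch.randn(2, 128, 64))    # sanity check: (2, 128, 64)
```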

Citations: 0
Inherit or discard: learning better domain-specific child networks from the general domain for multi-domain NMT
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-08 · DOI: 10.1007/s13042-024-02253-w
Jinlei Xu, Yonghua Wen, Yan Xiang, Shuting Jiang, Yuxin Huang, Zhengtao Yu

Multi-domain NMT aims to develop a parameter-sharing model for translating the general domain and specific domains such as biology or law, a setting that often struggles with parameter interference. Existing approaches typically tackle this issue by learning a domain-specific sub-network for each domain equally, but they ignore the significant data imbalance across domains: the training data for the general domain often outweighs that of the biology domain tenfold, for instance. In this paper, we observe a natural similarity between the general and specific domains, including shared vocabulary and similar sentence structure. We propose a novel parameter-inheritance strategy to adaptively learn domain-specific child networks from the general domain. Our approach employs gradient similarity as the criterion for determining which parameters should be inherited or discarded between the general and specific domains. Extensive experiments on several multi-domain NMT corpora demonstrate that our method significantly outperforms several strong baselines. In addition, our method generalizes remarkably well to few-shot multi-domain NMT scenarios. Further investigation reveals that our method achieves good interpretability, because the parameters the child network learns from the general domain depend on the interconnectedness between the specific and general domains.
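
The inherit-or-discard criterion can be made concrete with a small sketch: compare the gradient a parameter receives on general-domain batches with the gradient it receives on in-domain batches, and inherit it only when the two agree. The per-tensor granularity and the zero threshold below are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of gradient-similarity-based parameter inheritance; the
# granularity (whole tensors) and threshold are assumptions for illustration.
import torch

def inheritance_mask(model, general_grads, domain_grads, threshold=0.0):
    """general_grads / domain_grads: {name: grad tensor} from the two domains.

    Returns {name: bool}; True means the parameter is inherited from the
    general-domain model, False means it is freed for domain-specific tuning.
    """
    masks = {}
    for name, _ in model.named_parameters():
        g = general_grads[name].flatten()
        d = domain_grads[name].flatten()
        cos = torch.dot(g, d) / (g.norm() * d.norm() + 1e-8)
        masks[name] = cos.item() > threshold  # inherit when gradients agree
    return masks
```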

Citations: 0
Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-06 · DOI: 10.1007/s13042-024-02275-4
Xiangfa Song

Unsupervised feature selection (UFS), which selects representative features from unlabeled high-dimensional data, has attracted much attention. Numerous self-representation-based models have recently been developed for UFS. However, these models have two main problems. First, existing self-representation-based UFS models cannot effectively handle noise and outliers. Second, many graph-regularized self-representation-based UFS models construct a fixed graph to preserve the local structure of the data. To overcome these shortcomings, we propose a novel robust UFS model called self-representation with adaptive loss minimization via doubly stochastic graph regularization (SRALDS). Specifically, SRALDS uses an adaptive loss function to minimize the representation residual term, which enhances the robustness of the model and diminishes the effect of noise and outliers. Moreover, rather than relying on a fixed graph, SRALDS learns a high-quality doubly stochastic graph that more accurately captures the local structure of the data. Finally, an efficient optimization algorithm is designed to obtain the optimal solution of SRALDS. Extensive experiments demonstrate the superior performance of SRALDS over several well-known UFS methods.
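
To make the self-representation backbone concrete, the sketch below solves the plain ridge-regularized problem min_W ||X - XW||^2 + lam ||W||^2 in closed form and ranks features by the row norms of W; the adaptive loss and the doubly stochastic graph term that distinguish SRALDS are deliberately omitted.

```python
# Minimal self-representation feature ranking (ridge backbone only); the
# SRALDS-specific adaptive loss and graph regularizer are not implemented.
import numpy as np

def select_features(X, n_selected, lam=1.0):
    """X: (n_samples, n_features). Returns indices of the top features."""
    d = X.shape[1]
    G = X.T @ X
    W = np.linalg.solve(G + lam * np.eye(d), G)  # closed form of min ||X - XW||^2 + lam ||W||^2
    scores = np.linalg.norm(W, axis=1)           # row norm = how much feature j represents others
    return np.argsort(scores)[::-1][:n_selected]
```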

Citations: 0
A multi-strategy hybrid cuckoo search algorithm with specular reflection based on a population linear decreasing strategy
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-05 · DOI: 10.1007/s13042-024-02273-6
Chengtian Ouyang, Xin Liu, Donglin Zhu, Yangyang Zheng, Changjun Zhou, Chengye Zou

The cuckoo search algorithm (CS), inspired by the nest-parasitic breeding behavior of cuckoos, has proven effective as a problem-solving approach in many fields since it was proposed. Nevertheless, it still suffers from an imbalance between exploration and exploitation and a tendency to fall into local optima. In this paper, we propose a new hybrid cuckoo search algorithm (LHCS) based on a linearly decreasing population. First, to improve the local search and accelerate convergence, we mix in the solution-updating strategy of the Grey Wolf Optimizer (GWO) and use a linear-decreasing rule to adjust how often that strategy is invoked, balancing global exploration against local exploitation. Second, a specular-reflection learning strategy enhances the algorithm's ability to escape local optima. Finally, a linearly decreasing population strategy improves convergence over different intervals and the adaptability of population diversity. Experimental results on 29 benchmark functions from the CEC2017 test set show that, when the quality of all solutions is considered together, LHCS is significantly superior to and more stable than competing algorithms. To further verify its performance, we applied the algorithm to engineering problems; functional tests and Wilcoxon test results show that the comprehensive performance of LHCS surpasses 14 other state-of-the-art algorithms. The practicality and effectiveness of LHCS are verified on several engineering optimization problems, where it can greatly reduce design cost.
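
A compact sketch helps pin down the two generic ingredients named above, Lévy-flight updates and a linearly decreasing population; the GWO-mixed update and the specular-reflection learning step are LHCS-specific and are not reproduced, and constants such as the 0.01 step scale are conventional CS defaults rather than values from the paper.

```python
# Baseline cuckoo search with a linearly shrinking population; a hedged
# sketch of the population strategy only, not the full LHCS algorithm.
import numpy as np
from math import gamma, pi, sin

def levy_step(dim, beta=1.5):
    """Mantegna's algorithm for Levy-flight step sizes."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.randn(dim) * sigma
    v = np.random.randn(dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(f, lo, hi, n_max=30, n_min=10, iters=200, pa=0.25):
    """Minimize f over the box [lo, hi]; population shrinks from n_max to n_min."""
    dim = len(lo)
    nests = np.random.uniform(lo, hi, (n_max, dim))
    fit = np.apply_along_axis(f, 1, nests)
    for t in range(iters):
        n = int(n_max - (n_max - n_min) * t / iters)  # linear population decrease
        order = np.argsort(fit)[:n]                   # keep the n best nests
        nests, fit = nests[order], fit[order]
        best = nests[0]
        for i in range(n):
            cand = np.clip(nests[i] + 0.01 * levy_step(dim) * (nests[i] - best), lo, hi)
            fc = f(cand)
            if fc < fit[i]:                           # greedy replacement
                nests[i], fit[i] = cand, fc
        worst = np.argsort(fit)[-max(1, int(pa * n)):]  # abandon worst fraction pa
        nests[worst] = np.random.uniform(lo, hi, (len(worst), dim))
        fit[worst] = np.apply_along_axis(f, 1, nests[worst])
    return nests[np.argmin(fit)], fit.min()

# usage: x, fx = cuckoo_search(lambda x: float((x ** 2).sum()),
#                              np.full(5, -5.0), np.full(5, 5.0))
```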

Citations: 0
Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-04 · DOI: 10.1007/s13042-024-02244-x
Chengli Tan, Jiangshe Zhang, Junmin Liu, Zixiang Zhao

Deep neural networks complete a feature extraction task by propagating the inputs through multiple modules. However, how the representations evolve under gradient-based optimization remains unknown. Here we leverage the intrinsic dimension of the representations to study the learning dynamics and find that the training process undergoes a phase transition from expansion to compression under disparate training regimes. Surprisingly, this phenomenon is ubiquitous across a wide variety of model architectures, optimizers, and data sets. We demonstrate that the variation in the intrinsic dimension is consistent with the complexity of the learned hypothesis, which can be quantitatively assessed by the critical sample ratio rooted in adversarial robustness. Meanwhile, we show mathematically that this phenomenon can be analyzed in terms of the mutable correlation between neurons. Although evoked activities obey a power-law decaying rule in biological circuits, we find that the power-law exponent of the representations in deep neural networks predicts adversarial robustness well only at the end of training, not during the training process. Together, these results suggest that deep neural networks are prone to producing robust representations by adaptively eliminating or retaining redundancies. The code is publicly available at https://github.com/cltan023/learning2022.
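
Tracking the intrinsic dimension of layer activations over training is commonly done with the TwoNN estimator, sketched below; whether the authors use exactly this estimator is an assumption, but it shows how a single scalar ID can be read off a batch of representations at any checkpoint.

```python
# TwoNN intrinsic-dimension estimate (Facco et al. style) as a hedged sketch;
# feed it hidden activations at successive checkpoints to trace ID over training.
import numpy as np
from scipy.spatial import cKDTree

def twonn_id(X):
    """X: (n_points, n_features) activations. Returns a scalar ID estimate."""
    dists, _ = cKDTree(X).query(X, k=3)    # columns: self, 1st, 2nd neighbor
    mu = dists[:, 2] / dists[:, 1]         # ratio of 2nd to 1st neighbor distance
    mu = mu[np.isfinite(mu) & (mu > 1.0)]  # drop duplicates / degenerate points
    return len(mu) / np.log(mu).sum()      # maximum-likelihood estimate
```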

Citations: 0
A novel abstractive summarization model based on topic-aware and contrastive learning
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-04 · DOI: 10.1007/s13042-024-02263-8
Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu

The majority of abstractive summarization models are built on the Sequence-to-Sequence (Seq2Seq) architecture. These models capture syntactic and contextual information between words, but Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there is an inconsistency between the objective function and the evaluation metrics of such models. To address these limitations, a novel model named ASTCL is proposed in this paper. It innovatively integrates a neural topic model into the Seq2Seq framework, aiming to capture the text's global semantic information and guide summary generation. Additionally, it incorporates contrastive learning to mitigate the discrepancy between the training objective and the evaluation metrics by scoring multiple candidate summaries. Experimental results on the CNN/DM, XSum, and NYT datasets demonstrate that ASTCL outperforms other generic models on the summarization task.
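
The candidate-scoring idea can be illustrated with a standard margin-based ranking loss over candidate summaries sorted best-to-worst by an external metric such as ROUGE, so the model learns to score better candidates higher; this is a generic contrastive-ranking sketch, not necessarily ASTCL's exact loss.

```python
# Generic pairwise ranking loss over scored candidate summaries; an
# illustrative stand-in for the contrastive component described above.
import torch

def candidate_ranking_loss(scores, margin=0.01):
    """scores: (n_candidates,) model scores, candidates pre-sorted best-to-worst.

    Penalizes any lower-ranked candidate scored above a higher-ranked one,
    with a margin that grows with the rank gap.
    """
    loss = scores.new_zeros(())
    n = scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # want scores[i] >= scores[j] + margin * (j - i)
            loss = loss + torch.clamp(scores[j] - scores[i] + margin * (j - i), min=0)
    return loss

loss = candidate_ranking_loss(torch.tensor([0.9, 0.7, 0.8]))  # middle pair violates
```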

Citations: 0
Undersampling based on generalized learning vector quantization and natural nearest neighbors for imbalanced data
IF 5.6 · CAS Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-03 · DOI: 10.1007/s13042-024-02261-w
Long-Hui Wang, Qi Dai, Jia-You Wang, Tony Du, Lifang Chen

Imbalanced datasets can adversely affect classifier performance. Conventional undersampling approaches may discard essential information, while oversampling techniques can introduce noise. To address this challenge, we propose an undersampling algorithm called GLNDU (Generalized Learning Vector Quantization and Natural Nearest Neighbors-based Undersampling). GLNDU uses Generalized Learning Vector Quantization (GLVQ) to compute the centroids of positive and negative instances, and it uses the concept of natural nearest neighbors to identify majority-class instances that fall in the region overlapping the minority-class centroid. These majority-class instances are then removed, yielding a balanced training dataset on which a base classifier is trained. We conduct extensive experiments on 29 publicly available datasets, evaluating performance with AUC and G_mean values. Across different types of base classifiers, such as SVM, CART, and KNN, GLNDU demonstrates significant advantages over established methods. The results of the Friedman ranking and the Nemenyi post-hoc test further support these findings.
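
A stripped-down sketch of the centroid-and-overlap idea follows: compute one centroid per class (plain class means below stand in for learned GLVQ prototypes, and the natural-nearest-neighbor step is not implemented) and discard majority samples that lie closer to the minority centroid than to their own.

```python
# Hedged sketch of overlap-region undersampling; class means approximate
# GLVQ prototypes and natural nearest neighbors are omitted for brevity.
import numpy as np

def undersample(X, y, majority=0, minority=1):
    """X: (n_samples, n_features), y: class labels. Returns the reduced set."""
    c_maj = X[y == majority].mean(axis=0)
    c_min = X[y == minority].mean(axis=0)
    keep = np.ones(len(y), dtype=bool)
    maj_idx = np.where(y == majority)[0]
    d_min = np.linalg.norm(X[maj_idx] - c_min, axis=1)
    d_maj = np.linalg.norm(X[maj_idx] - c_maj, axis=1)
    keep[maj_idx[d_min < d_maj]] = False  # drop majority points in the overlap region
    return X[keep], y[keep]
```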

Citations: 0