Pub Date: 2024-09-13; DOI: 10.1007/s10489-024-05816-0
Isabel Fernández, Javier Puente, Borja Ponte, Alberto Gómez
The combined use of the Analytical Hierarchy Process (AHP) and Fuzzy Inference Systems (FISs) can significantly enhance the effectiveness of transformative projects in organizations by better managing their complexities and uncertainties. This work develops a novel multicriteria model that integrates both methodologies to assist organizations in these projects. To demonstrate the value of the proposed approach, we present an illustrative example focused on the implementation of Industry 4.0 in SMEs. First, through a review of relevant literature, we identify the key barriers to improving SMEs' capability to implement Industry 4.0 effectively. Subsequently, the AHP, enhanced through Dong and Saaty’s methodology, establishes a consensus-based assessment of the importance of these barriers, using the judgments of five experts. Next, a FIS is utilized, with rule bases automatically derived from the preceding weights, eliminating the need for another round of expert input. This paper shows and discusses how SMEs can use this model to self-assess their adaptability to the Industry 4.0 landscape and formulate improvement strategies to achieve deeper alignment with this transformative paradigm.
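The AHP weighting step can be illustrated with a minimal sketch. It assumes a single pairwise-comparison matrix and uses the row geometric-mean approximation of the priority vector together with Saaty's consistency ratio. The 3x3 matrix of barrier judgments is hypothetical. The sketch covers neither the consensus procedure across the five experts nor the derivation of FIS rule bases described in the abstract.

```python
import math

# Saaty's random consistency index for matrix orders 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priorities with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments for three barriers (illustrative values only).
judgments = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights = ahp_weights(judgments)
print(weights, consistency_ratio(judgments, weights))
```

The geometric-mean method is used here only because it needs no linear-algebra library. The principal-eigenvector method is the other common choice and gives similar weights for nearly consistent matrices.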
Title: Integration of AHP and fuzzy inference systems for empowering transformative journeys in organizations: Assessing the implementation of Industry 4.0 in SMEs. Applied Intelligence 54(23), 12357-12377. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-05816-0.pdf
Pub Date: 2024-09-13; DOI: 10.1007/s10489-024-05812-4
Ting Yang, Shuisheng Zhou, Zhuan Zhang
Subspace clustering typically clusters data by applying spectral clustering to an affinity matrix that is constructed, in some deterministic way, from a self-representation coefficient matrix. The quality of the affinity matrix is therefore vital to clustering performance. However, traditional deterministic constructions only provide a feasible affinity matrix, not necessarily the one best suited to revealing the data structure. In addition, the post-processing commonly applied to the coefficient matrix also affects the affinity matrix’s quality. Furthermore, constructing the affinity matrix is separate from optimizing the coefficient matrix and performing spectral clustering, which cannot guarantee an optimal overall result. To this end, we propose a new method, affinity adaptive sparse subspace clustering (AASSC), which adds a Laplacian rank constraint to a sparse subspace representation model in order to adaptively learn, from a sparse coefficient matrix and without post-processing, a high-quality affinity matrix with exactly p connected components, where p is the number of categories. In addition, by relaxing the Laplacian rank constraint into a trace minimization, AASSC naturally combines the coefficient matrix, the affinity matrix, and spectral clustering into a unified optimization, guaranteeing the overall optimal result. Extensive experimental results verify that the proposed method is effective and superior.
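The relaxation mentioned in the abstract rests on two standard facts about graph Laplacians, sketched here in notation that is assumed rather than taken from the paper: $W$ is a symmetric nonnegative $n \times n$ affinity matrix, $L_W = D - W$ is its Laplacian with $D = \operatorname{diag}(W\mathbf{1})$, and $\sigma_i(L_W)$ is its $i$-th smallest eigenvalue.

```latex
% Assumed notation: W affinity matrix, L_W = D - W, sigma_i = i-th smallest eigenvalue of L_W.
\begin{align}
  \text{the graph of } W \text{ has exactly } p \text{ connected components}
    &\iff \operatorname{rank}(L_W) = n - p
    \;\Longrightarrow\; \sum_{i=1}^{p} \sigma_i(L_W) = 0, \\
  \sum_{i=1}^{p} \sigma_i(L_W)
    &= \min_{F \in \mathbb{R}^{n \times p},\; F^{\top} F = I}
       \operatorname{Tr}\!\left(F^{\top} L_W F\right)
    \qquad \text{(Ky Fan's theorem).}
\end{align}
```

Because $L_W$ is positive semidefinite, driving the trace term to zero forces at least $p$ zero eigenvalues, which is why a rank constraint can be replaced by a trace penalty inside a joint objective. How AASSC weights this term against the sparse representation loss is not stated in the abstract.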
Title: Affinity adaptive sparse subspace clustering via constrained Laplacian rank. Applied Intelligence 54(23), 12378-12390.
Pub Date: 2024-09-13; DOI: 10.1007/s10489-024-05803-5
Francisco J. Gil-Gala, Marko Đurasević, Domagoj Jakobović
In recent years, the growing interest in environmental sustainability has led to Electric Vehicle Routing Problems (EVRPs) attracting more and more attention. EVRPs involve the use of electric vehicles, which have additional constraints, such as range and recharging time, compared to conventional Vehicle Routing Problems (VRPs). The complexity and dynamic nature of solving VRPs often lead to the introduction of Routing Policies (RPs), simple heuristics that incrementally build routes. However, manually designing efficient RPs proves to be a challenging and time-consuming task. Therefore, there is a pressing need to explore the application of hyper-heuristics, in particular Genetic Programming (GP), to automatically generate new RPs. Since this method has not yet been investigated in the literature in the context of EVRPs, this study explores the applicability of GP to automatically generate new RPs for EVRP. To this end, three RP variants (serial, semiparallel, and parallel) are introduced in this study, along with a set of domain-specific terminal nodes to optimise three criteria: the number of vehicles, energy consumption, and total tardiness. The experimental analysis shows that the serial variant performs best in terms of energy consumption and number of vehicles, while the parallel variant is most effective in minimising the total tardiness. A comprehensive analysis of the proposed method is conducted to determine its convergence properties and the impact of the proposed terminal nodes on performance and to describe several generated RPs. The results show that the automatically generated RPs perform commendably compared to traditional methods such as metaheuristics and exact methods, which usually require significantly more runtime. 
More specifically, depending on the scenario in which they are used, the generated RPs obtain results for the number of vehicles that are about 20%-37% worse than the best known results, but they do so in almost negligible time, within a few milliseconds.
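To make the idea of a routing policy concrete, the following is a minimal sketch of a serial RP: it builds one route at a time and repeatedly sends the current vehicle to the feasible customer with the best priority value. Everything here is a simplifying assumption: the priority function is hand-written (in the paper it would be an expression evolved by GP), battery range is measured in distance units, and recharging stations, time windows, and tardiness are ignored.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def serial_routing_policy(depot, customers, battery_range, priority):
    """Build routes one vehicle at a time.

    customers: dict mapping customer name -> (x, y) location.
    priority:  function(position, customer_location, remaining_range) -> float,
               lower is better (hypothetical stand-in for an evolved expression).
    Returns a list of routes, each a list of customer names.
    """
    unserved = dict(customers)
    routes = []
    while unserved:
        position, remaining, route = depot, battery_range, []
        while True:
            # A customer is feasible if the vehicle can reach it and still return.
            feasible = [
                name for name, loc in unserved.items()
                if distance(position, loc) + distance(loc, depot) <= remaining
            ]
            if not feasible:
                break
            best = min(
                feasible,
                key=lambda name: priority(position, unserved[name], remaining),
            )
            remaining -= distance(position, unserved[best])
            position = unserved.pop(best)
            route.append(best)
        if not route:
            raise ValueError("a customer is unreachable within the battery range")
        routes.append(route)
    return routes


def nearest_first(position, location, remaining):
    """Hand-written priority: prefer the closest customer."""
    return distance(position, location)


routes = serial_routing_policy(
    depot=(0.0, 0.0),
    customers={"A": (1.0, 0.0), "B": (2.0, 0.0), "C": (0.0, 3.0)},
    battery_range=6.0,
    priority=nearest_first,
)
print(routes)
```

Because each decision is a single evaluation of the priority function per candidate, a policy of this kind runs in milliseconds, which is the trade-off against solution quality that the abstract reports.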
Title: Evolving routing policies for electric vehicles by means of genetic programming. Applied Intelligence 54(23), 12391-12419. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-05803-5.pdf
Pub Date: 2024-09-12; DOI: 10.1007/s10489-024-05821-3
Teddy Lazebnik, Or Iny
Temporal graphs have become an essential tool for analyzing complex dynamic systems with multiple agents. Detecting anomalies in temporal graphs is crucial for various applications, including identifying emerging trends, monitoring network security, understanding social dynamics, tracking disease outbreaks, and understanding financial dynamics. In this paper, we present a comprehensive benchmarking study that compares 12 data-driven methods for anomaly detection in temporal graphs. We conduct experiments on two temporal graphs extracted from Twitter and Facebook, aiming to identify anomalies in group interactions. Surprisingly, our study reveals an unclear pattern regarding the best method for such tasks, highlighting the complexity and challenges involved in anomaly emergence detection in large and dynamic systems. The results underscore the need for further research and innovative approaches to effectively detect emerging anomalies in dynamic systems represented as temporal graphs.
Title: Temporal graphs anomaly emergence detection: benchmarking for social media interactions. Applied Intelligence 54(23), 12347-12356. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-05821-3.pdf
Pub Date: 2024-09-12; DOI: 10.1007/s10489-024-05825-z
Lei Sang, Yang Hu, Yi Zhang, Yiwen Zhang
The goal of bundle recommendation is to offer users a set of items that match their preferences. Current methods mainly categorize user preferences into bundle and item levels, and then use graph neural networks to obtain representations of users and bundles at both levels. However, real-world interaction data often contains irrelevant and uninformative noise connections, leading to inaccurate representations of user interests and bundle content. In this paper, we introduce a Multi-view Denoising Contrastive Learning approach for Bundle Recommendation (MDCLBR), aiming to reduce the negative effects of noisy data on users’ and bundles’ representations. We use the original view, which includes bundle and item levels, to guide data augmentation for creating augmented views. Then, we apply the multi-view contrastive learning paradigm to enhance collaboration within the original view, the augmented views, and between them. This leads to more accurate representations of users and bundles, reducing the impact of noisy data. Our method outperforms previous approaches in extensive experiments on three real-world public datasets.
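The abstract does not state which contrastive objective MDCLBR uses. As an illustration of the general multi-view idea only, the sketch below computes an InfoNCE-style loss, a common choice in contrastive learning, between two views of the same set of embeddings. The function names, the temperature value, and the toy vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss: row i of view_a should match row i of view_b."""
    n = len(view_a)
    loss = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        loss += log_denominator - logits[i]
    return loss / n


# Toy embeddings of two users in an original view and an augmented view.
original = [[1.0, 0.0], [0.0, 1.0]]
augmented = [[0.9, 0.1], [0.1, 0.9]]
print(info_nce(original, augmented))
```

The loss is small when each embedding is closest to its own counterpart in the other view and large when the pairing is scrambled, which is the signal a denoising augmentation would be trained against.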
Title: Multi-view denoising contrastive learning for bundle recommendation. Applied Intelligence 54(23), 12332-12346.
Structural deep document clustering methods, which leverage both structural information and inherent data properties to learn document representations using deep neural networks for clustering, have recently garnered increased research interest. However, the structural information used in these methods is usually static and remains unchanged during the clustering process. This can negatively impact the clustering results if the initial structural information is inaccurate or noisy. In this paper, we present an adaptive structural enhanced representation learning network for document clustering. This network can adjust the structural information with the help of clustering partitions and consists of two components: an adaptive structure learner, which automatically evaluates and adjusts structural information at both the document and term levels to facilitate the learning of more effective structural information, and a structural enhanced representation learning network. The latter integrates this adjusted structural information to enhance text document representations while reducing noise, thereby improving the clustering results. The iterative process between clustering results and the adaptive structural enhanced representation learning network promotes mutual optimization, progressively enhancing model performance. Extensive experiments on various text document datasets demonstrate that the proposed method outperforms several state-of-the-art methods.
[Figure: the overall framework of the adaptive structural enhanced representation learning network]
Title: Adaptive structural enhanced representation learning for deep document clustering. Authors: Jingjing Xue, Ruizhang Huang, Ruina Bai, Yanping Chen, Yongbin Qin, Chuan Lin. Pub Date: 2024-09-12; DOI: 10.1007/s10489-024-05791-6. Applied Intelligence 54(23), 12315-12331.
Pub Date: 2024-09-11; DOI: 10.1007/s10489-024-05750-1
Xuan Cho Do, Hai Anh Tran, Thi Lan Phuong Nguyen
Advanced Persistent Threat (APT) attacks are among the most dangerous cyber-attack techniques today. Detecting and predicting the spread of APT malware in a network is therefore an urgent problem, since it supports effective prevention of such attacks. In this paper, we propose a new approach that predicts the spread of APT malware in a network based on the malware's own behavior. Specifically, we propose combining two Susceptible-Infected-Recovered (SIR) models. The first SIR model predicts the spread of APT malware to ordinary devices and computers within the organization, which the malware often uses as a foothold to escalate privileges toward devices or computers holding the organization's important and sensitive information. The second SIR model predicts the spread of APT malware to the group of computers that contain sensitive information or pose a high risk to the organization. Together, the two SIR models provide information about infections between computer groups, which helps to predict the spread of APT malware in the system accurately. Combining two SIR models in this way is a new proposal grounded in how APT malware behaves in practice, and it opens up a new approach to other spread-prediction problems on the internet, such as malicious code in wireless sensor networks or malicious information on social networks.
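The abstract does not give the equations that link the two SIR models, so the following is only a sketch under an assumed coupling: group 1 (ordinary devices) follows a standard SIR model, and group 2 (sensitive machines) is infected both from within its own group and by infected devices in group 1, which stands in for privilege escalation. All parameter names and values are hypothetical, and simple forward-Euler integration is used.

```python
def simulate_two_sir(group1, group2, beta1, gamma1, beta12, beta2, gamma2,
                     steps, dt=0.1):
    """Forward-Euler simulation of two coupled SIR groups.

    group1, group2: (susceptible, infected, recovered) initial counts.
    beta1, gamma1:  infection and recovery rates inside group 1.
    beta12:         assumed cross-infection rate from group 1 to group 2.
    beta2, gamma2:  infection and recovery rates inside group 2.
    Returns a list of ((s1, i1, r1), (s2, i2, r2)) states, one per step.
    """
    s1, i1, r1 = group1
    s2, i2, r2 = group2
    n1 = s1 + i1 + r1
    n2 = s2 + i2 + r2
    history = []
    for _ in range(steps):
        new_infected1 = beta1 * s1 * i1 / n1
        recovered1 = gamma1 * i1
        # Group 2 is exposed to its own infected machines and to group 1.
        new_infected2 = s2 * (beta12 * i1 / n1 + beta2 * i2 / n2)
        recovered2 = gamma2 * i2
        s1 -= dt * new_infected1
        i1 += dt * (new_infected1 - recovered1)
        r1 += dt * recovered1
        s2 -= dt * new_infected2
        i2 += dt * (new_infected2 - recovered2)
        r2 += dt * recovered2
        history.append(((s1, i1, r1), (s2, i2, r2)))
    return history


# One infected ordinary device, no infected sensitive machine at the start.
trajectory = simulate_two_sir(
    group1=(99.0, 1.0, 0.0), group2=(20.0, 0.0, 0.0),
    beta1=0.5, gamma1=0.1, beta12=0.2, beta2=0.3, gamma2=0.1, steps=500,
)
print(trajectory[-1])
```

Even with no initial infection in the sensitive group, the cross-infection term seeds it from group 1, which is the qualitative behavior the two-model design is meant to capture.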
Title: A novel approach for predicting the spread of APT malware in the network. Applied Intelligence 54(23), 12293-12314.
Pub Date: 2024-09-11; DOI: 10.1007/s10489-024-05806-2
Xuebo Cheng, Xiaohui Huang, Zhichao Huang, Nan Jiang
Offline Reinforcement Learning (Offline RL) learns from pre-collected offline data without real-time interaction with the environment, using policy regularization via distributional constraints or support set constraints. However, because support set constraints are overly conservative, the policy learned from offline data usually stays close to the behavioral policy, so offline RL faces challenges in active behavioral exploration. Moreover, without online interaction, policy evaluation becomes prone to inaccuracy, and the learned policy may lack robustness in the presence of sub-optimal state-action pairs or noise in the dataset. In this paper, we propose an Offline-to-Online Reinforcement Learning Approach based on Multi-action Evaluation with Policy Extension (MAERL) to improve policy exploration and the value evaluation of state-action pairs in offline RL. In MAERL, we develop four modules: (1) in the policy extension module, we design a policy extension method that uses the online policy to extend the offline policy; (2) in the multi-action evaluation module, we present an adaptive way to merge the offline and online policies to generate the agent’s action; (3) in the action-oriented module, we learn the agent’s action trajectories from the dataset, mitigating the issue of actions deviating excessively during environmental exploration; (4) to maintain consistency in the agent’s actions, we propose an action temporally-aligned representation learning method that preserves the trend of the agent’s actions. This approach ensures that the agent’s actions align with the learned trajectories, preventing significant deviations during exploration. Extensive experiments are conducted on 15 scenarios of the D4RL/mujoco environment. 
Results demonstrate that our proposed methods achieve the best performance in 12 scenarios and the second-best performance in 3 scenarios compared to state-of-the-art methods. The project’s code can be found at https://github.com/FrankGod111/Policy-Expansion.git
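The abstract says that the multi-action evaluation module merges the offline and online policies adaptively but does not say how. One plausible reading, shown here purely as an assumed sketch, is to let each policy propose an action, score both proposals with a Q-function, and sample one in proportion to a softmax over the Q-values. The function names, the temperature parameter, and the toy policies are hypothetical and are not taken from the linked repository.

```python
import math
import random


def select_action(state, offline_policy, online_policy, q_function,
                  temperature=1.0, rng=random):
    """Pick one of two proposed actions via a softmax over their Q-values.

    Returns the chosen action and the selection probabilities
    (offline first, online second).
    """
    candidates = [offline_policy(state), online_policy(state)]
    q_values = [q_function(state, action) for action in candidates]
    # Subtract the maximum before exponentiating for numerical stability.
    max_q = max(q_values)
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    probabilities = [w / total for w in weights]
    action = rng.choices(candidates, weights=probabilities, k=1)[0]
    return action, probabilities


# Toy setup: scalar state, scalar actions, Q prefers actions close to the state.
def offline_policy(state):
    return 0.0


def online_policy(state):
    return state


def q_function(state, action):
    return -abs(state - action)


action, probabilities = select_action(
    5.0, offline_policy, online_policy, q_function,
    temperature=0.5, rng=random.Random(0),
)
print(action, probabilities)
```

A low temperature makes the choice nearly greedy with respect to the Q-function, while a high temperature keeps both policies in play, which is one simple way to trade exploitation of the offline policy against exploration by the online one.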
An offline-to-online reinforcement learning approach based on multi-action evaluation with policy extension. Xuebo Cheng, Xiaohui Huang, Zhichao Huang, Nan Jiang. Applied Intelligence 54(23): 12246-12271. Published 2024-09-11. DOI: 10.1007/s10489-024-05806-2
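The MAERL abstract says that an offline and an online policy are merged adaptively to produce the agent's action, but it does not give the rule. The sketch below shows one common way to do this, assumed here for illustration only: each policy proposes an action, a Q-function scores the candidates, and one is sampled with probability proportional to the softmax of its score. The names `select_action`, `softmax_weights`, and `temperature` are hypothetical and are not taken from the paper or its repository.

```python
import math
import random


def softmax_weights(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(state, offline_policy, online_policy, q_function,
                  temperature=1.0, rng=random):
    """Sample one of the two proposed actions in proportion to softmax(Q).

    Assumption: this selection rule is an illustrative stand-in, not the
    rule used by MAERL.
    """
    candidates = [offline_policy(state), online_policy(state)]
    scores = [q_function(state, action) for action in candidates]
    probs = softmax_weights(scores, temperature)
    index = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    return candidates[index], probs


if __name__ == "__main__":
    # Toy one-dimensional problem: the Q-function prefers actions near 1.0.
    action, probs = select_action(
        state=0.0,
        offline_policy=lambda s: 0.0,
        online_policy=lambda s: 1.0,
        q_function=lambda s, a: -(a - 1.0) ** 2,
        temperature=0.5,
    )
    print(action, probs)
```

A lower `temperature` makes the choice greedier toward the higher-valued action, while a higher one keeps both policies in play during exploration.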
Adversarial attacks pose a significant threat to real-world applications based on deep neural networks (DNNs), especially security-critical ones. Research has shown that adversarial examples (AEs) generated on a surrogate model can also succeed on a target model, a property known as transferability. Feature-level transfer-based attacks improve the transferability of AEs by disrupting intermediate features: they target an intermediate layer of the model and use feature importance metrics to identify the features to disrupt. However, current methods overfit the feature importance metrics to the surrogate model, so the metrics transfer poorly across models and deep features are insufficiently destroyed. This work demonstrates the trade-off between feature importance metrics and the generalization of feature corruption, and categorizes the feature-destruction causes of misclassification. It also proposes a generative framework named PTNAA to guide the destruction of deep features across models, thus improving the transferability of AEs. Specifically, the method introduces path methods into integrated gradients. It selects path functions using only a priori knowledge and approximates neuron attribution using nonuniform sampling. In addition, it measures neuron importance from the attribution results and performs feature-level attacks to remove inherent features of the image. Extensive experiments demonstrate the effectiveness of the proposed method. The code is available at https://github.com/lounwb/PTNAA.
Improving the transferability of adversarial examples with path tuning. Tianyu Li, Xiaoyu Li, Wuping Ke, Xuwei Tian, Desheng Zheng, Chao Lu. Applied Intelligence 54(23): 12194-12214. Published 2024-09-11. DOI: 10.1007/s10489-024-05820-4
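The PTNAA abstract mentions path methods for integrated gradients and nonuniform sampling without spelling out the computation. As a rough illustration, the sketch below approximates a path-integrated gradient, the integral of grad F(γ(t)) · γ'(t) over t in [0, 1], on a nonuniform grid of t values. Everything concrete here is an assumption made for the example: the straight-line path, the quadratic node spacing, the toy model F(x) = Σ x_i², and attribution to inputs rather than to intermediate neurons. It is not the path function or sampling scheme used by PTNAA.

```python
def path_attributions(grad_fn, path_fn, nodes):
    """Approximate path-integrated gradients on a (possibly nonuniform) grid.

    grad_fn(point) returns the gradient of the model output at `point`.
    path_fn(t) returns the point on the path at parameter t in [0, 1].
    nodes is an increasing list of t values starting at 0 and ending at 1.
    """
    dim = len(path_fn(nodes[0]))
    attributions = [0.0] * dim
    for t_lo, t_hi in zip(nodes, nodes[1:]):
        start, end = path_fn(t_lo), path_fn(t_hi)
        # Gradient evaluated at the midpoint of the segment (midpoint rule).
        grad = grad_fn(path_fn(0.5 * (t_lo + t_hi)))
        for i in range(dim):
            attributions[i] += grad[i] * (end[i] - start[i])
    return attributions


def straight_path(baseline, x):
    """Straight line from `baseline` to `x` (assumed path, for illustration)."""
    return lambda t: [b + t * (xi - b) for b, xi in zip(baseline, x)]


def quadratic_nodes(n):
    """Nonuniform grid on [0, 1] that is denser near t = 0 (assumed spacing)."""
    return [(k / n) ** 2 for k in range(n + 1)]


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5]
    toy_grad = lambda point: [2.0 * v for v in point]  # gradient of sum(v**2)
    attr = path_attributions(toy_grad, straight_path([0.0] * 3, x), quadratic_nodes(8))
    print(attr, sum(attr))
```

For this toy model each attribution should come out close to x_i², and the attributions should sum to roughly F(x) - F(baseline), which is the completeness property that makes integrated gradients usable as an importance score.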
Large-scale Pre-trained Language Models (PLMs) have become the backbones of text classification due to their exceptional performance. However, they treat input documents as independent and uniformly distributed, thereby disregarding potential relationships among the documents. This limitation can lead to inaccurate classifications. To address this issue, recent work explores the integration of Graph Neural Networks (GNNs) with PLMs, as GNNs can effectively model document relationships. Yet, combining graph-based methods with PLMs is challenging due to the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and allows them to approximate each other through mutual learning of probability distributions. This probability-distribution-guided approach simplifies the adaptation of graph-based models to PLMs and enables seamless end-to-end training of the entire architecture. Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate an Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in areas such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships are crucial. Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.
GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning. Tao Yu, Rui Song, Sandro Pinto, Tiago Gomes, Adriano Tavares, Hao Xu. Applied Intelligence 54(23): 12215-12229. Published 2024-09-11. DOI: 10.1007/s10489-024-05831-1
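The GEML abstract says the graph channel and the language-model channel approximate each other through mutual learning of probability distributions. A minimal sketch of that idea, assuming the standard deep-mutual-learning form (cross-entropy on the label plus a KL term that pulls each channel toward the other's prediction), is given below. The function names and the `kl_weight` parameter are hypothetical, and the sketch leaves out the paper's Asymmetrical Learning and Uncertainty Weighting components, whose exact form the abstract does not specify.

```python
import math


def to_probs(logits):
    """Numerically stable softmax turning logits into class probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def mutual_learning_losses(text_logits, graph_logits, label, kl_weight=1.0):
    """Return (text_loss, graph_loss) for one document.

    Each channel is trained on the label and is also pulled toward the other
    channel's predicted distribution (assumed symmetric form, for illustration).
    """
    p_text = to_probs(text_logits)
    p_graph = to_probs(graph_logits)
    text_loss = cross_entropy(p_text, label) + kl_weight * kl_divergence(p_graph, p_text)
    graph_loss = cross_entropy(p_graph, label) + kl_weight * kl_divergence(p_text, p_graph)
    return text_loss, graph_loss


if __name__ == "__main__":
    print(mutual_learning_losses([2.0, 0.5, -1.0], [0.0, 1.5, 0.2], label=0))
```

Because the two channels exchange only probability vectors, neither needs to know the other's internal structure, which is the property the abstract credits for sidestepping the graph-versus-sequence incompatibility.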