Pub Date: 2024-10-22 | DOI: 10.1016/j.ins.2024.121569
Qinghua Gu, Liyao Rong, Dan Wang, Di Liu
In real-world large-scale sparse multi-objective problems, the decision variables are high-dimensional and most Pareto optimal solutions are sparse. Keeping an algorithm balanced on such problems is difficult, so they are challenging to handle in general. To address this, an Enhanced Competitive Swarm Optimizer with Strongly Robust Sparse Operator (SR-ECSO) is proposed. First, strongly robust sparse functions are applied to the high-dimensional decision variables to drive the particles in the population toward better sparsity in the decision space. Second, an adaptive random perturbation operator is introduced to maintain the diversity of sparse solutions and enhance the convergence balance of the algorithm. Finally, the states of the particles are updated using a swarm optimizer to improve the competitiveness of the population. To verify the proposed algorithm, eight large-scale sparse benchmark problems were tested with three settings of 100, 500, and 1000 decision variables. Experimental results show that the algorithm is promising for solving large-scale sparse optimization problems.
Title: An enhanced competitive swarm optimizer with strongly robust sparse operator for large-scale sparse multi-objective optimization problem. Information Sciences, Volume 690, Article 121569.
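For readers unfamiliar with the base optimizer, the sketch below shows one generation of the standard single-objective competitive swarm optimizer, in which particles compete in random pairs and only the loser is updated. It is not the SR-ECSO algorithm: the sparse operator, the adaptive perturbation, and the multi-objective selection described in the abstract are omitted, and the function name `cso_generation` and the parameter `phi` are assumptions made for illustration.

```python
import random


def cso_generation(positions, velocities, fitness, phi=0.1, rng=None):
    """One generation of a basic competitive swarm optimizer (minimization).

    positions and velocities are lists of equal-length float lists and are
    updated in place. phi weights the pull toward the swarm mean.
    """
    rng = rng or random.Random()
    n, dim = len(positions), len(positions[0])
    # Swarm mean position, computed once before any particle moves.
    mean = [sum(p[d] for p in positions) / n for d in range(dim)]
    order = list(range(n))
    rng.shuffle(order)
    # Pair particles at random; with an odd swarm size one particle sits out.
    for a, b in zip(order[0::2], order[1::2]):
        if fitness(positions[a]) <= fitness(positions[b]):
            winner, loser = a, b
        else:
            winner, loser = b, a
        # Only the loser moves: it learns from the winner and the swarm mean.
        for d in range(dim):
            r1, r2, r3 = rng.random(), rng.random(), rng.random()
            velocities[loser][d] = (
                r1 * velocities[loser][d]
                + r2 * (positions[winner][d] - positions[loser][d])
                + phi * r3 * (mean[d] - positions[loser][d])
            )
            positions[loser][d] += velocities[loser][d]
    return positions, velocities
```

Because winners pass to the next generation unchanged, the best fitness in the swarm never gets worse from one generation to the next.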
Pub Date: 2024-10-22 | DOI: 10.1016/j.ins.2024.121571
Guliu Liu, Lei Li, Guanfeng Liu, Xindong Wu
Multi-Constrained Graph Pattern Matching (MC-GPM) aims to match a pattern graph with multiple attribute constraints on its nodes and edges, and has garnered significant interest in various fields, including social-based e-commerce and trust-based group discovery. However, existing MC-GPM methods do not consider situations where the number of each node in the pattern graph must be fixed, for example when finding a group of experts with the number of experts and their relations specified. In this paper, a Multi-Constrained Strong Simulation with the Fixed Number of Nodes (MCSS-FNN) matching model is proposed, and a Trust-oriented Optimal Multi-constrained Path (TOMP) matching algorithm is designed to solve it. Additionally, two heuristic optimization strategies are designed, one for combinatorial testing and the other for edge matching, to improve the efficiency of the TOMP algorithm. Experiments on four real social network datasets demonstrate the effectiveness and efficiency of the proposed algorithm and optimization strategies.
Title: Size-fixed group discovery via multi-constrained graph pattern matching. Information Sciences, Volume 690, Article 121571.
Pub Date: 2024-10-22 | DOI: 10.1016/j.ins.2024.121572
Jianfeng Qiu, Ning Wang, Shengda Shu, Kaixuan Li, Juan Xie, Chunhui Chen, Fan Cheng
Multi-objective evolutionary algorithms have shown their competitiveness in solving ROC convex hull (ROCCH) maximization. However, due to "the curse of dimensionality", few of them focus on high-dimensional ROCCH maximization. Therefore, in this paper, a feedback matrix (FM)-based evolutionary multitasking algorithm, termed FM-EMTA, is proposed. In FM-EMTA, to tackle "the curse of dimensionality", a feature-importance-based low-dimensional task construction strategy is designed to transform the high-dimensional ROCCH maximization task into several low-dimensional tasks. Each low-dimensional task then evolves with its own population. To ensure that each low-dimensional task achieves a better ROCCH, an FM-based evolutionary multitasking operator is proposed. Specifically, for each low-dimensional task i, the element FM(i,j) of the feedback matrix is defined to measure the degree to which low-dimensional task j can assist task i. Based on this, an FM-based assisted task selection operator and an FM-based knowledge transfer operator are developed to form the evolutionary multitasking operator, with which useful knowledge is transferred among the low-dimensional tasks. After the evolution, the best ROCCHs obtained by the low-dimensional tasks are combined to produce the final ROCCH on the original high-dimensional task. Experiments on twelve high-dimensional datasets with different characteristics demonstrate the superiority of the proposed FM-EMTA over state-of-the-art methods in terms of the area under the ROCCH, the hypervolume indicator, and the running time.
Title: A feedback matrix based evolutionary multitasking algorithm for high-dimensional ROC convex hull maximization. Information Sciences, Volume 690, Article 121572.
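As background on the quantity being maximized, the sketch below computes the ROC convex hull of a set of classifier operating points and the area under it. It only illustrates the ROCCH itself, not FM-EMTA or its multitasking operators; the function names `rocch` and `area_under_hull` are assumptions made for illustration.

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rocch(points):
    """Upper convex hull of (fpr, tpr) points, anchored at (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the previous vertex while it lies on or below the line from
        # the vertex before it to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull):
    """Trapezoidal area under a hull returned by rocch."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )
```

Points that fall below the hull correspond to classifiers that are dominated by a mixture of the classifiers on the hull, which is why only the hull vertices matter for the area.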
Pub Date: 2024-10-21 | DOI: 10.1016/j.ins.2024.121575
Tong Wu, Gui-Fu Lu
The advantage of multi-view clustering lies in its ability to leverage the diversity and consistency among multiple views to better capture the intrinsic structure of the data. However, existing multi-view methods treat diversity and consistency as opposing attributes and overlook their inherent connections. Moreover, the complete information across multiple views is not fully utilized. To address these issues, this paper proposes a multi-view clustering method based on tensorized diversity and consistency with Laplacian manifold (TDCLM). Specifically, starting from the self-expressive property of the original data, we obtain the diversity graphs and the consistency graph and, for the first time, combine Laplacian manifold constraints to strengthen the relationship between diversity and consistency while jointly optimizing the diversity graphs and the consistency graph. Additionally, we combine the diversity graphs and the consistency graph into a tensor and impose a tensor nuclear norm constraint on it. By doing so, we not only obtain the complete information between multiple views but also enable the diversity graphs and the consistency graph to learn from and enhance each other. Finally, by adopting the augmented Lagrange multiplier method, we integrate the two steps into a comprehensive framework. TDCLM improves performance by up to 25.85%, and experimental results across diverse datasets show that it surpasses state-of-the-art algorithms. These results validate the importance of obtaining complete information from multiple views and effectively leveraging the diversity and consistency inherent in this complete information. The code is publicly available at https://github.com/TongWuahpu/TDCLM.
Title: Tensorized diversity and consistency with Laplacian manifold for multi-view clustering. Information Sciences, Volume 690, Article 121575.
Pub Date: 2024-10-18 | DOI: 10.1016/j.ins.2024.121555
Jian Feng, Weizhao Song, Lijuan Xu, Juan Zhang
This research explores the distributed time-varying formation (TVF) control problem of multiagent systems (MASs) with multiple leaders. In contrast to existing results, this paper considers the following situations: 1) the multifollower group is heterogeneous, as is the multileader group; 2) the heterogeneous system matrices of the multileaders are not known to all followers; 3) some strict constraint conditions, such as the well-informed follower assumption and the virtual leader condition, are removed. This paper presents an event-triggered (ET) matrix observer, an adaptive ET state compensator, and an output-feedback TVF controller, which together constitute a novel completely distributed ET control protocol. Considering the limited communication bandwidth, the ET matrix observer and compensator are designed in a communication-bandwidth-saving manner to estimate the integrated system matrix and the integrated state information of all leader agents, respectively. The output-feedback formation controller is built to make the followers keep the predetermined team formations and follow the reference trace, where the trace is a convex combination of all leaders' outputs. A stability analysis and a simulation experiment are carried out to demonstrate the validity of the proposed control strategy.
Title: Formation control of multiagent systems with multileaders through completely distributed intermittent communication strategies. Information Sciences, Volume 690, Article 121555.
Pub Date: 2024-10-18 | DOI: 10.1016/j.ins.2024.121568
Hailong Cui, Guanglei Zhao, Shuang Liu, Zhijie Li
This paper studies the event-triggered bipartite output consensus problem of heterogeneous multiagent systems under denial-of-service (DoS) attacks. A novel dynamic event-triggered scheme (DETS) is proposed which, by introducing an extra dynamic function with time-varying coefficients into the triggering conditions, guarantees strictly positive minimum inter-event intervals whether or not DoS attacks occur. An event-based resilient compensator with adaptive coupling coefficients is then designed to estimate the leader's state, and a hybrid model with jump dynamics is constructed that incorporates the estimation error, the DETS, and the DoS attacks, and is useful for convergence analysis. Then, a fully distributed observer-based control protocol is designed to regulate the bipartite output consensus. The main advantages of the proposed method are: 1) global information is not needed to implement the event-based control protocol; 2) strictly positive inter-event intervals are guaranteed even under DoS attacks. Finally, a numerical example is presented to verify the main results.
Title: Event-triggered bipartite consensus to heterogeneous multiagent systems under DoS attacks: A fully distributed method. Information Sciences, Volume 690, Article 121568.
Pub Date: 2024-10-18 | DOI: 10.1016/j.ins.2024.121567
Shihan Liu, Lijun Liu, Zhen Yu
In this paper, a novel safe robust multi-agent reinforcement learning method integrated with decentralized robust neural control barrier functions (CBFs) and a safety attention mechanism (SAM) is proposed for the safety-critical multi-agent system (MAS). Safety is fundamental in the safety-critical MAS but can be affected by factors such as modeling errors, external unknown disturbances, and time-varying observable agents. Several appropriate measures are implemented to address these issues. First, modeling errors and external disturbances are regarded as an adversary for each agent. The agent learns a policy that is robust to disturbances created by the adversary. Accordingly, decentralized robust neural CBFs are introduced to maintain the safety of the MAS, particularly when the general handcrafted CBFs are difficult to construct. The SAM, in combination with the robust neural CBFs, provides a control policy with the capacity to handle time-varying observable agents and increases its attention to dangerous events. The online fine-tuning procedure further enhances the safety. Finally, experiments demonstrate the safety and effectiveness of the proposed method.
Title: Safe robust multi-agent reinforcement learning with neural control barrier functions and safety attention mechanism. Information Sciences, Volume 690, Article 121567.
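To make the role of a control barrier function concrete, the sketch below applies a handcrafted CBF as a safety filter to a one-dimensional single integrator dx/dt = u with the safe set h(x) = x_max - x >= 0. It is a toy illustration only and assumes a known model with no disturbance; the neural, robust, and decentralized CBFs and the attention mechanism from the abstract are not modeled, and the names `cbf_filter` and `simulate` are assumptions made for illustration.

```python
def cbf_filter(u_nominal, x, x_max, alpha=1.0):
    """Return the control closest to u_nominal that satisfies the CBF condition.

    For dx/dt = u and h(x) = x_max - x, the condition dh/dt + alpha * h >= 0
    reduces to u <= alpha * (x_max - x), so the filter is a simple clamp.
    """
    return min(u_nominal, alpha * (x_max - x))


def simulate(x0, u_nominal, x_max, steps=500, dt=0.01, alpha=2.0):
    """Forward-Euler rollout of the filtered system; returns the visited states."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_filter(u_nominal, x, x_max, alpha)
        trajectory.append(x)
    return trajectory
```

With `dt * alpha <= 1`, a state that starts inside the safe set stays inside it under this Euler discretization, even when the nominal command keeps pushing toward the boundary.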
Pub Date: 2024-10-18 | DOI: 10.1016/j.ins.2024.121548
Laouni Djafri
In the field of data mining and machine learning, dealing with imbalanced datasets is one of the most complex problems. The class imbalance issue significantly affects the classification of minority classes when using common classification algorithms. These algorithms often prioritize improving the performance of the majority class at the expense of the minority class, leading to misclassifying negative instances as positive ones. To address this problem, the Synthetic Minority Over-sampling Technique (SMOTE) has gained popularity as a way to rebalance imbalanced data for classification. In this paper, we propose two algorithms to further enhance the performance of imbalanced classification. The first algorithm is PRO-SMOTE, an improvement over SMOTE. PRO-SMOTE relies on conditional probabilities to effectively rebalance imbalanced classes and improve the predictive performance metrics satisfactorily and reliably. By considering conditional probabilities, PRO-SMOTE can reduce the majority classes and optimally increase the minority class. The second algorithm, PRO-SMOTEBoost, builds on PRO-SMOTE to overcome classification anomalies and problems encountered by machine learning algorithms during classification, especially the weak ones. PRO-SMOTEBoost aims to maximize predictive precision to the greatest extent possible by combining the strengths of PRO-SMOTE with boosting techniques. Evaluating these algorithms with traditional machine learning algorithms such as Random Forests, C4.5, Naive Bayes, and Support Vector Machines has demonstrated excellent classification results. The performance metrics achieved by the proposed algorithm, encompassing F1-score, G-means, Precision, Accuracy, Recall, AUC-ROC, and Precision-Recall curves, range from over 90% to a flawless score of 100%. Compared to using these traditional algorithms individually, PRO-SMOTEBoost has shown a significant improvement of 10% to 40% in performance metrics. Overall, the proposed algorithms, PRO-SMOTE and PRO-SMOTEBoost, offer effective solutions to the challenges posed by imbalanced datasets. They provide improved predictive metrics and demonstrate their superiority when compared to traditional and even modern machine learning algorithms.
Title: PRO-SMOTEBoost: An adaptive SMOTEBoost probabilistic algorithm for rebalancing and improving imbalanced data classification. Information Sciences, Volume 690, Article 121548.
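For context on the baseline that PRO-SMOTE improves on, the sketch below shows the interpolation step of plain SMOTE: each synthetic sample is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours. It does not implement PRO-SMOTE's conditional probabilities or the boosting stage; the function name `smote` and its parameters are assumptions made for illustration.

```python
import random


def smote(minority, n_synthetic, k=3, rng=None):
    """Generate n_synthetic samples by plain SMOTE interpolation.

    minority is a list of equal-length float lists with at least two samples.
    """
    rng = rng or random.Random()

    def dist2(a, b):
        # Squared Euclidean distance, sufficient for ranking neighbours.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Since every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class.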
Pub Date : 2024-10-18DOI: 10.1016/j.ins.2024.121556
Yujia Liu , Yuwei Song , Changyong Liang , Mingshuo Cao , Jian Wu
The minimum cost consensus model (MCCM) provides an effective method for reaching group consensus in group decision-making problems. Conventional MCCM and its advanced variants focus on the different behaviors and psychologies of decision-makers, but they ignore the heterogeneity of decision-makers that gives rise to them. As a result, they need to assume the compromise limits and unit adjustment costs of decision-makers, which may be difficult to obtain in practice. To resolve this problem, this study proposes a novel data-driven minimum cost consensus model with different compromise limits and unit costs, based on online Big Five personality traits prediction. First, this study uses a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) model to estimate each decision-maker's probability of agreeableness from their Weibo online reviews. Second, a novel minimum cost consensus model considering the decision-maker's personality traits (MCCM-P) is established. To this end, the unit adjustment costs and the personalized compromise limits of decision-makers, and their interrelations, are defined based on the personality traits prediction. Finally, the MCCM-P is applied in a real group decision-making case study on activity selection for a university student club. The results and comparative analysis show that the proposed MCCM-P model can obtain lower consensus-reaching costs than the traditional method.
{"title":"A data-driven minimum cost consensus model for group decision making with personality traits prediction","authors":"Yujia Liu , Yuwei Song , Changyong Liang , Mingshuo Cao , Jian Wu","doi":"10.1016/j.ins.2024.121556","DOIUrl":"10.1016/j.ins.2024.121556","url":null,"abstract":"<div><div>The minimum cost consensus model (MCCM) proposes an effective method for reaching group consensus in group decision-making problems. Conventional MCCM and its advanced models focus on the different behaviors and psychologies of decision-makers, but, it ignores the heterogeneity of decision-makers that activated them. Therefore, they need to assume the compromise limits and unit adjustment costs of decision-makers, which may be difficult to achieve in practice. To resolve this problem, this study will propose a novel data-driven minimum cost consensus model of different compromise limits and unit costs based on online Big Five personality traits prediction. First, this study uses the Convolutional Neural Network (CNN) and Bi-directional Long-Short Term Memory model (BiLSTM) to obtain the decision-maker's probability of agreeableness based on their Weibo online reviews. Second, a novel minimum cost consensus model considering the decision-maker's personality traits (MCCM-P) is established. To do that, the unit adjustment cost and the personalized compromise limits of decision-makers and their interrelations are defined based on the personality traits prediction. Finally, the MCCM-P is applied in a real group decision-making case study of a university student club activity selection. 
The result and comparative analysis show that the proposed MCCM model can obtain lower consensus reaching costs than the traditional method.</div></div>","PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"690 ","pages":"Article 121556"},"PeriodicalIF":8.1,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting anomalies in complex data is crucial for knowledge discovery and data mining across a wide range of applications. While density-based methods are effective for handling varying data densities and diverse distributions, they often struggle with accurately estimating densities in heterogeneous, uncertain data and capturing interdependencies among features in high-dimensional spaces. This paper proposes a fuzzy granule density-based anomaly detection algorithm (GDAD) for heterogeneous data. Specifically, GDAD first partitions high-dimensional attributes into subspaces based on their interdependencies and employs fuzzy information granules to represent data. The core of the method is the definition of fuzzy granule density, which leverages local neighborhood information alongside global density patterns and effectively characterizes anomalies in data. Each object is then assigned a fuzzy granule density-based anomaly factor, reflecting its likelihood of being anomalous. Through extensive experimentation on various real-world datasets, GDAD has demonstrated superior performance, matching or surpassing existing state-of-the-art methods. GDAD's integration of granular computing with density estimation provides a practical framework for anomaly detection in high-dimensional heterogeneous data.
{"title":"Integrating granular computing with density estimation for anomaly detection in high-dimensional heterogeneous data","authors":"Baiyang Chen , Zhong Yuan , Dezhong Peng , Xiaoliang Chen , Hongmei Chen , Yingke Chen","doi":"10.1016/j.ins.2024.121566","DOIUrl":"10.1016/j.ins.2024.121566","url":null,"abstract":"<div><div>Detecting anomalies in complex data is crucial for knowledge discovery and data mining across a wide range of applications. While density-based methods are effective for handling varying data densities and diverse distributions, they often struggle with accurately estimating densities in heterogeneous, uncertain data and capturing interdependencies among features in high-dimensional spaces. This paper proposes a fuzzy granule density-based anomaly detection algorithm (GDAD) for heterogeneous data. Specifically, GDAD first partitions high-dimensional attributes into subspaces based on their interdependencies and employs fuzzy information granules to represent data. The core of the method is the definition of fuzzy granule density, which leverages local neighborhood information alongside global density patterns and effectively characterizes anomalies in data. Each object is then assigned a fuzzy granule density-based anomaly factor, reflecting its likelihood of being anomalous. Through extensive experimentation on various real-world datasets, GDAD has demonstrated superior performance, matching or surpassing existing state-of-the-art methods. 
GDAD's integration of granular computing with density estimation provides a practical framework for anomaly detection in high-dimensional heterogeneous data.</div></div>","PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"690 ","pages":"Article 121566"},"PeriodicalIF":8.1,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}