Pub Date: 2025-02-11 DOI: 10.1016/j.asoc.2025.112839
Lihong Peng, Longlong Liu, Liangliang Huang, Zongzheng Bai, Min Chen, Xing Chen
Cell–cell communication (CCC) is essential to tumor growth, metastasis, and resistance to therapy. The rapid development of single-cell RNA sequencing (scRNA-seq) technologies makes it possible to decipher cellular signal transduction more accurately. CCC is usually mediated by interacting ligand–receptor pairs, so a comprehensive ligand–receptor interaction (LRI) database is crucial for decoding CCC. Moreover, CCC inference methods based on ordinary ligand–receptor scoring metrics need further exploration. To address these two problems, we propose a novel computational framework called CellGDnG to analyze CCC within the tumor microenvironment. CellGDnG first uses a heterogeneous ensemble deep learning model to identify potential LRIs. Next, it adopts a weighted geometric mean-based strategy to infer CCC. A series of in-depth comparative experiments against other popular tools demonstrated its ability to decode CCC precisely and effectively. Furthermore, CellGDnG offers heatmap, circle plot, and Sankey diagram views to visualize CCC. Notably, the predicted top 3 LRIs mediating breast cancer CCC could be potential therapeutically tractable drug targets. Available as an open-source tool, CellGDnG provides valuable clues for unveiling cell–cell signal transduction and developing new targeted drugs.
Title: Predicting cell–cell communication by combining heterogeneous ensemble deep learning and weighted geometric mean
Journal: Applied Soft Computing, Volume 172, Article 112839
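The weighted geometric mean named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which quantities CellGDnG combines or how it weights them, so the ligand and receptor expression values and the weights below are hypothetical placeholders; only the formula exp(sum(w_i * ln x_i) / sum(w_i)) is standard.

```python
import math


def weighted_geometric_mean(values, weights):
    """Return exp(sum(w_i * ln(x_i)) / sum(w_i)) for non-negative values."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(v < 0 for v in values) or any(w <= 0 for w in weights):
        raise ValueError("values must be non-negative and weights positive")
    # A single zero factor drives the whole product to zero.
    if any(v == 0 for v in values):
        return 0.0
    total_weight = sum(weights)
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


# Hypothetical example: mean ligand expression in a sender cell type and
# mean receptor expression in a receiver cell type (illustrative numbers).
ligand_expr = 4.0
receptor_expr = 9.0

equal_score = weighted_geometric_mean([ligand_expr, receptor_expr], [1.0, 1.0])
ligand_heavy_score = weighted_geometric_mean([ligand_expr, receptor_expr], [3.0, 1.0])
silent_score = weighted_geometric_mean([0.0, receptor_expr], [1.0, 1.0])
```

Unlike an arithmetic mean, this score is zero whenever either partner is not expressed, which is one common reason for preferring a geometric mean when scoring ligand–receptor pairs.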
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112845
Mauricio Morais Almeida, João Dallyson Sousa Almeida, Darlan Bruno Pontes Quintanilha, Geraldo Braz Júnior, Aristófanes Correa Silva
Time series are regularly collected data that describe the average evolution of an event over time, making them increasingly relevant in areas such as business, natural sciences and medicine. A major challenge related to time series is data loss, and several approaches have been developed to recover missing values in univariate time series (UTS). This work aims to improve the imputation of missing data in univariate and heterogeneous time series. To this end, we built a diverse database covering different time series domains and selected a set of data imputation techniques. The results show that imputation in time series is challenging, especially because of the variability of the series, the position of the missing data and the number of samples passed to each technique. The HybridLSTM network developed in this study proved effective in recommending the most suitable imputation techniques for each series, resulting in a lower average error than using a single technique or recent techniques such as Pix2Pix and Moment. In addition, adopting a hybrid loss function that considers multi-class and multi-label tasks contributed to optimal or near-optimal performance, even in cases where the ideal was not achieved. These advances were possible thanks to the simple but efficient construction of metadata and the innovative approach of locally combining several imputation techniques within the same series. We observed that meta-learning has great potential in real contexts where the ideal technique is not known in advance and the data values have not been pre-treated. Moreover, because our experiments closely resembled this context and the model performed very close to the ideal, the results validate the applicability of the adaptive meta-learning approach to optimizing the imputation of missing data in real contexts.
Title: A meta-learning based neural network and LSTM for univariate time series missing data imputation
Journal: Applied Soft Computing, Volume 172, Article 112845
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112832
M. Lupión, N.C. Cruz, E.M. Ortigosa, P.M. Ortigosa
The design of Artificial Neural Networks (ANNs) is critical for their performance. The research field called Neural Architecture Search (NAS) investigates automated design strategies. This work proposes a novel NAS stack that stands out in three facets. First, the representation scheme encodes problem-specific ANNs as plain vectors of numbers without needing auxiliary conversion models. Second, it is a pioneer in relying on the teaching–learning-based optimization (TLBO) meta-heuristic. This optimizer supports large-scale problems and only expects two parameters, in contrast with other meta-heuristics used for NAS. Third, the stack includes a new evaluation predictor that avoids evaluating non-promising architectures. It combines several machine learning methods that train as the optimizer evaluates solutions, which avoids preparing this component in advance and makes it self-adaptive. The proposal has been tested by using it to build a CIFAR-10 classifier while forcing the architecture to have fewer than 150,000 parameters, assuming that the resulting network must be deployed in a resource-constrained IoT device. The designs found with and without the predictor achieve validation accuracies of 78.68% and 80.65%, respectively. Both outperform a larger model from the recent literature. The predictor slightly constrains the evolution of solutions, but it approximately halves the computational effort. After extending the test to the CIFAR-100 dataset, the proposal achieves a validation accuracy of 65.43% with 478,006 parameters in its fastest configuration, competing with current results in the literature.
Title: A holistic approach for resource-constrained neural network architecture search
Journal: Applied Soft Computing, Volume 172, Article 112832
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112844
Siyang Li, Peng Zhao, Hongjun Wang, Huan Wang, Tianrui Li
Clustering ensemble is an important method in machine learning and data mining for achieving robust and consistent results by integrating multiple base clustering results. However, existing clustering ensemble methods often overlook self-supervised information in the data, treating data points assigned to the same cluster as equivalent regardless of their relative distances from the cluster center, which may limit further performance improvement. To address this issue, we propose the Neighbor Self-embedding Graph Model for Clustering Ensemble (NSGMCE), which leverages self-supervised embeddings derived from diverse base clustering methods to extract structural information intrinsic to the data while preserving the characteristics of the base clustering results. Specifically, unlike traditional methods that rely directly on pseudo-labels from base clustering, NSGMCE treats self-supervised embeddings as new feature representations, retaining the advantages of ensemble learning while mitigating the impact of erroneous pseudo-labels. Subsequently, these self-supervised embeddings are used to construct a neighbor self-embedding graph, which is optimized and pruned during the alternating minimization inference process to obtain the final consensus result. The objective function of NSGMCE is formulated as a convex optimization problem; this smooth and continuous objective ensures convergence and allows the global optimal solution to be obtained efficiently, which we have proven theoretically. Extensive experiments were conducted comparing NSGMCE with eight state-of-the-art clustering ensemble methods across 12 datasets, demonstrating its superior performance. Specifically, NSGMCE achieved an 11.8% improvement in average accuracy over the second-best method and ranked first in terms of average Purity, Rand Index, and Mirkin metric. Further analysis confirmed that NSGMCE effectively leverages self-supervised embeddings to enhance robustness and stability.
Title: Neighbor self-embedding graph model for clustering ensemble
Journal: Applied Soft Computing, Volume 171, Article 112844
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112846
Chuan-Yun Sang, Szu-Hao Huang, Chiao-Ting Chen, Heng-Ta Chang
Reinforcement learning (RL) has been widely used to make continuous trading decisions in portfolio management. However, traditional quantitative trading methods often generalize poorly under certain market conditions, whereas the output of prediction-based approaches cannot be easily translated into actionable insights for trading. Market volatility, noisy signals, and unrealistic simulation environments exacerbate these challenges. To address these limitations, we developed a novel framework that combines multi-task self-supervised learning (MTSSL) and adaptive exploration (AdapExp) modules. The MTSSL module leverages auxiliary tasks to learn meaningful financial market representations from alternative data, whereas the AdapExp module enhances RL training efficiency by improving the fidelity of the simulation environment. Backtesting results in real financial markets indicate that the proposed framework achieved approximately 13% higher returns than state-of-the-art models. Furthermore, this framework can be used with various RL methods to considerably improve their performance.
Title: Portfolio management using online reinforcement learning with adaptive exploration and Multi-task self-supervised representation
Journal: Applied Soft Computing, Volume 172, Article 112846
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112864
Ruyue Chang, Xuejuan Liu, Wanjun Deng
This paper constructs a corporate default risk prediction model that takes ESG scores into account when the samples are unbalanced. Four indicators (the ESG composite score, environmental dimension score, social dimension score, and governance dimension score) are introduced to assess an enterprise's capacity for sustainable development as well as its level of greenness and low carbon emissions. These indicators improve and supplement the current corporate default prediction indicator system. At the algorithmic level, the Focal Loss function and a cost-sensitive decision threshold are used to improve the traditional Stacking model. The resulting CS-FL-Stacking model is built to address the problem of sample class imbalance. An empirical analysis with data from 3006 Chinese A-share listed companies during 2021–2023 leads to the following conclusions: (1) The inclusion of ESG indicators can somewhat enhance the model's prediction ability and reduce the misclassification loss. (2) The CS-FL-Stacking model generally outperforms the benchmark model in terms of accuracy and other indicators. It also considerably improves the capacity to identify minority samples, which can effectively address the problem of unbalanced sample classification. (3) In light of this analysis, recommendations are provided for improving the CS-FL-Stacking model and applying ESG indicators in corporate risk management.
Title: Prediction of corporate default risk considering ESG performance and unbalanced samples
Journal: Applied Soft Computing, Volume 171, Article 112864
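The two imbalance-handling ingredients named in the abstract above, Focal Loss and a cost-sensitive decision threshold, can be sketched in their textbook binary forms. This is not the authors' CS-FL-Stacking implementation: the label convention (1 = default, assumed to be the minority class), the values of gamma and alpha, and the misclassification costs are all illustrative assumptions.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * ln(p_t).

    p is the predicted probability of class 1; y is the true label (0 or 1).
    gamma and alpha are illustrative defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Threshold minimizing expected cost: predict 1 when
    p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)


def classify(p, cost_fp=1.0, cost_fn=5.0):
    """Label a firm as default (1) using hypothetical misclassification costs."""
    return 1 if p >= cost_sensitive_threshold(cost_fp, cost_fn) else 0


# A confidently correct prediction contributes far less loss than a hard one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)

# With a missed default costed at five times a false alarm, the threshold
# drops from 0.5 to 1/6, so more borderline firms are flagged.
flagged = classify(0.2)
cleared = classify(0.1)
```

Setting gamma to 0 and alpha to 0.5 reduces the loss to half the ordinary cross-entropy, which makes the focusing effect of gamma easy to check in isolation.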
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112853
Mauri Ferrandin, Ricardo Cerri
Ensembles are computational models that combine the strengths of multiple algorithms or models to enhance predictive accuracy, robustness, and generalization across various applications in machine learning and data analysis. They can mitigate the risk of overfitting and improve model stability, reducing the impact of individual algorithmic biases. These are valuable tools for achieving superior performance in complex and dynamic real-world scenarios. Despite constant advances in this research area, recent studies have shown that state-of-the-art ensembles for multi-label classification are still based on classical ensemble methods from 2016. This study proposes three new ensemble algorithms, called the ensemble of flat-to-hierarchical (EF2H) versions, developed using the F2H multi-label classification model. The F2H algorithm transforms the multi-label problem into a hierarchical multi-label problem to generate predictions. Experiments were conducted with 32 multi-label datasets, and the results were compared with those of the state-of-the-art algorithms in this field. The results demonstrate that the EF2H versions are highly competitive algorithms, outperforming the well-known ensemble of classifier chains (ECC) and achieving predictive performance equivalent to that of the random forest of decision trees with binary relevance (RFDTBR) and random forest of predictive clustering trees (RFPCT) algorithms.
Title: Ensemble multi-label classification using closed frequent labelsets and label taxonomies
Journal: Applied Soft Computing, Volume 171, Article 112853
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112838
Yangdi Shen, Zuowen Liao, Yichao Tian, Jin Tao, JinXuan Luo, Jiale Wang, Qiang Zhang
This paper introduces a novel Knowledge Assisted Differential Evolution Extreme Gradient Boost algorithm (KADE-XGBoost) for estimating mangrove aboveground biomass in the Maowei Sea, Beibu Gulf of China. The proposed algorithm combines differential evolution and extreme gradient boosting to address hyperparameter optimization and feature selection simultaneously. Additionally, a two-stage knowledge assisted strategy is proposed to retain key features for evolution and prevent the algorithm from converging to local optima. Experimental results on a dataset of 227 quadrats from field surveys demonstrate that KADE-XGBoost outperforms other machine learning and heuristic-based models, achieving the best results with an R² value of 0.8413 and an RMSE of 216.2867. KADE-XGBoost's prediction range for mangrove aboveground biomass is 4.4218–218.2612 Mg/ha, showcasing its potential as a reliable algorithm for estimating large-scale mangrove aboveground biomass.
Title: Knowledge Assisted Differential Evolution Extreme Gradient Boost algorithm for estimating mangrove aboveground biomass
Journal: Applied Soft Computing, Volume 172, Article 112838
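For readers unfamiliar with the optimizer in the abstract above, the following is a minimal sketch of classic differential evolution (DE/rand/1/bin). It does not reproduce KADE-XGBoost: the knowledge-assisted stages, the feature-selection encoding, and the XGBoost training loop are omitted, and the objective is a hypothetical stand-in for a validation error over two toy hyperparameters.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize objective over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def toy_validation_error(params):
    """Hypothetical surrogate whose minimum is at learning_rate=0.3, max_depth=6."""
    learning_rate, max_depth = params
    return (learning_rate - 0.3) ** 2 + (max_depth - 6.0) ** 2


search_bounds = [(0.01, 1.0), (1.0, 10.0)]  # illustrative ranges only
best_params, best_error = differential_evolution(toy_validation_error, search_bounds)
```

In a real hyperparameter search the objective would train and validate a model at each call, so population size and generation count would be chosen with that cost in mind.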
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112843
Pablo García Peris, Basil Mohammed Al-Hadithi
This work presents a multi-strategy fuzzy controller (MSFC) based on the Takagi–Sugeno (T–S) model, where the membership functions are based on polar coordinates. Adopting a polar coordinate system helps to reduce the number of rules significantly, especially when fuzzy variables exhibit extreme values. This approach also allows more rules to be assigned in certain directions (angles) than others, while the radius represents the distance to the reference. The MSFC uses three strategies: constant input (CI), where a predefined gain is set in the exterior rules; standard discrete state–space model linear–quadratic regulator (LQR), in the intermediate rules; and an incremental state–space model LQR (INC-LQR) in the central rule. A technique employing fuzzy fusion is suggested to facilitate smooth transitions among the various methods. The results show that the proposed MSFC in polar coordinates has a faster transient response and zero steady-state error compared to other fuzzy controllers, and it requires fewer rules compared to an MSFC based on Cartesian coordinates.
Title: Multi-strategy fuzzy controller based on polar coordinates
Journal: Applied Soft Computing, vol. 171, Article 112843 (impact factor 7.2)
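The fuzzy fusion of the three strategies described in the abstract above can be illustrated with a small sketch. Everything numeric here is an assumption made for illustration: the zone breakpoints `r_inner` and `r_outer`, the feedback gains, the constant `U_CONST`, and the integral term standing in for the incremental LQR are not taken from the paper. Only the radius is used for the memberships; the angle merely sets the direction of the constant input.

```python
import math

U_CONST = 2.0  # hypothetical constant-input magnitude for the exterior zone


def radial_memberships(r, r_inner=0.5, r_outer=2.0):
    """Memberships of the central, intermediate and exterior zones.

    Piecewise linear in the radius and always summing to 1, so the
    blended control changes continuously as the state moves between zones.
    """
    if r < r_inner:
        mid = r / r_inner
        return 1.0 - mid, mid, 0.0
    if r < r_outer:
        ext = (r - r_inner) / (r_outer - r_inner)
        return 0.0, 1.0 - ext, ext
    return 0.0, 0.0, 1.0


def fused_control(e1, e2, integral_error=0.0):
    """Blend three control laws by their radial membership grades."""
    r = math.hypot(e1, e2)      # distance to the reference
    theta = math.atan2(e2, e1)  # direction of the error in the state plane
    w_central, w_mid, w_ext = radial_memberships(r)
    u_lqr = -(1.2 * e1 + 0.4 * e2)        # stand-in state-feedback law
    u_inc = u_lqr - 0.3 * integral_error  # stand-in for the incremental LQR
    u_ci = -U_CONST * math.cos(theta)     # constant-magnitude input
    return w_central * u_inc + w_mid * u_lqr + w_ext * u_ci


for e1 in (3.0, 1.0, 0.2):
    print(e1, fused_control(e1, 0.0))
```

Because the weights sum to one, the output is a convex combination of the three laws, which is what makes the transition between strategies smooth rather than switched.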
Pub Date: 2025-02-10 DOI: 10.1016/j.asoc.2025.112828
Sangho Lee, Hoki Kim, Woojin Lee, Youngdoo Son
Deep learning has significantly impacted prognostic and health management, but its susceptibility to adversarial attacks raises security risks for fault diagnosis systems. Previous research on the adversarial robustness of these systems is limited by unrealistic assumptions about prior model knowledge, which is often unobtainable in the real world, and by a lack of integration of domain-specific knowledge, particularly frequency information crucial for identifying unique characteristics for machinery states. To address these limitations and enhance robustness assessments, we propose a novel adversarial attack method that exploits frequency distortion. Our approach corrupts both frequency components and waveforms of vibration signals from rotating machinery, enabling a more thorough evaluation of system vulnerability without requiring access to model information. Through extensive experiments on two bearing datasets, including a self-collected dataset, we demonstrate the effectiveness of the proposed method in generating malicious yet imperceptible examples that remarkably degrade model performance, even without access to model information. In realistic attack scenarios for fault diagnosis systems, our approach produces adversarial examples that mimic unique frequency components associated with the deceived machinery states, leading to average performance drops of approximately 13 and 19 percentage points higher than existing methods on the two datasets, respectively. These results reveal potential risks for deep learning models embedded in fault diagnosis systems, highlighting the need for enhanced robustness against adversarial attacks.
Title: Black-box adversarial examples via frequency distortion against fault diagnosis systems
Journal: Applied Soft Computing, vol. 171, Article 112828 (impact factor 7.2)
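To make concrete what perturbing a vibration signal in the frequency domain means, here is a minimal sketch. It is not the attack from the paper: it only scales a chosen band of DFT bins by a fixed gain, uses no diagnosis model and no query loop, and does not try to mimic the frequency signature of another machinery state. The function names, the band limits, and the gain are hypothetical, and a naive O(n²) DFT is used to keep the example free of third-party dependencies.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def distort_band(signal, k_lo, k_hi, gain):
    """Scale DFT bins k_lo..k_hi by `gain`, keeping the output real-valued."""
    n = len(signal)
    assert 0 <= k_lo <= k_hi <= n // 2
    spectrum = dft(signal)
    for k in range(k_lo, k_hi + 1):
        spectrum[k] *= gain
        if k != 0 and 2 * k != n:
            spectrum[n - k] *= gain  # mirrored negative-frequency bin
    return idft(spectrum)


n = 64
# Toy "vibration" signal: a main tone at bin 5 and a weaker tone at bin 12.
signal = [math.sin(2 * math.pi * 5 * t / n)
          + 0.5 * math.sin(2 * math.pi * 12 * t / n) for t in range(n)]
adversarial = distort_band(signal, 11, 13, 1.2)
print(max(abs(a - s) for a, s in zip(adversarial, signal)))
```

Scaling a bin together with its mirrored bin preserves the conjugate symmetry of the spectrum, so the distorted signal remains real and the time-domain change is confined to the selected band.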