Pub Date : 2025-03-27 DOI: 10.1109/TETCI.2025.3569762
Swarm Imitation Learning From Observations
Aya Hussein;Eleni Petraki;Hussein A. Abbass
Learning from observation (LfO) is a process where an agent learns a task by passively observing a more competent agent perform it. LfO differs from classical learning from demonstration (LfD) in that the former requires access to the demonstrator's states only, whereas the latter requires both the demonstrator's states and the corresponding actions. On the one hand, LfO avoids the sometimes costly or impractical burden of collecting the demonstrator's actions, and instead requires only the demonstrator's states, which are more easily captured through cameras or sensors. On the other hand, LfO is more challenging than classical LfD because of the lack of detailed guidance from action labels. Despite the success of LfO in single-agent tasks, the literature falls short of assessing its feasibility in swarm systems, where multiple agents act simultaneously to enact a system-level state change. We tackle this research gap by proposing Swarm-LfO, which extends single-agent LfO by leveraging the centralised training with decentralised execution framework to learn a useful agent-centric inverse dynamic model (AIDM). The AIDM enables the imitator swarm to predict agent-level actions that would lead to swarm state transitions similar to those exhibited by the demonstrator swarm. Pairs of states and the corresponding estimated actions are then used to learn to imitate the demonstrated behaviour in a supervised manner. Evaluation experiments are conducted on four tasks that require different levels of coordination between swarm members: flocking, sheltering, dispersion, and herding. The results show that the performance of Swarm-LfO is comparable to that of classical LfD methods requiring access to action information. Swarm-LfO is extensively evaluated and demonstrates continued success under various experimental conditions, including noise and different sizes of the demonstrator and imitator swarms. Our contribution paves the way for imitation learning in swarms with diverse platforms, where the demonstrator and imitator swarms operate on different action spaces.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4092-4105.
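A core step in the abstract above is the inverse dynamics idea: infer the action behind an observed state transition, then imitate from the resulting state-action pairs. The minimal Python/PyTorch sketch below illustrates that generic recipe under assumed toy dimensions and random placeholder rollouts; it is a plain single-agent inverse dynamics model plus behaviour cloning, not the paper's agent-centric AIDM or its centralised-training setup.

# Hypothetical dimensions and placeholder data; not the paper's AIDM or tasks.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2

class InverseDynamicsModel(nn.Module):
    """Predicts the action that caused the transition (s_t, s_{t+1})."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM),
        )
    def forward(self, s_t, s_next):
        return self.net(torch.cat([s_t, s_next], dim=-1))

# 1) Fit the model on the imitator's own (state, next_state, action) rollouts.
idm = InverseDynamicsModel()
opt = torch.optim.Adam(idm.parameters(), lr=1e-3)
s = torch.randn(512, STATE_DIM)            # placeholder rollout states
s_next = torch.randn(512, STATE_DIM)       # placeholder next states
a = torch.randn(512, ACTION_DIM)           # placeholder executed actions
for _ in range(200):
    loss = nn.functional.mse_loss(idm(s, s_next), a)
    opt.zero_grad()
    loss.backward()
    opt.step()

# 2) Label the demonstrator's state-only transitions with inferred actions,
#    then behaviour-clone a policy from (state, inferred action) pairs.
demo_s, demo_s_next = torch.randn(256, STATE_DIM), torch.randn(256, STATE_DIM)
with torch.no_grad():
    pseudo_actions = idm(demo_s, demo_s_next)
policy = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                       nn.Linear(64, ACTION_DIM))
p_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    bc_loss = nn.functional.mse_loss(policy(demo_s), pseudo_actions)
    p_opt.zero_grad()
    bc_loss.backward()
    p_opt.step()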
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3548334
IEEE Transactions on Emerging Topics in Computational Intelligence Information for Authors
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 2, p. C3.
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3548330
IEEE Transactions on Emerging Topics in Computational Intelligence Publication Information
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 2, p. C2.
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3548332
IEEE Computational Intelligence Society Information
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 2, p. C4.
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3569759
FLAME: Federated Learning With Masked Autoencoders and Mean-Prototypes Embedding for Sparsely Labeled Medical Images
Mayank Kumar Kundalwal;Deepak Mishra
Federated Learning (FL) has emerged as a promising paradigm for collaborative and privacy-preserving model training in medical imaging. However, FL faces major challenges such as data heterogeneity among hospitals or institutions and scarcity of labeled data, particularly in healthcare applications. To address these challenges, we propose FLAME (Federated Learning with masked Autoencoders and Mean-prototypes Embedding) for sparsely labeled medical images. FLAME implements an integrated learning framework in which a masked autoencoder (MAE) learns robust feature representations through reconstruction-based self-supervision, while a Prototypical Network head guides these representations to enhance class separation through mean-prototype embeddings. This learning mechanism enables the encoder to capture rich contextual features from unlabeled data while simultaneously learning discriminative boundaries among classes using limited labeled samples. Our experiments on diverse medical imaging tasks, including PathMNIST, Dermnet, a COVID-19 chest X-ray dataset, and Skin-FL, demonstrate FLAME's superior performance over existing FL techniques. The framework shows significant improvements in both classification accuracy and convergence speed, while maintaining privacy and reducing dependence on labeled data. Most importantly, the proposed integration of MAE and Prototypical Network opens new possibilities for domains that suffer from label scarcity and data heterogeneity, making it particularly valuable for applications such as medical diagnostics.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4080-4091.
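The mean-prototype embedding mentioned above follows the Prototypical Network recipe: each class is represented by the mean embedding of its few labeled samples, and queries are classified by distance to these prototypes. The sketch below shows only that head with a stand-in encoder and made-up feature sizes; the paper's masked-autoencoder branch and federated aggregation are not reproduced.

# Illustrative prototypical-network head; the encoder is a stand-in, not an MAE.
import torch
import torch.nn.functional as F

EMB_DIM, NUM_CLASSES = 32, 4
encoder = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, EMB_DIM))

def mean_prototypes(embeddings, labels, num_classes):
    """Class prototype = mean embedding of the labeled samples of that class."""
    return torch.stack([embeddings[labels == c].mean(dim=0)
                        for c in range(num_classes)])

def proto_logits(query_emb, prototypes):
    """Negative squared Euclidean distance to each prototype acts as a logit."""
    return -torch.cdist(query_emb, prototypes) ** 2

# Toy labeled episode: 5 support samples per class plus a small query batch.
x_support = torch.randn(20, 128)
y_support = torch.arange(NUM_CLASSES).repeat(5)
x_query = torch.randn(12, 128)
y_query = torch.randint(0, NUM_CLASSES, (12,))

protos = mean_prototypes(encoder(x_support), y_support, NUM_CLASSES)
loss = F.cross_entropy(proto_logits(encoder(x_query), protos), y_query)
loss.backward()   # gradients flow into the shared encoder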
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3549763
Spatio-Temporal Enhancement-Based Spiking Neural Network for Morphological Neuron Classification
Chunli Sun;Qinghai Guo;Luziwei Leng;Feng Wu;Feng Zhao
The morphology of neurons plays a crucial role in identifying their types and investigating the structure and function of the brain. While existing methods recognize neuron types through efficient morphology representations based on their tree-like structure, they leave room for improvement when analyzing neurons with complex and varied morphologies. In this paper, we introduce a shallow yet efficient multi-branch spatio-temporal enhancement-based Spiking Neural Network (SNN), consisting of three spiking VGG5 models, to fully delineate neuronal morphologies and precisely identify neuron types. Our method captures neuronal morphologies from the spatio-temporal domain and explores the relationships among different neuronal branches, thereby providing a comprehensive description of neurons with complex structures and significantly improving classification performance. Specifically, we first decompose the neuron tree with complex and varied morphologies into multiple subtrees to represent neuronal morphology fully, and then explicitly project these subtrees onto the temporal dimension. Next, we introduce the spiking VGG5 model to characterize neuronal morphology through spiking sequences and learn the relations among these subtrees across the spatio-temporal dimensions. Furthermore, we design a plug-and-play Spatio-Temporal Enhancement Module (STEM) for the spiking VGG5, enabling maximal activation of spiking activity and facilitating information transfer and representation learning. In this way, our SNN architecture can comprehensively learn neuronal morphology representations based on the tree-like structure and depict the relationships of subtrees, accurately describing the morphological features of neurons with complex arbors. Experimental results demonstrate that our method precisely depicts neuronal morphologies and achieves accuracies of 87.40% and 82.96% on two NeuroMorpho datasets, respectively, outperforming other approaches. Besides, our method displays significant generalizability and performs remarkably well on the JML and BIL datasets.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4273-4287.
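As a point of reference for the spiking machinery described above, the sketch below implements a generic leaky integrate-and-fire layer that turns a sequence of per-branch feature vectors, presented over discrete time steps, into spike trains. The sizes are invented and the code is ordinary SNN boilerplate, not the paper's spiking VGG5, subtree decomposition, or STEM module.

# Minimal leaky integrate-and-fire (LIF) layer; inference only, generic SNN code.
import torch

class LIFLayer(torch.nn.Module):
    def __init__(self, in_features, out_features, tau=2.0, v_th=1.0):
        super().__init__()
        self.fc = torch.nn.Linear(in_features, out_features)
        self.tau, self.v_th = tau, v_th

    def forward(self, x_seq):                       # x_seq: (time, batch, in_features)
        v = torch.zeros(x_seq.shape[1], self.fc.out_features)
        spikes = []
        for x_t in x_seq:                           # iterate over time steps
            v = v + (self.fc(x_t) - v) / self.tau   # leaky integration of input current
            s = (v >= self.v_th).float()            # fire when the threshold is crossed
            v = v * (1.0 - s)                       # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes)                  # spike train: (time, batch, out)

# Suppose each of T time steps carries features of one neuronal subtree/branch.
T, BATCH, FEAT = 4, 8, 16                           # made-up sizes
branch_features = torch.randn(T, BATCH, FEAT)
spike_train = LIFLayer(FEAT, 32)(branch_features)
print(spike_train.shape, spike_train.mean())        # shape and mean firing rate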
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3548727
Generalization-Enhanced Feature-Wise and Correlation Attentions for Cross-Database Facial Expression Recognition
Weicheng Xie;Tao Zhong;Fan Yang;Siyang Song;Zitong Yu;Linlin Shen
Cross-database facial expression recognition (CDFER) has attracted increasing attention for evaluating systems' generalization performance. Although the attention mechanism can capture the feature-wise importance or feature correlations of expression-sensitive regions, attention-based networks tend to overfit the source database due to possible over-dependence on the most salient features, without exploring feature characteristics while removing feature redundancy. To address this issue, this paper introduces a multi-kernel competitive convolution into feature-wise attention to obtain more salient regions and let each kernel compete with the others to enhance the expressive ability of features, thereby reducing attention overfitting to the source domain. For feature-correlation attention, we resort to a Monte Carlo-based dropout that not only reduces over-learning of the feature relationships, but also models the dropout probability distribution more specifically by taking the characteristics of feature maps into account. Experimental results show that our algorithm achieves much better generalization performance than state-of-the-art (SOTA) methods on six publicly available datasets, in the scenarios of single source domain, multiple source domains, and domain adaptation.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4258-4272.
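The correlation-attention branch above builds on Monte Carlo dropout, i.e., keeping dropout stochastic at test time and averaging several forward passes to obtain a prediction and an uncertainty estimate. The sketch below shows only that standard mechanism with a placeholder classifier; the paper's feature-map-aware dropout distribution and attention modules are not reproduced.

# Standard Monte Carlo dropout: dropout stays active at test time and
# predictions are averaged over several stochastic passes. Placeholder model.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(64, 128), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.3),
    torch.nn.Linear(128, 7),        # e.g., 7 basic expression classes
)

def mc_dropout_predict(model, x, n_samples=20):
    model.train()                   # keep dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)   # mean prediction + uncertainty

x = torch.randn(5, 64)              # five dummy feature vectors
mean_p, std_p = mc_dropout_predict(model, x)
print(mean_p.argmax(dim=-1), std_p.mean().item())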
Pub Date : 2025-03-26 DOI: 10.1109/TETCI.2025.3569765
In Vivo Computational Strategy for Tumor Targeting in Co-Associated Biological Landscapes
Shaolong Shi;Zhaoyang Jiang;Qiang Liu;Qingfu Zhang;Yifan Chen
Recently, a novel framework of in vivo computation has been proposed by modeling the tumor targeting problem as a natural computation problem. The tumor-triggered biological gradient field (BGF), which provides targeting information for the nanorobots, is viewed as the objective function to be optimized. Previous work focuses on the scenario of a single BGF, which is interpreted as a uni-objective optimization problem. However, in real-life scenarios, various BGFs will be induced by the emergence of a tumor lesion because of variations in different kinds of biological information around the lesion (e.g., blood velocity, pH, oxygen, glucose, lactate, and $\mathrm{H}^{+}$ ions). It is plausible to utilize as much BGF information as possible to target the tumor efficiently and robustly. Thus, in this article we propose a BGF selector, which consists of a neural network, “VisionaryNet”, a swarm intelligence algorithm, and a weak priority evolution strategy (WP-ES). Various artificial BGF landscapes are used to train the proposed VisionaryNet, which is employed to choose among the alternative BGFs, combined with several online estimated features, during each iterative step. To demonstrate the effectiveness of the proposed BGF selector, a random selection approach is used as the benchmark. Comprehensive in silico experiments are carried out that take into consideration the in vivo constraints of the nanobiosensing process. Furthermore, the correlation between the number of employed BGFs and the targeting result is investigated, since increasing the number of BGFs leads to excessive computation, which is adverse to the computational accuracy.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4133-4144.
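To make the "BGF as objective function" framing above concrete, the sketch below runs a plain particle swarm optimizer over one artificial 2-D gradient field whose peak stands in for a lesion site. The field, swarm parameters, and lesion location are all invented, and the learned VisionaryNet selector and WP-ES strategy are not modeled.

# Toy particle-swarm search over one artificial 2-D "gradient field".
import numpy as np

rng = np.random.default_rng(0)
tumor = np.array([3.0, -2.0])                       # hypothetical lesion site

def bgf(pos):                                       # artificial field, peak at the lesion
    return np.exp(-np.sum((pos - tumor) ** 2, axis=-1))

n, w, c1, c2 = 30, 0.7, 1.5, 1.5                    # swarm size, inertia, acceleration
pos = rng.uniform(-5, 5, size=(n, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), bgf(pos)
gbest = pbest[np.argmax(pbest_val)]

for _ in range(100):
    r1, r2 = rng.random((n, 1)), rng.random((n, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    val = bgf(pos)
    improved = val > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmax(pbest_val)]

print("estimated lesion site:", gbest.round(2))     # should approach [3, -2]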
Pub Date : 2025-03-25 DOI: 10.1109/TETCI.2025.3550515
Textual Graph Representation With Syntactic Weighting for Implicit Sentiment Analysis
Shunxiang Zhang;Jiawei Li;Shuyu Li;Wenjie Duan;Zhongliang Wei;Kuan-Ching Li
Implicit sentiment analysis seeks to identify and interpret the underlying sentiment within texts that lack explicit sentiment words, significantly enhancing the capabilities of opinion analysis. Current methods often overlook the impact of context-dependent sequential text when using graph neural networks, leading to inadequate semantic representations of the text. In this paper, we propose a textual graph representation method with syntactic weighting for implicit sentiment analysis. This method improves textual semantic association by modeling the graph structure of word-position relationships in the text. It integrates syntactic weighting with an attention mechanism and guides node interactions in the graph attention network to generate textual graph representations with enhanced semantic depth and richness. Word semantics are enriched by introducing external knowledge. The proposed model is compared with existing models on the public implicit sentiment dataset SMP2019-ECISA, the explicit sentiment dataset NLPCC2014-SC, and a self-built dataset containing both explicit and implicit sentiment. The experimental results show that the proposed method not only identifies implicit sentiment efficiently, but also achieves a degree of generalization and robustness in explicit sentiment recognition.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 6, pp. 4288-4299.
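The syntactic weighting above can be pictured as scaling each edge's attention score by a precomputed syntactic weight before normalisation. The sketch below shows one single-head graph-attention step over a toy five-word graph with random features and weights; it illustrates syntax-weighted attention only, not the paper's full model or its external-knowledge enrichment.

# One syntax-weighted graph-attention step over a toy word graph.
import torch
import torch.nn.functional as F

D = 16
nodes = torch.randn(5, D)                     # 5 word nodes with D-dim features
adj = torch.tensor([[1, 1, 0, 0, 1],          # toy adjacency from word positions
                    [1, 1, 1, 0, 0],          # and dependency edges
                    [0, 1, 1, 1, 0],
                    [0, 0, 1, 1, 1],
                    [1, 0, 0, 1, 1]], dtype=torch.float)
syn_w = torch.rand(5, 5) * adj                # made-up syntactic weight per edge

W = torch.nn.Linear(D, D, bias=False)
a = torch.nn.Linear(2 * D, 1, bias=False)

h = W(nodes)                                                 # projected features
pairs = torch.cat([h.unsqueeze(1).expand(-1, 5, -1),         # all (i, j) feature pairs
                   h.unsqueeze(0).expand(5, -1, -1)], dim=-1)
scores = F.leaky_relu(a(pairs).squeeze(-1)) * syn_w          # syntax-scaled scores
scores = scores.masked_fill(adj == 0, float("-inf"))         # attend to neighbours only
alpha = torch.softmax(scores, dim=-1)                        # normalised attention
out = alpha @ h                                              # aggregated node features
print(out.shape)                                             # torch.Size([5, 16])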
Pub Date : 2025-03-24 DOI: 10.1109/TETCI.2025.3547854
Learning Error Refinement in Stochastic Gradient Descent-Based Latent Factor Analysis via Diversified PID Controllers
Jinli Li;Ye Yuan;Xin Luo
In Big Data-based applications, high-dimensional and incomplete (HDI) data are frequently used to represent the complicated interactions among numerous nodes. A stochastic gradient descent (SGD)-based latent factor analysis (LFA) model can process such data efficiently. Unfortunately, a standard SGD algorithm trains a single latent factor relying only on the stochastic gradient related to the current learning error, leading to a slow convergence rate. To break through this bottleneck, this study establishes an SGD-based LFA model as the backbone and proposes six proportional-integral-derivative (PID)-incorporated LFA models with diversified PID controllers, built on the following two-fold ideas: a) refining the instant learning error in the stochastic gradient by the principles of six PID variants, i.e., a standard PID, an integral-separated PID, a gearshift-integral PID, a dead-zone PID, an anti-windup PID, and an incomplete-differential PID, to assimilate historical update information into the learning scheme in an efficient way; b) making the hyper-parameters self-adaptive by utilizing the mechanism of particle swarm optimization to acquire high practicality. In addition, considering the diversified PID variants, an effective ensemble of the six PID-incorporated LFA models is implemented. Experimental results on industrial HDI datasets illustrate that, in comparison with state-of-the-art models, the proposed models obtain superior computational efficiency while maintaining competitive accuracy in predicting missing data within an HDI matrix. Moreover, their ensemble further improves performance in terms of prediction accuracy.
IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 5, pp. 3582-3597.
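Idea a) above, refining the instant learning error with a PID principle, can be sketched for a plain matrix-factorisation LFA model: the per-entry error used in the stochastic gradient is replaced by a proportional + integral + derivative combination of current and past errors. The gains, rank, and toy data below are arbitrary, and only the standard-PID variant is shown; the other five variants and the PSO-based hyper-parameter adaptation are omitted.

# Minimal PID-refined SGD for latent-factor matrix factorisation on toy data.
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 20, 15, 4
truth = rng.random((m, k)) @ rng.random((k, n))          # toy ground-truth matrix
mask = rng.random((m, n)) < 0.3                          # observed (HDI-style) entries

P, Q = rng.normal(0, 0.1, (m, k)), rng.normal(0, 0.1, (n, k))
lr, reg = 0.02, 0.01
Kp, Ki, Kd = 1.0, 0.05, 0.2                              # arbitrary PID gains
acc_err = np.zeros((m, n))                               # integral term per entry
prev_err = np.zeros((m, n))                              # for the derivative term

for epoch in range(100):
    for i, j in zip(*np.nonzero(mask)):
        err = truth[i, j] - P[i] @ Q[j]                  # instant learning error
        acc_err[i, j] += err
        pid_err = Kp * err + Ki * acc_err[i, j] + Kd * (err - prev_err[i, j])
        prev_err[i, j] = err
        grad_p = -pid_err * Q[j] + reg * P[i]            # PID-refined stochastic gradient
        grad_q = -pid_err * P[i] + reg * Q[j]
        P[i] -= lr * grad_p
        Q[j] -= lr * grad_q

rmse = np.sqrt(np.mean((truth - P @ Q.T)[mask] ** 2))
print("training RMSE on observed entries:", round(rmse, 4))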