Pub Date : 2025-11-29 DOI: 10.1007/s10489-025-07022-y
Ziyi Meng, Wenting Wu, Jing Li, Ming Zhu
The development of wireless transmission technology has led to the conceptualization of energy transmission as a service, giving rise to the abstract concept of "Energy as a Service". However, a single energy service is increasingly inadequate to meet the growing energy demand, making the selection of multiple energy services for collaborative operation a pressing challenge. Existing service selection algorithms often face challenges such as high computational complexity and insufficient adaptability when addressing large-scale, complex problems. To address this, this paper proposes the integration of deep reinforcement learning with energy service selection, employing the Proximal Policy Optimization algorithm to solve the energy service selection problem. Additionally, to address the shortcomings of current research in ensuring the reliability of energy services, this paper introduces a blockchain-based smart contract approach for energy service selection, utilizing blockchain to prevent tampering with service information and ensuring service reliability. Experimental results demonstrate that the proposed method exhibits significant advantages in preventing service information tampering and in addressing the energy service selection problem.
Title: Energy service selection method based on deep reinforcement learning and blockchain smart contract
Journal: Applied Intelligence, vol. 55, issue 17
Pub Date : 2025-11-29 DOI: 10.1007/s10489-025-06991-4
Nana Han, Junsheng Qiao, Tengbiao Li
In this paper, building on the \(\boldsymbol{G}\)-lower and \(\boldsymbol{O}\)-upper \(\boldsymbol{L}\)-fuzzy rough approximation operators proposed by Jiang and Hu, we first introduce two new pairs of \(\boldsymbol{L}\)-fuzzy rough approximation operators induced by overlap and grouping functions on complete lattices. These operators are referred to as \(\boldsymbol{L}^{\boldsymbol{(1)}}\)-fuzzy rough approximation operators and \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-fuzzy rough approximation operators, respectively. We then study several of their basic properties. Furthermore, we focus on the topological properties of the \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-lower (resp. \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-upper) fuzzy rough approximation operators from the \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-fuzzy rough approximation operators. In particular, the set of fixed points of the \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-lower (resp. \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-upper) fuzzy rough approximation operators forms an Alexandroff \(\boldsymbol{L}\)-topology. Finally, we apply the \(\boldsymbol{L}^{\boldsymbol{(2)}}\)-fuzzy rough approximation operators to three-way decisions; the experimental results demonstrate that, compared with the existing corresponding fuzzy rough set models derived from t-norms and t-conorms, our model exhibits superior classification performance.
Title: Novel L-fuzzy rough approximation operators induced by overlap and grouping functions on complete lattices and its application to three-way decisions
Journal: Applied Intelligence, vol. 55, issue 17
Pub Date : 2025-11-29 DOI: 10.1007/s10489-025-06989-y
Sotirios Nikoloutsopoulos, Iordanis Koutsopoulos, Michalis K. Titsias
We propose a Stochastic Gradient Descent (SGD)-type algorithm for Personalized Federated Learning which can be particularly attractive for mobile energy-limited regimes due to its low per-client computational cost. The model to be trained includes a set of common weights for all clients, and a set of personalized weights that are specific to each client. At each optimization round, randomly selected clients perform multiple full gradient-descent updates over their client-specific weights to optimize the loss function on their own datasets, without updating the common weights. This procedure is energy-efficient since it has low computational cost per client. At the final update of each round, each client computes the joint gradient over both the client-specific and the common weights and returns the gradient of the common weights to the server, which makes it possible to perform an exact SGD step over the full set of weights in a distributed manner. For the overall optimization scheme, we rigorously prove convergence, even in non-convex settings such as those encountered when training neural networks, with a rate of \(\mathcal{O}\left(\frac{1}{\sqrt{T}}\right)\) with respect to communication rounds \(T\). In practice, the proposed algorithm, PFLEGO, exhibits substantially lower per-round wall-clock time, used as a proxy for energy. Our theoretical guarantees translate to superior performance in practice against baselines such as FedAvg and FedPer, as evaluated on several multi-class classification datasets, in particular Omniglot, CIFAR-10, MNIST, Fashion-MNIST, and EMNIST.
Title: Personalized federated learning with exact stochastic gradient descent
Journal: Applied Intelligence, vol. 55, issue 17
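The round structure described in the abstract above (personalized updates with the common weights frozen, then one joint gradient whose common part is sent to the server) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the bilinear model `X @ theta @ w`, the squared-error loss, the learning rates, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def client_round(X, y, theta, w, inner_steps, lr_w):
    """One client's work in a round (hypothetical model: y_hat = X @ theta @ w)."""
    n = len(y)
    # Personalized updates only: the common weights theta stay frozen.
    for _ in range(inner_steps - 1):
        h = X @ theta
        r = h @ w - y
        w = w - lr_w * (h.T @ r) / n
    # Final update: joint gradient over both weight sets at the same point.
    h = X @ theta
    r = h @ w - y
    grad_w = h.T @ r / n
    grad_theta = X.T @ np.outer(r, w) / n
    w = w - lr_w * grad_w
    return w, grad_theta  # only the common-weight gradient leaves the client

def server_round(clients, theta, ws, rng, n_selected, inner_steps, lr_w, lr_theta):
    """Sample clients, collect common-weight gradients, take one step on theta."""
    selected = rng.choice(len(clients), size=n_selected, replace=False)
    grads = []
    for i in selected:
        X, y = clients[i]
        ws[i], g = client_round(X, y, theta, ws[i], inner_steps, lr_w)
        grads.append(g)
    theta = theta - lr_theta * np.mean(grads, axis=0)
    return theta, ws

def mean_loss(clients, theta, ws):
    """Average half mean-squared error over all clients."""
    return float(np.mean([0.5 * np.mean((X @ theta @ w - y) ** 2)
                          for (X, y), w in zip(clients, ws)]))
```

In this sketch each client runs `inner_steps - 1` cheap updates on its own weights and computes the gradient of the common weights only once per round, which is where the low per-client cost described in the abstract would come from.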
Pub Date : 2025-11-29 DOI: 10.1007/s10489-025-07020-0
Hui Zhang, Zhicheng Zhou, Shiyi Gu, Ya Zhang
This paper studies the problem of Traffic Signal Control (TSC) for multiple intersections. A large-scale TSC algorithm based on multi-expert demonstrations and Multi-Agent Reinforcement Learning (MARL) called EXPs-XLight is proposed. In contrast to other human-in-the-loop expert demonstration approaches that rely on a single expert, a mechanism for multi-expert demonstrations is introduced to accelerate the training of large-scale multi-agents, reduce training difficulty, and improve the overall training effectiveness. Expert knowledge is derived from multiple sources, using Max Pressure (MP) and Max Queue-Length (M-QL) as expert policies. By combining diverse experiences from multiple experts, agents are able to learn from a broader range of traffic scenarios. During the process of learning from expert knowledge, a supervised large margin classification loss is introduced to encourage the learning of meaningful action values. EXPs-XLight incorporates mixed policy sampling, allowing dynamic adjustment of the balance between expert guidance and agents’ own experiences. Unlike previous methods that involve expert participation throughout entire episodes, EXPs-XLight enables partial expert participation during agent exploration. An \(\epsilon\)-greedy algorithm based on multiple experts is introduced to encourage agents to explore novel state-action pairs while avoiding over-reliance on expert policies. To preserve the agents’ capacity for exploration and autonomous learning, their own strategies are consistently utilized in interactions with the environment. In addition, a replay buffer discrimination mechanism is introduced to ensure the accumulation of high-quality experience by storing interactions with higher rewards. EXPs-XLight has demonstrated excellent performance through experiments on real-world datasets, including a 16-intersection road network in Hangzhou and a 196-intersection road network in New York, as well as a large-scale simulated road network with 1000 intersections.
Title: A deep reinforcement learning model for traffic signal control with multi-expert participating in exploration
Journal: Applied Intelligence, vol. 55, issue 17
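The abstract above does not spell out how the multi-expert ε-greedy rule is defined, so the following is only one plausible reading, given as a sketch. The two expert heuristics follow the usual textbook forms of Max Pressure and Max Queue-Length; the `expert_prob` parameter, the data layout (queues indexed by movement, phases as lists of movement indices), and all function names are assumptions made for illustration.

```python
import random

def max_pressure_action(in_queues, out_queues, phases):
    """Pick the phase whose movements have the largest total
    (upstream queue - downstream queue)."""
    def pressure(phase):
        return sum(in_queues[m] - out_queues[m] for m in phase)
    return max(range(len(phases)), key=lambda a: pressure(phases[a]))

def max_queue_action(in_queues, phases):
    """Pick the phase serving the largest total upstream queue."""
    return max(range(len(phases)),
               key=lambda a: sum(in_queues[m] for m in phases[a]))

def select_action(q_values, expert_actions, epsilon, expert_prob, rng):
    """Hypothetical multi-expert epsilon-greedy rule.

    With probability epsilon the agent explores; during exploration it
    follows a randomly chosen expert with probability expert_prob and
    otherwise picks a uniformly random action. Outside exploration it acts
    greedily on its own action values.
    """
    if rng.random() < epsilon:
        if expert_actions and rng.random() < expert_prob:
            return rng.choice(expert_actions)
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Keeping `expert_prob` below 1 leaves room for purely random exploration, which is one simple way to limit reliance on the expert policies.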
Pub Date : 2025-11-29 DOI: 10.1007/s10489-025-06964-7
Jingsen Liu, Chennan Zhao, Yu Li, Ping Hu
The current network security situation is becoming increasingly complex and dynamic, and intrusion detection systems are facing severe challenges in terms of performance optimization and detection efficiency. This paper proposes a multi-strategy improved weighted mean of vectors algorithm (MRINFO) to optimize the weights and biases of the Extreme Learning Machine (ELM) classifier in intrusion detection systems, effectively enhancing the classification performance of the system. To address issues such as the low solution accuracy and slow convergence of the basic INFO algorithm, MRINFO integrates two improvement mechanisms. The first is a multi-strategy dynamic stochastic learning pool, which introduces various opposition-learning variants into the learning pool and dynamically updates them to diversify the candidate solutions. The second is a reinforcement strategy of partial dimension mutation based on feedback priority, which scores each individual dimension and guides the population to converge towards the global optimum. Experimental results show that the MRINFO algorithm performs excellently on the CEC2022 test suite, outperforming the other algorithms compared in terms of optimization accuracy, convergence speed, and stability. In the classification tasks of intrusion detection systems, the ELM classifier optimized by MRINFO shows outstanding performance across various metrics in both binary and multi-class classification tests. This validates the feasibility and effectiveness of the MRINFO algorithm in intrusion detection systems and demonstrates its broad application prospects.
Title: Extreme learning machine optimized by multi-strategy improved weighted mean of vectors algorithm for intrusion detection classification
Journal: Applied Intelligence, vol. 55, issue 17
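To make the optimization target in the abstract above concrete, here is a minimal sketch of a standard ELM and of a fitness function that a metaheuristic such as MRINFO could minimize over the hidden-layer weights and biases. The sigmoid activation, the flat parameter layout, the training error rate as fitness, and the function names are assumptions for illustration; the abstract does not state which fitness the authors use, and MRINFO itself is not reproduced here.

```python
import numpy as np

def hidden_layer(X, W, b):
    """Sigmoid hidden-layer activations of a single-hidden-layer network."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, Y, W, b):
    """Standard ELM training: output weights from a least-squares solve
    (Moore-Penrose pseudoinverse) with W and b held fixed."""
    return np.linalg.pinv(hidden_layer(X, W, b)) @ Y

def elm_predict(X, W, b, beta):
    """Class scores for each sample."""
    return hidden_layer(X, W, b) @ beta

def elm_fitness(params, X, Y, n_hidden):
    """Hypothetical fitness: training error rate of the ELM whose hidden
    weights and biases are unpacked from the flat vector `params`."""
    d = X.shape[1]
    W = params[: d * n_hidden].reshape(d, n_hidden)
    b = params[d * n_hidden:]
    beta = elm_fit(X, Y, W, b)
    predicted = elm_predict(X, W, b, beta).argmax(axis=1)
    return float(np.mean(predicted != Y.argmax(axis=1)))
```

A population-based optimizer would treat each candidate solution as one `params` vector of length `d * n_hidden + n_hidden` and call `elm_fitness` to score it; `Y` is expected to be one-hot encoded.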
On e-commerce platforms, the multiple behaviors between users and products imply different user interests in a product. In particular, the intra-behavior dependence and the heterogeneous dependence among group users play an important role in users' target behavior decisions and can capture users' deep interests. Previous research has focused on fusion strategies for the final user representations of different behaviors, neglecting adequate modeling of cross dependencies between different behaviors. In this paper, we leverage a self-supervised collaborative contrastive learning framework, named CCLAFMB, to learn high-quality user representations for multi-behavior recommendation. CCLAFMB first designs an adaptive fusion strategy for homogeneous and heterogeneous behaviors and implements their cross-dependency propagation process. Then, a self-supervised collaborative contrastive learning paradigm is proposed to ensure the homogeneous and heterogeneous consistency of multi-behavior interest learning. Finally, extensive experiments on the Beibei and Taobao datasets show that the proposal achieves improvements of 8.09% and 2.51% on HR@10, and of 4.90% and 0.34% on NDCG@10, respectively. The findings demonstrate the significance of adaptive fusion of multi-behavior cross dependencies for multi-behavior recommendation.
Title: Self-supervised collaborative contrast learning for multi-behavior recommendation with adaptive fusion of cross dependency
Pub Date : 2025-11-28 DOI: 10.1007/s10489-025-06981-6
Jianxing Zheng, Ting Zhang, Suge Wang, Deyu Li, Jian Liao
Journal: Applied Intelligence, vol. 55, issue 17
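For readers unfamiliar with the two metrics reported above, the sketch below shows how HR@K and NDCG@K are commonly computed per user under a leave-one-out protocol with a single held-out target item. That protocol is an assumption here; the abstract does not describe the evaluation setup, and the function names are illustrative.

```python
import math

def hit_ratio_at_k(ranked_items, target, k=10):
    """1.0 if the held-out target appears in the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k=10):
    """With one relevant item the ideal DCG is 1, so NDCG reduces to
    1 / log2(position + 1) for a 1-based position within the top-k."""
    top = ranked_items[:k]
    if target not in top:
        return 0.0
    return 1.0 / math.log2(top.index(target) + 2)
```

The reported scores would then be the averages of these per-user values over all test users.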
Accurate and real-time recognition of right-of-way information transmitted by traffic lights is key to ensuring the safety of traffic participants. Existing deep learning-based traffic light detection and recognition (TLDR) models achieve high accuracy. However, these models require considerable computing power, which makes them difficult to deploy on mobile platforms such as intelligent vehicles, urban low-speed unmanned work platforms, and street-crossing assistance devices for visually impaired people. In this paper, a mobile platform-friendly TLDR model, HCEDSM (high computational cost-effectiveness and detection speed model), is proposed to achieve high accuracy, low latency, and strong deployment feasibility. First, a lightweight backbone network combining EfficientViT and efficient multi-scale attention is introduced to reduce the amount of computation and focus on small target features. Second, a cross-scale neck network is constructed to improve feature fusion at lower computational cost. The open-source S2TLD dataset is used for training and testing, and the model is deployed on the NVIDIA Jetson Nano B01, a representative platform for low-computing-power devices. The results show that HCEDSM achieves a precision of 94.7% with 2.4 GFLOPs and a detection speed of up to 12.2 FPS on the NVIDIA Jetson Nano B01. The detection results on the LISA Traffic Light Dataset and BSTLD (Bosch Small Traffic Light Dataset) show that the model has good generalization ability. These findings demonstrate that HCEDSM enables accurate and real-time recognition of traffic lights on resource-constrained platforms.
Title: A mobile platform-friendly lightweight traffic light detection and recognition model
Pub Date : 2025-11-28 DOI: 10.1007/s10489-025-06993-2
Tinglin Chen, Junyan Han, Jingheng Wang, Xiaoyuan Wang, Cheng Shen, Zhenwei Lv, Yanan Sun, Yunfei Guo, Jianbo Sun
Journal: Applied Intelligence, vol. 55, issue 17
Pub Date : 2025-11-28 DOI: 10.1007/s10489-025-07009-9
Rabeea Fatma Khan, Mu Sook Lee, Byoung-Dai Lee
Extensive research has focused on developing efficient and accurate solutions for the critical task of medical image segmentation. Approaches have evolved from hand-crafted pipelines to deep convolutional neural networks (CNNs), and more recently, to Transformer-based hybrid models. Among these, hierarchical encoder–decoder architectures remain prevalent, where skip connections are crucial in transmitting spatial features from encoders to decoders. However, conventional skip connections operate in static and passive modes, and cannot adaptively fuse multi-scale features or capture semantic relationships across resolution levels. Although attention-based skip enhancements have been proposed, they are often architecture-specific and difficult to generalize. In this study, we propose TransSkip, a novel transformer-based skip connection module that embeds both self-attention and cross-attention directly within the skip path. This enables dynamic and learnable multi-scale feature fusion across encoder levels, transforming skip connections into active semantic reasoning pathways. TransSkip is modular and architecture agnostic, supporting seamless integration with a range of hierarchical encoder–decoder networks, including CNN-based, Transformer-based, and hybrid models. Extensive experiments across 2D and 3D datasets (BUSI, Kvasir-SEG, MSD-Spleen) and multiple network backbones (U-Net, TransUNet, TransAttUNet, MCV-UNet) demonstrate that TransSkip consistently improves segmentation accuracy, with statistically significant gains and minimal parameter overhead. These results highlight the potential of TransSkip as a generalizable and efficient architectural enhancement for medical image segmentation.
Title: Harnessing transformer-based attention mechanisms for multi-scale feature fusion in medical image segmentation
Journal: Applied Intelligence, vol. 55, issue 17
Open access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-07009-9.pdf
Pub Date: 2025-11-28 DOI: 10.1007/s10489-025-07015-x
Yibin Zhao, Jianjun Yi, Yihan Pan, Liwei Chen
In recent years, 3D Gaussian Splatting (3DGS) has attracted attention for its ability to perform camera-level novel view synthesis (NVS) and 3D reconstruction from camera images with given poses. Early works usually assumed that the input consisted of accurate camera poses and RGB images, but the input obtained in practical robotics work generally consists of erroneous poses and RGB-D images, which seriously degrades the geometry of the reconstructed scene and the quality of NVS and wastes the available depth information. In this paper, we propose a new scene reconstruction method based on RGB-D view synthesis and camera pose optimization, which is robust to inaccurate pose estimation and incomplete views. The method optimizes the scene geometry, novel views, and poses, and jointly learns the parameters of the Gaussians to obtain a 3D scene with accurate geometry and high-quality NVS, achieving a 19.86% improvement in NVS quality and a 23.73% improvement in depth estimation compared to the base method.
{"title":"Robust Geometric Reconstruction of RGB-D Data Based on Gaussian Splatting","authors":"Yibin Zhao, Jianjun Yi, Yihan Pan, Liwei Chen","doi":"10.1007/s10489-025-07015-x","DOIUrl":"10.1007/s10489-025-07015-x","url":null,"PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 17","pages":""},"PeriodicalIF":3.5,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145613125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As urban transportation networks have become increasingly complex and diverse, traditional signal control methods struggle to effectively manage dynamic traffic flows and adverse weather conditions. To tackle this issue, this paper introduces an improved traffic signal control system built on a Double Dueling Deep Q-Network algorithm, referred to as the Self-Adaptive Attention Double Dueling Deep Q-Network (SAA-D3QN) framework. The system not only incorporates mixed traffic flow and an adaptive attention mechanism but also introduces a novel reward function designed specifically for adverse weather conditions. By incorporating a mixed traffic flow model, the system can more precisely simulate the behavior of different vehicle types under various traffic conditions. The adaptive attention mechanism enables the system to dynamically adjust its focus on critical areas when processing large amounts of traffic data, allowing for rapid identification and processing of key information. In addition, this paper conducts an in-depth analysis of traffic data under adverse weather conditions and proposes a new reward function that enables the traffic signal system to adaptively adjust signal timing strategies under such circumstances. The experimental findings indicate that, compared to traditional signal control methods, the SAA-D3QN traffic system significantly reduces average vehicle waiting time, enhances intersection throughput, and decreases traffic congestion.
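For readers unfamiliar with the Double Dueling DQN core named above, here is a minimal sketch of its two standard ingredients: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A), and the double-DQN target, in which the online network picks the next action and the target network evaluates it. It does not reproduce SAA-D3QN: the attention mechanism and the weather-aware reward are not modeled, and the function names, the two-phase action space, and all numbers are hypothetical.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double-DQN target: the online net selects the action, the target net evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state estimates for two signal phases (actions 0 and 1).
q_online_next = dueling_q(value=1.0, advantages=[0.5, -0.5])
q_target_next = dueling_q(value=0.8, advantages=[0.25, 0.75])

# Hypothetical reward, e.g. a negative waiting-time penalty for the current step.
target = double_dqn_target(
    reward=-2.0,
    gamma=0.9,
    q_online_next=q_online_next,
    q_target_next=q_target_next,
    done=False,
)
```

Here the online estimate prefers action 0, so the target uses the target network's value for action 0 even though the target network itself rates action 1 higher. Decoupling selection from evaluation in this way is what reduces the overestimation bias of plain Q-learning.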
{"title":"Optimized traffic signal control system incorporating mixed traffic flow and adverse weather","authors":"Li-Juan Liu, Ting-Ting Huang, Hamid Reza Karimi, Yan-Hua Ma, Jiao Su","doi":"10.1007/s10489-025-07000-4","DOIUrl":"10.1007/s10489-025-07000-4","url":null,"PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 17","pages":""},"PeriodicalIF":3.5,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145612747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}