Stochastic games are the foundational mathematical framework for describing multiagent interactions and underpin the theory of multiagent reinforcement learning (MARL) and optimal decision making. However, previous research has typically focused on either two-agent settings or large-scale well-mixed agent populations, interaction scenarios that are far from realistic. In this article, we consider structured populations in which agents interact only with their immediate neighbors. Using the pair-approximation method, we develop a new dynamical model that describes the Q-learning dynamics of stochastic games on regular graphs. Through comparisons with agent-based simulation results, we validate the accuracy of our dynamical model across various stochastic games, population structures, and algorithm parameters. Our research thus provides both qualitative and quantitative insights into the effects of state transition rules and graph topologies on population dynamics. In particular, we show that, under certain conditions, state transitions can significantly promote the evolution of cooperation in social dilemmas. We also explore the effects of agent degree on cooperation and, unlike previous findings, show that degree can have either positive or negative implications for cooperation depending on the transition rules.
Title: Dynamics of Q-Learning in Networked Stochastic Games
Authors: Zheng Yuan, Guangchen Jiang, Shuyue Hu, Matjaz Perc, Chen Chu, Jinzhuo Liu
Pub Date: 2026-01-01 | DOI: 10.1109/TNNLS.2025.3641365
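As a rough illustration of the kind of agent-based simulation such a dynamical model is validated against, the sketch below runs independent stateless Q-learners on a ring (a degree-2 regular graph) playing a prisoner's dilemma with their two neighbors, using Boltzmann exploration. The payoff matrix, parameters, and single-state setting are illustrative assumptions, not the article's stochastic-game model.

```python
import math
import random

# Illustrative agent-based baseline (assumed setup, not the article's exact
# stochastic game): independent stateless Q-learners on a ring graph play a
# prisoner's dilemma with their two neighbors, using Boltzmann exploration.
N = 50                  # agents on the ring (degree-2 regular graph)
PAYOFF = {(1, 1): 3.0, (1, 0): 0.0, (0, 1): 5.0, (0, 0): 1.0}  # (mine, neighbor) -> my payoff
ALPHA, TEMP = 0.1, 0.5  # learning rate, exploration temperature

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N)]  # Q[i][a], a: 0 = defect, 1 = cooperate

def choose(q):
    """Boltzmann (softmax) action selection over two Q-values."""
    w0, w1 = math.exp(q[0] / TEMP), math.exp(q[1] / TEMP)
    return 0 if random.random() * (w0 + w1) < w0 else 1

for _ in range(2000):
    acts = [choose(Q[i]) for i in range(N)]
    for i in range(N):
        left, right = acts[(i - 1) % N], acts[(i + 1) % N]
        reward = (PAYOFF[(acts[i], left)] + PAYOFF[(acts[i], right)]) / 2
        Q[i][acts[i]] += ALPHA * (reward - Q[i][acts[i]])  # stateless Q-update

coop_rate = sum(a == 1 for a in (choose(Q[i]) for i in range(N))) / N
print(f"cooperation rate after learning: {coop_rate:.2f}")
```

The pair-approximation model in the article replaces such per-agent bookkeeping with closed-form dynamics over action-pair frequencies on the graph.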
Pub Date: 2026-01-01 | DOI: 10.1109/TNNLS.2025.3643630
Ping Wang, Chengpu Yu, Fang Deng, Jie Chen
This article develops a scheme to tackle the safe optimal formation tracking problem for multiple fixed-wing uncrewed aerial vehicles (UAVs) with external disturbances and asymmetric control constraints. To ensure safety in collision avoidance, a safe set is first constructed as the superlevel set of a continuously differentiable function, and a novel control barrier function (CBF) is then designed to characterize safety. Subsequently, we transform safe optimal formation tracking control into a constrained zero-sum (ZS) differential game to mitigate the destabilizing effects of the disturbances, where the cost function is constructed in a nonquadratic form to cope with the asymmetric input constraints. In particular, the designed CBF is integrated into the cost function to penalize unsafe behavior, and a damping coefficient is included to balance optimality and safety. Afterward, a critic-only reinforcement learning (RL) strategy is developed to learn the robust safe Nash policy, where the critic weights are updated using experience replay, thus avoiding the requirement for a persistence-of-excitation condition. Moreover, the stability and the forward invariance of the safe set under the presented scheme are also verified. Finally, simulation examples are provided to substantiate the validity of the control scheme.
Title: Reinforcement Learning-Based Optimal Formation Tracking for UAVs With Safety Constraints
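For readers unfamiliar with CBFs, the sketch below shows the generic construction the abstract refers to: a collision-avoidance safe set defined as the superlevel set of a continuously differentiable function h, and the standard condition dh/dt + alpha*h >= 0 that keeps the set forward invariant. The relative-position model, distance threshold, and gain are illustrative assumptions, not the article's UAV formulation.

```python
import numpy as np

# Illustrative sketch (not the paper's UAV formulation): the safe set is the
# superlevel set of h(p) = ||p||^2 - d^2 for the relative position p of two
# agents, and the standard CBF condition  dh/dt + alpha * h >= 0  screens
# candidate relative velocities. d_safe and alpha are assumed constants.
d_safe, alpha = 1.0, 2.0

def h(p_rel):
    """Safety function: positive inside the safe (collision-free) set."""
    return float(p_rel @ p_rel) - d_safe ** 2

def cbf_admissible(p_rel, v_rel):
    """Check dh/dt + alpha*h >= 0, with dh/dt = 2 * p_rel . v_rel."""
    return 2.0 * float(p_rel @ v_rel) + alpha * h(p_rel) >= 0.0

p = np.array([2.0, 0.0])                          # separated by 2 > d_safe
print(h(p) > 0)                                   # True: inside the safe set
print(cbf_admissible(p, np.array([-1.0, 0.0])))   # slow approach: admissible
print(cbf_admissible(p, np.array([-10.0, 0.0])))  # fast approach: rejected
```

In a controller, the inadmissible velocities would be filtered out or penalized, which is the role the CBF term plays inside the article's cost function.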
Pub Date: 2026-01-01 | DOI: 10.1109/TNNLS.2025.3601360
Yubo Huang, Xiaowei Zhao
Policy optimization methods are promising for tackling high-complexity reinforcement learning (RL) tasks with multiple agents. In this article, we derive a general trust region for policy optimization methods by considering the effect of subpolicy combinations among agents in multiagent environments. Based on this trust region, we propose an inductive objective for training the policy function, which ensures that agents learn monotonically improving policies. Furthermore, we observe that the policy tends to update only weakly before falling into a local optimum. To address this, we introduce a cost on policy distance in the inductive objective to strengthen the agents' motivation to explore new policies. This approach strikes a balance during training: the policy update step size remains within the trust-region constraints, preventing excessive updates while avoiding getting stuck in local optima. Simulations on wind farm (WF) control tasks and two multiagent benchmarks demonstrate the high performance of the proposed multiagent inductive policy optimization (MAIPO) method.
Title: Multiagent Inductive Policy Optimization (pages 95-106)
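The interplay described above, a trust-region-constrained update combined with a policy-distance term that pushes against vanishingly small updates, can be sketched roughly as follows. The PPO-style clipped surrogate and the squared-distance bonus are assumed stand-ins; MAIPO's actual objective and trust region are derived in the paper.

```python
import numpy as np

# Assumed stand-in for the idea described above (not MAIPO's exact objective):
# a PPO-style clipped trust-region surrogate plus a policy-distance bonus that
# discourages vanishingly small updates near local optima.
EPS = 0.2    # clip range standing in for the trust region (assumed)
BETA = 0.01  # weight of the policy-distance exploration bonus (assumed)

def surrogate(ratio, adv, pi_new, pi_old):
    """ratio = pi_new(a|s) / pi_old(a|s); adv = advantage estimates."""
    clipped = np.minimum(ratio * adv, np.clip(ratio, 1 - EPS, 1 + EPS) * adv)
    distance = np.sum((pi_new - pi_old) ** 2)  # squared policy distance
    return float(np.mean(clipped) + BETA * distance)

pi_old = np.array([0.5, 0.5])
pi_new = np.array([0.6, 0.4])
value = surrogate(pi_new / pi_old, np.array([1.0, -0.5]), pi_new, pi_old)
print(round(value, 4))  # clipped term 0.4 plus a small distance bonus
```

The clip keeps the step inside the trust region while the distance bonus rewards any genuine movement away from the previous policy, the balance the abstract describes.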
This article addresses the distributed global prescribed-performance control problem for uncertain Lagrangian dynamics, with particular emphasis on minimizing steady-state error oscillations. A novel global distributed prescribed-performance control framework is proposed based on a dynamic funnel function and a neural network design. Specifically, by integrating funnel barrier properties and derivative information, a new neural network learning law is developed. Furthermore, a projection operator is incorporated into the learning law to guarantee the boundedness of the weight estimates in the stability proof, ultimately avoiding the potential constraint-incompatibility problems caused by neural network integration. The established control framework ensures that the trajectory consensus error of robotic manipulators under distributed control achieves arbitrary global convergence rates and steady-state error bounds, while neural network approximation mitigates the inherent uncertainties of controllers that do not require a precise mathematical model, thereby effectively suppressing steady-state error oscillations. Unlike the existing literature, this work pioneers the incorporation of neural networks into distributed funnel control, achieving global prescribed performance while significantly reducing steady-state error oscillations. Finally, simulation results validate the effectiveness of the proposed method.
Title: A New Neural Network PI-Funnel Distributed Control for Cooperative Manipulator With Global Prescribed Performance
Authors: Cui-Hua Zhang, Ze-Yun Hu, Yu-Jia Li, Ying Zhang, Chang-Chun Hua
Pub Date: 2026-01-01 | DOI: 10.1109/TNNLS.2025.3648714
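A prescribed-performance funnel is easiest to picture through the common exponential envelope below, which bounds the tracking error as |e(t)| < rho(t). The specific dynamic funnel function used in this work may differ, and these constants are illustrative.

```python
import math

# Common exponential performance funnel (assumed form; the paper's dynamic
# funnel may differ): rho decays from rho0 to rho_inf at rate `decay`, and
# the controller must keep |e(t)| < rho(t) for all t, so the decay rate and
# the floor rho_inf prescribe the convergence rate and steady-state bound.
rho0, rho_inf, decay = 2.0, 0.05, 1.5  # illustrative constants

def rho(t):
    return (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf

print(rho(0.0), round(rho(5.0), 3))  # wide initially, near rho_inf later
```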
Pub Date: 2025-12-31 | DOI: 10.1109/TNNLS.2025.3648433
Qinghai Zheng
The low-rank tensor constraint is widely used in multiview subspace clustering (MSC) and has demonstrated promising clustering performance on many datasets. The key challenges in most existing low-rank tensor constraint-based methods include: 1) the choice of surrogate functions for the tensor rank and 2) the rotation operation applied to the tensor formed by stacking multiple subspace representations along the third mode; the latter plays a critical role in enhancing clustering performance in multiview settings. In this work, we rethink the low-rank tensor constraint and present the enhanced residual tensor norm (ERTN) for MSC, dubbed ERTN-MSC. Specifically, ERTN employs a novel surrogate for the tensor rank, based on residual learning of singular values, which facilitates better exploitation of the structural information in multiview data. Furthermore, ERTN applies the tensor singular value decomposition (t-SVD) to three modes of the tensor constructed from multiple subspaces, which generalizes the tensor rotation operation and enables comprehensive exploration of both the intraview and interview information of multiview data. An augmented Lagrangian multiplier-based algorithm with a convergence guarantee is designed for optimization. Experiments conducted on several real-world multiview datasets demonstrate the effectiveness and competitiveness of our ERTN-MSC.
Title: Enhanced Residual Tensor Norm Minimization for Multiview Subspace Clustering
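The t-SVD machinery that such tensor-norm constraints build on can be sketched as follows: an FFT along the third mode, per-slice SVDs in the Fourier domain, and a tensor nuclear norm given by the normalized sum of frontal-slice singular values. This is the generic construction, not ERTN's residual surrogate or its three-mode extension.

```python
import numpy as np

# Generic t-SVD construction (not ERTN's residual surrogate): FFT along the
# third mode, per-slice SVDs in the Fourier domain, and the tensor nuclear
# norm as the 1/n3-normalized sum of frontal-slice singular values.
def tensor_nuclear_norm(T):
    Tf = np.fft.fft(T, axis=2)
    n3 = T.shape[2]
    return sum(np.linalg.svd(Tf[:, :, k], compute_uv=False).sum()
               for k in range(n3)) / n3

rng = np.random.default_rng(0)
# Build a tensor of tubal rank 1: the t-product of a 6x1x4 and a 1x6x4
# tensor, computed as per-slice matrix products in the Fourier domain.
A = rng.standard_normal((6, 1, 4))
B = rng.standard_normal((1, 6, 4))
Tf = np.einsum('ipk,pjk->ijk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
T = np.fft.ifft(Tf, axis=2).real  # t-product of real tensors is real

print(tensor_nuclear_norm(T) > 0.0)
```

Minimizing this norm (or a surrogate of it, as ERTN does for the singular-value residuals) promotes a low tubal rank in the stacked-subspace tensor.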
Knowledge graph (KG) contrastive learning (CL) has garnered significant attention in the realm of recommendation systems. However, existing models often employ random masking for graph augmentation, which can introduce sampling bias and impede interpretability. Furthermore, the information imbalance between the KG and the user-item interaction graph (UIG) can lead the model to neglect critical information in the UIG. To address these challenges, we propose a novel model, diffusion-augmented graph CL (DAGCL). This model leverages a graph diffusion mechanism for data augmentation in CL, ensuring that the generated diffusion graph closely resembles the original UIG and avoiding the pitfalls associated with random sampling. Additionally, DAGCL enhances the impact of the UIG on predictive accuracy by implementing both intragraph and intergraph CL (GCL), effectively mitigating the information imbalance between the KG and the UIG. The model also leverages the structural characteristics of the UIG to construct a structural diffusion graph, which is integrated with the information diffusion graph to produce a comprehensive diffusion representation, further enhancing the model's robustness against sampling noise and semantic dilution by preserving essential interaction patterns and structural features in the augmented graph. Experimental results across three real-world datasets demonstrate that our proposed model significantly outperforms state-of-the-art models.
Title: Diffusion-Augmented Graph Contrastive Learning for Knowledge-Aware Recommendation
Authors: Jing Zhang, Xiaoqian Jiang, Youxuan Wang, Shunmeng Meng, Cangqi Zhou
Pub Date: 2025-12-30 | DOI: 10.1109/TNNLS.2025.3646605
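The contrastive objective underlying graph CL methods of this kind is typically an InfoNCE loss in which the two embeddings of the same node, taken from two graph views, form the positive pair. The sketch below shows that generic loss on toy embeddings; DAGCL's diffusion-based view generation and exact objective are not reproduced here.

```python
import numpy as np

# Generic InfoNCE graph-contrastive loss (illustrative, not DAGCL's exact
# objective): embeddings of the same node from two views are positives (the
# diagonal of the similarity matrix); all other nodes are negatives.
def info_nce(z1, z2, tau=0.2):
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    sim -= sim.max(axis=1, keepdims=True)            # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 16))                     # one embedding per node
z_aligned = z + 0.01 * rng.standard_normal((8, 16))  # a faithful second view
z_random = rng.standard_normal((8, 16))              # an unrelated "view"

print(info_nce(z, z_aligned) < info_nce(z, z_random))  # faithful view: lower loss
```

This is why the fidelity of the augmented graph matters: a diffusion graph that closely resembles the original UIG yields informative positives, whereas a heavily masked view behaves more like the random one above.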
Pub Date: 2025-12-30 | DOI: 10.1109/TNNLS.2025.3644134
Feihu Jin, Ying Tan
Large language models (LLMs) have achieved striking performance across a broad range of reasoning benchmarks, yet the quality of their outputs remains acutely sensitive to prompt design. A meticulously engineered prompt can coax an LLM into correctly answering even highly complex questions, prompting a surge of research into techniques that boost prompt efficacy. Manual chain-of-thought (CoT) prompting and automated prompt generation have emerged as leading strategies. However, CoT exemplars must be painstakingly tailored to each task, making the process labor-intensive, while prompts optimized for a single task often fail to generalize. We introduce a simple yet powerful alternative: by treating the LLM itself as a multitask optimizer, we enable iterative self-refinement of prompts through natural language task descriptions and few-shot in-context learning (ICL). Recognizing that individual prompts exhibit task-dependent sensitivity, we further ensemble the top-performing prompts at inference time. Empirical evaluation with several state-of-the-art LLMs shows that our method substantially surpasses prior baselines, delivering gains of up to 6.0% on mathematical reasoning tasks and 10.2% on commonsense reasoning benchmarks.
Title: Large Language Models Are Multitask Chain-of-Thought Prompting Optimizers
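Schematically, the optimize-then-ensemble loop looks like the following, with stand-ins in place of real LLM calls (the propose_variants and score functions, pool sizes, and prompts are illustrative assumptions, not the paper's implementation).

```python
import random

# Schematic optimize-then-ensemble loop. Both helpers below are mock
# stand-ins: in the actual method, an LLM rewrites candidate prompts from a
# natural-language task description, and scoring runs the LLM on a dev set.
random.seed(1)

def propose_variants(prompt, n=3):
    """Stand-in for asking the LLM to rewrite its own instruction."""
    return [f"{prompt} [variant {random.randint(0, 999)}]" for _ in range(n)]

def score(prompt, dev_set):
    """Stand-in for the dev-set accuracy of running the LLM with `prompt`."""
    return sum(len(prompt) % 7 == x % 7 for x in dev_set) / len(dev_set)

dev_set = list(range(20))
pool = ["Let's think step by step."]
for _ in range(5):                         # rounds of self-refinement
    candidates = pool + [v for p in pool for v in propose_variants(p)]
    pool = sorted(candidates, key=lambda p: score(p, dev_set), reverse=True)[:4]

top_prompts = pool[:3]                     # ensemble the best at inference time
print(len(top_prompts))
```

Keeping several top prompts rather than one reflects the observation above that individual prompts exhibit task-dependent sensitivity.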
Pub Date: 2025-12-30 | DOI: 10.1109/TNNLS.2025.3646358
Chun-Xiao Li, Huai-Ning Wu
This article proposes a dynamic neural network (DNN)-based control method to realize the optimal control of nonlinear parameter-varying (NPV) systems. Specifically, a DNN-based control policy (DNN-CP) composed of static shared layers and a parameter-related dynamic layer is constructed to improve generalization and adaptability. An extreme learning machine (ELM)-based weight prediction model is established to fit the relationship between the dynamic weights and the system parameters. The shared layers are updated by solving a constrained multiobjective problem to reduce performance conflicts among different systems, and the weight prediction model is tuned by maximizing parameter-related objectives to achieve optimal control of each system. To improve data efficiency and adaptability, a supervised learning-based pretraining and reinforcement learning (RL)-based fine-tuning algorithm is developed. Finally, the control performance of the DNN-CP is verified on a morphing aircraft. We demonstrate that the designed DNN-CP and training algorithm achieve strong generalization: the DNN-CP can be immediately generalized to any system within the parameter space without additional sample collection or fine-tuning. Compared with other methods, the DNN-CP delivers better control performance on systems with continuously varying parameters.
Title: A Dynamic Neural Network-Based Control Method Using Reinforcement Learning for Nonlinear Parameter-Varying System With Application to Morphing Aircraft
Pub Date: 2025-12-30 | DOI: 10.1109/TNNLS.2025.3646848
Xinglin Lyu, Junhui Li, Daimeng Wei, Min Zhang, Shimin Tao, Hao Yang, Min Zhang
Large language models (LLMs) are typically adapted for context-aware machine translation (MT) by combining both the source sentence and its surrounding sentences into a single input. This unified input is then processed in one go, with the model producing the target translation step by step. However, this method treats the intrasentence and intersentence contexts similarly, even though they play distinct roles. In this study, we present a novel strategy called multiphase prompt tuning (MPT) to address this issue by enabling LLMs to treat these two context types differently. MPT divides the context-aware translation task into three phases: encoding the intersentence context, encoding the source sentence, and the final decoding phase. Each phase incorporates distinct continuous prompts that help the model focus on the appropriate task for each type of context. We also introduce a multitask fine-tuning approach to emphasize the distinction between intersentence and intrasentence contexts and enhance intersentence dependencies.
This includes two auxiliary tasks, context-agnostic translation and cross-lingual next sentence generation, which help extract additional information and improve the model's handling of discourse-related challenges.
Title: Multiphase and Multitask Prompt Tuning for LLM-Based Context-Aware Machine Translation
Pub Date: 2025-12-25 | DOI: 10.1109/tnnls.2025.3643632
Pengfei Bie, Ning Song, Nuoqing Zhang, Jie Nie, Min Ye, Xinyue Liang, Qi Wen
Title: Space–Frequency Cross-Attention Node Feature Optimization Graph Neural Operator for Partial Differential Equations
Authors: Pengfei Bie, Ning Song, Nuoqing Zhang, Jie Nie, Min Ye, Xinyue Liang, Qi Wen