Pub Date: 2026-02-12. DOI: 10.1109/TKDE.2026.3652658
2025 Reviewers List. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 2108-2121. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11395241
To leverage the advantages of LLMs in addressing the challenges of the Text-to-SQL task, we present XiYan-SQL, an innovative framework that effectively generates and utilizes multiple SQL candidates. It consists of three components: 1) a Schema Filter module that filters and obtains multiple relevant schemas; 2) a multi-generator ensemble approach that generates multiple high-quality and diverse SQL queries; 3) a selection model with a candidate reorganization strategy that obtains the optimal SQL query. Specifically, for the multi-generator ensemble, we employ a multi-task fine-tuning strategy to enhance the capabilities of SQL generation models for the intrinsic alignment between SQL and text, and construct multiple generation models with distinct generation styles by fine-tuning across different SQL formats. The experimental results and comprehensive analysis demonstrate the effectiveness and robustness of our framework. Overall, XiYan-SQL achieves a new SOTA performance of 75.63% on the notable BIRD benchmark, surpassing all previous methods. It also attains SOTA performance on the Spider test set with an accuracy of 89.65%.
XiYan-SQL: A Novel Multi-Generator Framework for Text-to-SQL. Yifu Liu; Yin Zhu; Yingqi Gao; Zhiling Luo; Xiaoxia Li; Xiaorong Shi; Yuntao Hong; Jinyang Gao; Yu Li; Bolin Ding; Jingren Zhou. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 4, pp. 2474-2487. Pub Date: 2026-01-26. DOI: 10.1109/TKDE.2026.3657851
Federated graph learning (FGL) aims to collaboratively train graph neural networks (GNNs) among multiple clients, where each client owns a subgraph of a global model. A key challenge in FGL arises from the possible interconnections between nodes distributed across different subgraphs, leading to an incomplete capture of neighborhood knowledge within the graph. Existing FGL frameworks attempt to learn missing neighborhood knowledge by generating pseudo nodes or transmitting missing node embeddings directly across clients, which is either suitable only for 1-hop neighbor nodes or comes with high communication costs when training deeper GNNs. In this paper, we propose a novel framework for FGL named $\text{Fed}^{2}\text{GNN}$ that can fully capture neighborhood knowledge while achieving low communication costs. More specifically, we propose the ego-tree, a new graph structure that is easy to build and allows us to reconstruct the neighborhood faithfully. Furthermore, we design an encoder-decoder-based method to build the ego-tree. The encoder enables clients to transmit encoded information essential for tree construction with minimal communication costs, while the decoder empowers clients to build the ego-tree by decoding the received information. Extensive experiments on real-world network datasets show the effectiveness of our framework for training deep GNNs, with about 100× less communication compared to prior works.
Toward Federated Learning of Deep Graph Neural Networks. Zhihua Tian; Yuan Ding; Rui Zhang; Yao Tang; Jian Liu; Kui Ren. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 2028-2039. Pub Date: 2026-01-22. DOI: 10.1109/TKDE.2026.3652029
Pub Date: 2026-01-21. DOI: 10.1109/TKDE.2026.3656414
Xinrui Ge; Jia Yu; Rong Hao
Privacy-Preserving Graph Similarity matching Query (PPGSQ) can retrieve the encrypted data graphs that approximately match the encrypted query graph from the graph database. Existing PPGSQ schemes adopt the pivot filter and the global filter to measure the similarity of two graphs, which imposes a heavy computation burden on clients and leaves many mismatched data graphs unfiltered. In addition, traversing the whole graph database to execute PPGSQ can greatly affect the query efficiency. To address these issues, we propose two privacy-preserving graph similarity matching query schemes in this paper. We first present a basic scheme with linear query complexity. We adopt the branch-based lower bound of edit distance to efficiently measure the similarity of two encrypted graphs, which can reduce the computation overhead for clients and improve the lower bound of MGED. To facilitate effective pruning and enhance the query efficiency, we give an improved scheme by designing a novel tree-based secure index, which achieves sublinear query complexity. Our schemes can achieve the necessary privacy without losing the ability to query. To further protect the number of branches/vertices, we give a succinct discussion on how to use homomorphic Paillier encryption to encrypt this number. We analyze the security of our schemes and conduct an experimental evaluation on a real-world graph database to show the efficiency of the proposed schemes.
Privacy-Preserving Graph Similarity Matching Query Over Encrypted Graph Database. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 1932-1945.
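The PPGSQ abstract above mentions Paillier encryption for hiding the number of branches/vertices. The following is a minimal sketch of the additive homomorphism that makes Paillier useful for encrypted counts; it is not the authors' scheme. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are hypothetical, the generator is assumed fixed to g = n + 1, and the primes are toy-sized and insecure.

```python
import math
import random

def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(pub, m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover m via L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)

# Hypothetical usage: add two encrypted counts without decrypting either one.
pub, priv = keygen(61, 53)
total = decrypt(pub, priv, add_encrypted(pub, encrypt(pub, 17), encrypt(pub, 25)))
```

Because ciphertexts are randomized, encrypting the same count twice yields different ciphertexts, yet sums computed on ciphertexts still decrypt correctly as long as the result stays below n.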
Knowledge base question answering (KBQA) refers to the task of answering natural language questions using factual information from large-scale knowledge bases (KBs). To obtain accurate answers, recent research optimizes semantic parsing methods, a major KBQA approach, with large language models (LLMs), where concise logical forms (LFs) are generated by LLMs and executed in KBs. Although these methods demonstrate superior performance, they still encounter the problem that some generated LFs fail to yield answers when executed, significantly limiting their effectiveness. To mitigate this issue, we propose KARV, a Knowledge-Assisted reasoning path Reconstruction and hierarchical Voting approach for non-executable LFs. This method extracts semantic knowledge from KBs as guidance to correct and reconstruct reasoning paths, deriving answers through a voting-based strategy. The insight is that non-executable LFs generated by LLMs still contain rich semantic information, and the knowledge retrieved from KBs can effectively correct them. Specifically, we fine-tune LLMs to generate high-quality LFs, and the non-executable LFs are decomposed into multiple path branches based on the mentioned entities. Semantic knowledge from KBs is then leveraged to correct the entities and relations within these branches, effectively reconstructing the reasoning paths. To obtain precise final answers, we apply a hierarchical voting strategy both within and across the non-executable LFs. Our proposed method achieves state-of-the-art performance on benchmarks including WebQuestionSP (WebQSP), ComplexWebQuestions (CWQ), and FreebaseQA.
Optimizing KBQA by Correcting LLM-Generated Non-Executable Logical Form Through Knowledge-Assisted Path Reconstruction. Ranran Bu; Jianqi Gao; Jian Cao; Hongming Cai; Jinghua Tang; Yonggang Zhang. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 1871-1884. Pub Date: 2026-01-21. DOI: 10.1109/TKDE.2026.3656646
Graph-based Twitter bot detectors have proven more effective than feature-based and text-based ones. Mainstream detectors employ only friend relationships, which brings two limitations: (i) friend relationships are sparse and ignore implicit interactions between users, and (ii) bots follow humans to expand their influence, challenging the homophily principle. This paper aims to learn a homophilous context graph containing implicit interactions, which faces two challenges: (i) existing homophily measures are influenced by the class distribution, which makes them unsuitable for the class imbalance of bot detection, and (ii) existing graph learning paradigms introduce noisy neighbors and consume computing resources. To this end, we first propose a class-independent homophily measure, which is proven to be robust to class distribution. Meanwhile, we propose HCGBot, which transforms graph learning into similarity metric learning. HCGBot contains a neighbor-mask GNN layer, which masks users that rarely interact implicitly and extracts topology and weight information from the context graph. Finally, we design a hybrid loss to optimize HCGBot, which maximizes the class-independent homophily measure while detecting bots. Extensive experiments show that HCGBot achieves the best performance and learns a more homophilous context graph with high efficiency. Further analysis illustrates that HCGBot can detect social bots in more realistic situations.
HCGBot: Learning Homophilous Context Graphs for Twitter Bot Detection. Herun Wan; Minnan Luo; Jihong Wang; Xiaojun Chang; Qinghua Zheng. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 1813-1825. Pub Date: 2026-01-21. DOI: 10.1109/TKDE.2026.3656720
Pub Date: 2026-01-21. DOI: 10.1109/TKDE.2026.3656436
Guojie Li; Zhiwen Yu; Ziwei Fan; Kaixiang Yang; C. L. Philip Chen
Graph-based clustering has been extensively explored and applied due to its exceptional performance. However, most existing methods operate directly in the original high-dimensional space, where complex nonlinear structures and redundant noisy features often obscure the intrinsic data distribution. Consequently, constructing a reliable similarity graph in such a space is inherently challenging, as uncertainty and noise can significantly degrade clustering performance. To address this issue, this paper proposes a novel graph-based clustering method, Weighted Subspace Graph Learning (WSGL). Specifically, WSGL leverages kernel principal component analysis (Kernel PCA) to construct multiple kernel-based subspaces, effectively capturing nonlinear structures while reducing redundancy and noise. This strategy enhances subspace features from different perspectives, providing a more comprehensive understanding of the data distribution. Next, WSGL learns pairwise relationships across these subspaces, fully exploiting their complementary information to mitigate the limitations of relying on a single original space for capturing the global data structure. Furthermore, to ensure that the learned similarity graph preserves the same number of connected components as the ground-truth clusters, we impose a low-rank constraint on the graph structure. Additionally, considering the varying quality of different subspaces, WSGL introduces a dynamic weighting mechanism that adaptively assigns weights to subspaces based on their contribution to clustering performance, allowing high-quality subspaces to play a more dominant role in the final clustering results. Extensive experiments on multiple high-dimensional datasets demonstrate that WSGL surpasses state-of-the-art methods, validating its effectiveness and superiority in complex clustering tasks.
Weighted Subspace Graph Learning for High-Dimensional Data. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 2094-2107.
Pub Date: 2026-01-20. DOI: 10.1109/TKDE.2026.3656194
Liping Yi; Han Yu; Gang Wang; Xiaoguang Liu; Qinghua Hu
With growing client diversity, model-heterogeneous personalized federated learning (MHPFL) supports collaboration over structure-heterogeneous client models. However, existing MHPFL methods achieve only client-level personalization and ignore inherent discrepancies among each client’s data samples, leading to limited model performance. To this end, we propose a novel model-heterogeneous personalized Federated learning with Mixture of Experts (pFedMoE) to achieve fine-grained data-level personalization. As the first work that incorporates MoE in MHPFL, it introduces three innovations: (1) Because different clients hold heterogeneous local models, we add a small proxy global homogeneous feature extractor shared by clients for knowledge exchange. (2) To achieve fine-grained data-level personalization, we construct a personalized local MoE for each client: a local expert (the local heterogeneous client model’s feature extractor), a global expert (the global proxy homogeneous feature extractor), and a local personalized gating network, which dynamically balances the generalization and personalization of the local model at the data sample level. (3) We customize a lightweight linear gating network to capture the generalized and personalized data characteristics of each local data sample. We theoretically prove its $\mathcal{O}(1/T)$ convergence rate. Experiments on 3 benchmark image datasets, 1 real-world image dataset, and 1 real-world text dataset against 9 baselines demonstrate its state-of-the-art model accuracy, with up to 2.79% accuracy improvement, while saving up to 43.12% of computational overhead and keeping communication costs satisfactory.
pFedMoE: Data-Level Personalization With Mixture of Experts in Model-Heterogeneous Personalized Federated Learning. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 1905-1918.
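The per-sample mixing of a local and a global expert described in the pFedMoE abstract above can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the gate is assumed to be a single linear layer followed by a softmax over two experts, both experts are assumed to emit features of the same length, and every name (`softmax`, `linear`, `moe_features`) is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def moe_features(x, local_expert, global_expert, gate_weights, gate_bias):
    """Mix two experts' features using gate weights computed from x itself."""
    gate = softmax(linear(gate_weights, gate_bias, x))  # two weights summing to 1
    h_local = local_expert(x)    # stands in for the client's own feature extractor
    h_global = global_expert(x)  # stands in for the shared proxy feature extractor
    mixed = [gate[0] * a + gate[1] * b for a, b in zip(h_local, h_global)]
    return mixed, gate

# Hypothetical usage with stand-in experts and an untrained (all-zero) gate.
local_expert = lambda x: [2.0 * v for v in x]
global_expert = lambda x: [v + 1.0 for v in x]
mixed, gate = moe_features([1.0, -2.0], local_expert, global_expert,
                           [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

With an all-zero gate, both experts receive weight 0.5; a trained gate would shift that balance for each sample, which is the sense in which the personalization is data-level rather than client-level.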
As geospatial data from web platforms becomes increasingly accessible and regularly updated, urban representation learning has emerged as a critical research area for advancing urban planning. Recent studies have developed foundation model-based algorithms to leverage this data for various urban-related downstream tasks. However, current research has inadequately explored deep integration strategies for multiscale, multimodal urban data in the context of urban foundation models. This gap arises primarily because the relationships between micro-scale (e.g., individual points of interest and street view imagery) and macro-scale (e.g., region-wide satellite imagery) urban features are inherently implicit and highly complex, making traditional interaction modeling insufficient. This paper introduces a novel research problem – how to learn multiscale urban representations by integrating diverse geographic data modalities and modeling complex multimodal relationships across different spatial scales. To address this significant challenge, we propose UrbanMFM, a spatial graph-based multiscale foundation model framework explicitly designed to capture and leverage these intricate relationships. UrbanMFM utilizes a self-supervised learning paradigm that integrates diverse geographic data modalities, including POI data and urban imagery, through novel contrastive learning objectives and advanced sampling techniques. By explicitly modeling spatial graphs to represent complex multiscale urban relationships, UrbanMFM effectively facilitates deep interactions between multimodal data sources. Extensive experiments on datasets from Singapore, New York, and Beijing demonstrate that UrbanMFM outperforms the strongest baselines significantly in four representative downstream tasks. By effectively modeling spatial hierarchies with diverse data, UrbanMFM provides a more comprehensive and adaptable representation of urban environments.
UrbanMFM: Spatial Graph-Based Multiscale Foundation Models for Learning Generalized Urban Representation. Zhaoqi Zhang; Miao Xie; Pasquale Balsebre; Weiming Huang; Siqiang Luo; Gao Cong. IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 2064-2078. Pub Date: 2026-01-20. DOI: 10.1109/TKDE.2026.3656202
In information systems lacking decision-making information, effectively leveraging fuzzy rough sets for outlier detection in complex data is challenging, especially when it comes to capturing inherent uncertainty and multi-granularity characteristics to construct discriminative outlier scores. Existing fuzzy rough set-based outlier detection methods often suffer from three key limitations: (1) local data distributions are often ignored when calculating fuzzy relation matrices, resulting in inaccurate fuzzy similarity representations; (2) using all objects in the fuzzy upper and lower approximations can weaken noise resistance and increase computational complexity; (3) single-granularity data processing reduces efficiency and may fail to capture the multi-granularity nature of data, thereby limiting the adaptability of these methods in complex data environments. To address these issues, we propose fusing Natural neighbor fuzzy approximations with Granular-ball representation for Outlier Detection (NGOD), which integrates the multi-granularity granular-ball representation and fuzzy rough sets to improve the effectiveness and robustness of unsupervised outlier detection. Specifically, we first define a local distribution-aware fuzzy relation, enabling more discriminative similarity calculations between samples. To improve the effectiveness and robustness of fuzzy upper and lower approximations, we propose a multi-granularity natural neighbor fuzzy approximation model, which effectively utilizes the inherent uncertainty and local abnormal information of data in approximations. Moreover, by introducing natural neighbors, NGOD can adaptively capture local abnormal information in the data without setting neighborhoods manually. Finally, NGOD calculates an outlier factor for each sample to measure its outlier degree. Extensive experiments on diverse datasets demonstrate that NGOD outperforms state-of-the-art methods, validating its superior performance and adaptability.
{"title":"Natural Neighbor Fuzzy Approximations With Granular-Ball Representation for Outlier Detection","authors":"Xinyu Su;Zheng Li;Dezhong Peng;Hongmei Chen;Zhong Yuan","doi":"10.1109/TKDE.2026.3656418","DOIUrl":"https://doi.org/10.1109/TKDE.2026.3656418","url":null,"PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"38 3","pages":"1857-1870"},"PeriodicalIF":10.4,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146162193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}