
IEEE Transactions on Knowledge and Data Engineering: Latest Articles

Preference Aware Item Cold-Start Recommendation With Hierarchical Item Alignment
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-23 DOI: 10.1109/TKDE.2025.3613263
Wenbo Wang;Ben Chen;Bingquan Liu;Lili Shan;Chengjie Sun;Qian Chen;Feiyang Xiao;Jian Guan
Existing cold-start recommendation methods typically use item-level alignment strategies to align the content feature and collaborative feature of warm items during model training. However, these methods are less effective for cold items with low semantic similarity to the warm items when they first appear in the test stage, as they have no historical interactions to obtain the collaborative feature. In this paper, we propose a preference aware recommendation (PARec) model with hierarchical item alignment to solve the item cold-start issue. Our approach exploits user preference from historical records to achieve group-level alignment with item content feature, enhancing recommendation performance. Specifically, our hierarchical item alignment strategy improves recommendations for both high and low similarity cold items by using item-level alignment for high similarity cold items and introducing group-level alignment for low similarity cold items. Low similarity cold items can be successfully recommended through relationships among items, captured by our group-level alignment, based on their co-occurrence possibilities and semantic similarities. For model training, a hierarchical contrastive objective function is presented to balance the performance of warm and cold items, achieving better overall performance. Extensive experiments demonstrate the effectiveness of our method, with results showing its superiority compared to state-of-the-art approaches.
Vol. 37, No. 12, pp. 7388-7401
Citations: 0
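The PARec abstract above mentions aligning each warm item's content feature with its collaborative feature through a contrastive objective. The abstract does not give the loss itself, so the following is only a minimal sketch of a generic InfoNCE-style item-level alignment loss, not the paper's hierarchical objective. The function names, the cosine similarity, and the `temperature` value are assumptions made for illustration, and all vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(content, collab, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's objective).

    content[i] and collab[i] are the two views of item i. For each item,
    the matching collaborative vector is the positive and the collaborative
    vectors of all other items are the negatives.
    """
    losses = []
    for i, c in enumerate(content):
        logits = [cosine(c, z) / temperature for z in collab]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

A lower value means the two views of the same item are closer to each other than to the views of other items; swapping the collaborative vectors between items drives the loss up.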
LBF-VQA: Towards Language Bias-Free Visual Question Answering With Multi-Space Collaborative Debiasing
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-23 DOI: 10.1109/TKDE.2025.3613421
Yishu Liu;Huanjia Zhu;Bingzhi Chen;Xiaozhao Fang;Guangming Lu;Shengli Xie
Visual Question Answering (VQA), aimed at improving AI-driven interactions and solving complex visual-linguistic tasks, has increasingly garnered attention as a pivotal research domain in both academic and industrial spheres. Despite progress in VQA, current studies still suffer from the challenge of language bias posed by spurious semantic correlations and minority class collapse, leading to semantic ambiguities and distribution shifts that hinder robust performance across challenging scenarios. To address these challenges, we propose a robust multi-space collaborative debiasing paradigm, termed “LBF-VQA”, which systematically leverages multi-space collaborative debiasing strategies to achieve language bias-free VQA, encompassing both Euclidean space debiasing (ESD) and Spherical space debiasing (SSD). By strategically introducing bias-examples and their corresponding counter-examples, the ESD strategy focuses on uncovering hidden prior correlations and the complex interactions between modality and semantics within the Euclidean space. Benefiting from the infinite contrastive and distribution debiasing learning mechanisms, the SSD strategy is devoted to effectively preventing the collapse of minority classes while enhancing the manifold representations of instance de-bias and distribution de-dependence in the Spherical space. Furthermore, we meticulously constructed a specialized medical dataset deliberately embedded with language bias to comprehensively examine the negative effects of language bias on medical VQA systems. Extensive experiments on multiple general and medical VQA benchmarks consistently verify that LBF-VQA handles various complex VQA scenarios more effectively and generalizes better than state-of-the-art baselines.
Vol. 37, No. 12, pp. 7255-7271
Citations: 0
Unsupervised Entity Alignment Based on Personalized Discriminative Rooted Tree
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-22 DOI: 10.1109/TKDE.2025.3607765
Yaming Yang;Zhe Wang;Ziyu Guan;Wei Zhao;Xinyan Huang;Xiaofei He
Entity Alignment (EA) aims to link potentially equivalent entities across different knowledge graphs (KGs). Most existing EA methods are supervised as they require the supervision of seed alignments, i.e., manually specified aligned entity pairs. Very recently, several EA studies have made some attempts to get rid of seed alignments. Despite achieving preliminary progress, they still suffer from two limitations: (1) The entity embeddings produced by their GNN-like encoders lack personalization since some of the aggregation subpaths are shared between different entities. (2) They cannot fully alleviate the distribution distortion issue between candidate KGs due to the absence of supervised signals. In this work, we propose a novel unsupervised entity alignment approach called UNEA to address the above two issues. First, we parametrically sample a tree neighborhood rooted at each entity, and accordingly develop a tree attention aggregation mechanism to extract a personalized embedding for each entity. Second, we introduce an auxiliary task of maximizing the mutual information between the input and the output of the KG encoder, which serves as a regularization to prevent the distribution distortion. Extensive experiments show that our UNEA achieves a new state-of-the-art for the unsupervised EA task, and can even outperform many existing supervised EA baselines.
Vol. 37, No. 12, pp. 7440-7452
Citations: 0
Online Billboard Auction With Social Welfare Maximization
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-22 DOI: 10.1109/TKDE.2025.3613148
Hao Huang;Mingxin Wang;Mengqi Shan;Zhigao Zheng;Ting Gan;Jiawei Jiang;Zongpeng Li
Outdoor billboard advertising has proven effective for commercial promotions, attracting potential customers, and boosting product sales. Auction serves as a popular method for leasing billboard usage rights, enabling a seller to rent billboards to winning users for predefined periods according to their bids. An effective auction algorithm is of great significance to maximize the efficiency of the billboard ecosystem. In contrast to a rich literature on Internet advertising auctions, well-crafted algorithms tailored for outdoor billboard auctions remain rare. In this work, we investigate the problem of outdoor billboard auctions, in the practical setting where bids are received and processed on the fly. Our goal is to maximize social welfare, namely the total benefits of auction participants, including the billboard service provider and the bidding users. To this end, we first formulate the billboard social welfare maximization problem into an Integer Linear Problem (ILP), and then reformulate the ILP into a compact form with a reduced size of constraints (at the cost of involving exponentially many primal variables), based on which we derive the dual problem. Furthermore, we design a dual oracle to handle the exponentially many dual constraints, avoiding exhaustive enumeration. We present a primal-dual online algorithm with an incentive-compatible pricing mechanism. Theoretical analysis proves the individual rationality, incentive compatibility, and computational efficiency of our online algorithm. Extensive experimental results show that the online algorithm is both effective and efficient, and achieves a good competitive ratio.
Vol. 37, No. 12, pp. 7362-7373
Citations: 0
Long-Term Urban Flow Prediction Against Data Distribution Shift: A Causal Perspective
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-19 DOI: 10.1109/TKDE.2025.3612033
Yuting Liu;Qiang Zhou;Hanzhe Li;Fuzhen Zhuang;Jingjing Gu
The demand for more precise and timely urban resource allocation and management has driven the extension of urban flow prediction from short-term to long-term horizons. As the time scale expands, the issue of urban flow distribution shift becomes increasingly prominent due to various impact factors, such as weather, events, and city changes. Traditionally, comprehensively analyzing and addressing the causal relationships underlying the distribution shift caused by these factors has been challenging. In this paper, we propose that these impact factors can be partitioned into two major types, i.e., context factors and structural factors. We then present a decomposition-based model for long-term urban flow prediction from a causal perspective, named DeCau, which can discriminate between the two types of factors for effectively solving the problem of urban flow distribution shift. First, we employ a decomposition module to decompose urban flow into a seasonal part and a trend part. The seasonal part contains high-frequency irregular variations caused by context factors. We devise a shared distribution estimator to approximate the unavailable prior distributions of context factors, and then apply causal intervention to mitigate the confounding impact of context factors. The distribution shift in the trend part is induced by structural factors. We design a dual causal dependency extractor to model the causality between POI distribution and urban flow, and then eliminate spurious correlations through causal adjustment. Finally, we design an end-to-end framework for long-term urban flow prediction by combining the embeddings from the two parts, enabling the model to generalize to unseen distributions. Extensive experimental results demonstrate that DeCau outperforms state-of-the-art baselines.
Vol. 37, No. 12, pp. 7286-7299
Citations: 0
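The DeCau abstract above starts from a decomposition of urban flow into a seasonal part and a trend part. The abstract does not say how the decomposition module is built, so the sketch below only illustrates one common choice: a centered moving average as the trend and the residual as the seasonal part. The function name `decompose` and the `window` parameter are hypothetical.

```python
def decompose(series, window=3):
    """Split a series into (seasonal, trend) using a centered moving average.

    Illustrative only: the trend is a moving average over `window` points,
    and the seasonal part is whatever remains after subtracting the trend.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Replicate the edge values so the trend has the same length as the input.
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [sum(padded[i:i + window]) / window for i in range(len(series))]
    seasonal = [x - t for x, t in zip(series, trend)]
    return seasonal, trend
```

By construction the two parts add back up to the original series, so a downstream model can treat each part separately without losing information.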
Structural Clustering for Bipartite Graphs
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-19 DOI: 10.1109/TKDE.2025.3612290
Mingyu Yang;Wentao Li;Wei Wang;Dong Wen;Min Gao;Lu Qin
Bipartite graphs are widely used in many real-world applications, where discovering clusters is crucial for understanding their underlying structure. However, most existing clustering methods for bipartite graphs enforce the assignment of all vertices to clusters, often neglecting the important roles of outliers and hubs. To address this limitation, we plan to extend the structural clustering model from unipartite to bipartite graphs. This extension is non-trivial due to the lack of common neighbors in bipartite graphs, which renders traditional similarity measures less effective. Recognizing that similarity is key to structural clustering, we resort to butterflies, the fundamental building blocks of bipartite graphs, to define a more effective similarity measure. Building on this, we further propose a novel structural clustering model, $\mathsf{SBC}$, tailored for bipartite graphs. To enable clustering under this model, we develop efficient online and index-based methods, along with a dynamic maintenance method to accommodate graph updates over time. Extensive experiments on real-world bipartite graphs demonstrate that: (1) The $\mathsf{SBC}$ model greatly enhances clustering quality, achieving higher modularity while effectively identifying outliers and hubs. (2) Our proposed clustering methods are highly scalable, enabling the processing of graphs with up to 12.2 million edges within 2 seconds.
Vol. 38, No. 1, pp. 645-658
Citations: 0
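The structural clustering abstract above builds its similarity measure on butterflies. A butterfly is a complete 2x2 biclique: two vertices on one side that share two neighbors on the other side. The abstract does not give the similarity formula, so the sketch below only counts butterflies, which is the quantity such a measure would start from. The adjacency layout (each left-side vertex mapped to a set of right-side neighbors) and the function names are assumptions for illustration.

```python
from itertools import combinations
from math import comb


def butterflies_between(adj, u, v):
    """Number of butterflies containing both same-side vertices u and v.

    Every pair of common neighbors of u and v forms one butterfly,
    so the count is C(number of common neighbors, 2).
    """
    common = len(adj[u] & adj[v])
    return comb(common, 2)


def total_butterflies(adj):
    """Total butterflies in the graph, summed over all left-side vertex pairs."""
    return sum(butterflies_between(adj, u, v) for u, v in combinations(adj, 2))
```

For example, with `adj = {"u1": {"a", "b", "c"}, "u2": {"a", "b"}, "u3": {"c"}}`, only the pair `u1`, `u2` shares two neighbors, so the graph contains exactly one butterfly.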
Toward Balanced Denoising: Building a Structural and Textual Denoiser for Table Understanding
IF 10.4 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-09-19 DOI: 10.1109/TKDE.2025.3612217
Shu-Xun Yang;Xian-Ling Mao;Yu-Ming Shang;Heyan Huang
Recently, large language models (LLMs) have made remarkable progress in table understanding, yet they remain vulnerable to structural noise (SN) and textual noise (TN). Existing methods usually employ biased denoising strategies such as structural matching and textual filtering, or overzealous denoising strategies such as introducing supplementary tasks like text-to-SQL and table-to-text to reduce these two types of noise. However, these methods either neglect one type of noise or introduce substantial external noise. Therefore, how to simultaneously mitigate the structural and textual noise without introducing extra noise and improve the performance of LLMs in table understanding is still an unresolved issue. In this paper, we rethink the bottlenecks in table understanding from the perspective of noise reduction and propose a novel dual-denoiser-reasoner model, called TabDDR, for balanced and effective denoising. Specifically, our model consists of a structural-and-textual denoiser and a task-adaptive reasoner. The former removes two types of noise via triplet alignment and planning extraction to seek an interpretable balance between breaking structural barriers and preserving structural characteristics, eliminating textual noise and retaining maximal information; the latter ensures a simple but effective reasoning process which can adapt to various downstream tasks. To highlight the presence and impact of the structural and textual noise, we construct the WTQ-SN and WTQ-TN datasets based on the WikiTableQuestion (WTQ) dataset. Extensive experiments on these self-constructed datasets and two other public datasets demonstrate that our proposed method performs better than state-of-the-art baselines.
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 12, pp. 7414-7425. Published 2025-09-19. https://doi.org/10.1109/TKDE.2025.3612217
Citations: 0
PiTruss Community Search for Multilayer Graphs
IF 10.4 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-19 | DOI: 10.1109/TKDE.2025.3610998
Run-An Wang;Zhaonian Zou;Dandan Liu;Xudong Liu
Community search on multilayer graphs has significant applications in fields such as bioinformatics, social network analysis, and financial fraud detection, offering deeper insights compared to traditional community search on single-layer graphs. However, existing approaches often suffer from several key limitations, including inefficiency and a lack of flexibility in accommodating query requirements. To address these challenges, we investigate the problem of community search over large multilayer graphs. Specifically, we introduce a novel multilayer community model called PivotTruss Community (PiTC) with provably nice structural guarantees. We formalize the PiTC search (PiTCS) problem, which aims to efficiently identify personalized PiTCs for a given query vertex. To solve the PiTCS problem, we propose an efficient algorithm and design an elegant index to accelerate the search process. In addition, we propose a parameter recommendation method to improve the usability of PiTCS. To further optimize performance, we introduce a method to compact the index by making a trade-off between search time and index size. Extensive experiments on real-world datasets demonstrate the effectiveness and efficiency of our proposed algorithms.
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 12, pp. 7374-7387. Published 2025-09-19. https://doi.org/10.1109/TKDE.2025.3610998
Citations: 0
GDiffMAE: Guided Diffusion Enhanced Mask Graph AutoEncoder for Recommendation
IF 10.4 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-17 | DOI: 10.1109/TKDE.2025.3611270
Lei Zhang;Zihao Chen;Wuji Zhang;Hongke Zhao;Likang Wu
Despite advancements using graph neural networks (GNNs) to capture complex user-item interactions, challenges persist due to data sparsity and noise. To address these, self-supervised learning (SSL) methods, particularly recent generative approaches, have gained attention due to their ability to augment graph data without requiring complex view constructions and unstable negative sampling. However, existing generative SSL solutions often focus on structural rather than semantic (refer to collaborative signals in recommendation scenarios) reconstruction, limiting their potential as comprehensive recommender. This paper explores the untapped potential of generative SSL for graph-based recommender systems. We highlight two critical challenges: firstly, designing effective diffusion mechanisms to enhance semantic information and collaborative signals while avoiding optimization biases; and secondly, developing adaptive structural masking mechanisms within graph diffusion to improve overall model performance. Motivated by these challenges, we propose a novel approach: the Guided Diffusion enhanced Mask graph AutoEncoder (GDiffMAE). GDiffMAE integrates an adaptive mask encoder for structural reconstruction and a guided diffusion model for semantic reconstruction, addressing the limitations of current methods. Experimental results on diverse datasets demonstrate that GDiffMAE consistently outperforms powerful baseline models, particularly in handling noisy data scenarios. By enhancing both structural and semantic dimensions through guided diffusion, our model advances the state-of-the-art in graph-based recommender systems.
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 12, pp. 7199-7212. Published 2025-09-17. https://doi.org/10.1109/TKDE.2025.3611270
Citations: 0
MSC-DOLES: Multi-View Subspace Clustering in Diverse Orthogonal Latent Embedding Spaces
IF 10.4 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-17 | DOI: 10.1109/TKDE.2025.3610659
Yuan Fang;Geping Yang;Ruichu Cai;Yiyang Yang;Zhiguo Gong;Can Chen;Zhifeng Hao
In the domain of Multi-view Subspace Clustering (MSC) in Latent Embedding Space (LES), existing methods aim to capture and leverage critical multi-view information by mapping it into a low-dimensional LES. However, several aspects can be further improved: (i) Fusion Strategy: Existing methods adopt either early fusion or late fusion to integrate multi-view information, limiting the effectiveness of the fusion. (ii) Diversity: Current methods often overlook the inherent diversity in the multi-view data by focusing on a single LES. (iii) Efficiency: LES-based methods exhibit high computational complexity, with cubic time and quadratic space requirements based on the number of samples. To address these issues, we propose a novel framework called MSC-DOLES (Multi-view Subspace Clustering in Diverse Orthogonal Latent Embedding Spaces), a novel framework designed to tackle these challenges. MSC-DOLES incorporates a two-stage fusion approach that generates and learns from multiple LES to maximize cross-view diversity. Orthogonality constraints on individual LES ensure view-internal diversity, resulting in a set of Diverse Orthogonal Latent Embedding Spaces (DOLES). The DOLES are then fused into a consensus anchor graph using learnable anchors. The final clustering is induced by partitioning the obtained graph without pre-processing. We develop an eight-step optimization algorithm for MSC-DOLES, which exhibits nearly linear time and space complexities relative to the number of samples. Extensive experiments demonstrate the superiority of MSC-DOLES over state-of-the-art methods.
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 12, pp. 7315-7327. Published 2025-09-17. https://doi.org/10.1109/TKDE.2025.3610659
Citations: 0