Hierarchical Deep Document Model
Pub Date: 2024-10-29 | DOI: 10.1109/TKDE.2024.3487523
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 1, pp. 351-364
Yi Yang;John P. Lalor;Ahmed Abbasi;Daniel Dajun Zeng
Topic modeling is a commonly used text analysis tool for discovering latent topics in a text corpus. However, while topics in a text corpus often exhibit a hierarchical structure (e.g., cellphone is a sub-topic of electronics), most topic modeling methods assume a flat topic structure that ignores the hierarchical dependency among topics, or utilize a predefined topic hierarchy. In this work, we present a novel Hierarchical Deep Document Model (HDDM) to learn topic hierarchies using a variational autoencoder framework. We propose a novel objective function, sum of log likelihood, instead of the widely used evidence lower bound, to facilitate the learning of hierarchical latent topic structure. The proposed objective function can directly model and optimize the hierarchical topic-word distributions at all topic levels. We conduct experiments on four real-world text datasets to evaluate the topic modeling capability of the proposed HDDM method compared to state-of-the-art hierarchical topic modeling benchmarks. Experimental results show that HDDM achieves considerable improvement over benchmarks and is capable of learning meaningful topics and topic hierarchies. To further demonstrate the practical utility of HDDM, we apply it to a real-world medical notes dataset for clinical prediction. Experimental results show that HDDM can better summarize topics in medical notes, resulting in more accurate clinical predictions.
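The key novelty in this abstract is the training objective: a reconstruction log-likelihood term at every level of the topic hierarchy, rather than a single-level ELBO. Below is a minimal sketch of that idea under explicit assumptions: the class name, the two-level structure, the Gaussian encoder, and the softmax parameterization are all illustrative choices, not the authors' implementation, and the paper's exact sum-of-log-likelihood formulation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelTopicVAE(nn.Module):
    """Toy two-level topic VAE: each level has its own topic-word matrix,
    and the loss sums the bag-of-words reconstruction log-likelihood at
    every level instead of using a single-level ELBO reconstruction term.
    (Hypothetical sketch, not the HDDM architecture from the paper.)"""
    def __init__(self, vocab_size, n_child_topics=32, n_parent_topics=8):
        super().__init__()
        self.encoder = nn.Linear(vocab_size, n_child_topics * 2)  # mu, logvar
        self.child_to_parent = nn.Linear(n_child_topics, n_parent_topics)
        # One topic-word distribution per level of the hierarchy.
        self.beta_child = nn.Linear(n_child_topics, vocab_size, bias=False)
        self.beta_parent = nn.Linear(n_parent_topics, vocab_size, bias=False)

    def forward(self, bow):                      # bow: (batch, vocab) counts
        mu, logvar = self.encoder(bow).chunk(2, dim=-1)
        z_child = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        theta_child = F.softmax(z_child, dim=-1)             # child topic mix
        theta_parent = F.softmax(self.child_to_parent(theta_child), dim=-1)
        log_probs = [
            F.log_softmax(self.beta_child(theta_child), dim=-1),
            F.log_softmax(self.beta_parent(theta_parent), dim=-1),
        ]
        # "Sum of log likelihood": a reconstruction term at every level,
        # so the parent-level topic-word distribution is trained directly.
        rec = sum((bow * lp).sum(-1) for lp in log_probs)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
        return -(rec - kl).mean()                # loss to minimize

loss = TwoLevelTopicVAE(vocab_size=2000)(torch.rand(4, 2000))
loss.backward()
```

The point of summing per-level terms is that each topic-word matrix receives a direct gradient from the data, rather than only the leaf level being supervised through reconstruction.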
{"title":"Hierarchical Deep Document Model","authors":"Yi Yang;John P. Lalor;Ahmed Abbasi;Daniel Dajun Zeng","doi":"10.1109/TKDE.2024.3487523","DOIUrl":"https://doi.org/10.1109/TKDE.2024.3487523","url":null,"abstract":"Topic modeling is a commonly used text analysis tool for discovering latent topics in a text corpus. However, while topics in a text corpus often exhibit a hierarchical structure (e.g., cellphone is a sub-topic of electronics), most topic modeling methods assume a flat topic structure that ignores the hierarchical dependency among topics, or utilize a predefined topic hierarchy. In this work, we present a novel Hierarchical Deep Document Model (HDDM) to learn topic hierarchies using a variational autoencoder framework. We propose a novel objective function, sum of log likelihood, instead of the widely used evidence lower bound, to facilitate the learning of hierarchical latent topic structure. The proposed objective function can directly model and optimize the hierarchical topic-word distributions at all topic levels. We conduct experiments on four real-world text datasets to evaluate the topic modeling capability of the proposed HDDM method compared to state-of-the-art hierarchical topic modeling benchmarks. Experimental results show that HDDM achieves considerable improvement over benchmarks and is capable of learning meaningful topics and topic hierarchies. To further demonstrate the practical utility of HDDM, we apply it to a real-world medical notes dataset for clinical prediction. Experimental results show that HDDM can better summarize topics in medical notes, resulting in more accurate clinical predictions.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 1","pages":"351-364"},"PeriodicalIF":8.9,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Handling Low Homophily in Recommender Systems With Partitioned Graph Transformer
Pub Date: 2024-10-28 | DOI: 10.1109/TKDE.2024.3485880
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 1, pp. 334-350
Thanh Tam Nguyen;Thanh Toan Nguyen;Matthias Weidlich;Jun Jo;Quoc Viet Hung Nguyen;Hongzhi Yin;Alan Wee-Chung Liew
Modern recommender systems derive predictions from an interaction graph that links users and items. To this end, many of today's state-of-the-art systems use graph neural networks (GNNs) to learn effective representations of these graphs under the assumption of homophily, i.e., the idea that similar users will sit close to each other in the graph. However, recent studies have revealed that real-world recommendation graphs are often heterophilous, i.e., dissimilar users will also often sit close to each other. One of the reasons for this heterophily is shilling attacks, which obscure the inherent characteristics of the graph and, as a consequence, make the derived recommendations less accurate. Hence, to cope with low homophily in recommender systems, we propose a recommendation model called PGT4Rec that is based on a Partitioned Graph Transformer. The model integrates label information into the learning process, which allows discriminative neighbourhoods of users to be generated. As such, the framework can both detect shilling attacks and predict user ratings for items. Extensive experiments on real and synthetic datasets show that PGT4Rec not only provides superior performance on these two tasks but also exhibits significant robustness to a range of adversarial conditions.
Expressiveness Analysis and Enhancing Framework for Geometric Knowledge Graph Embedding Models
Pub Date: 2024-10-28 | DOI: 10.1109/TKDE.2024.3486915
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 1, pp. 306-318
Tengwei Song;Long Yin;Yang Liu;Long Liao;Jie Luo;Zhiqiang Xu
Existing geometric knowledge graph embedding methods employ various relational transformations, such as translation, rotation, and projection, to model different relation patterns, aiming to enhance the expressiveness of the models. In contrast to current approaches that treat a model's expressiveness as a binary issue, we analyze how difficult it is for geometric knowledge graph embedding models to represent relation patterns. In this paper, we provide a theoretical analysis framework that measures a model's expressiveness for relation patterns by quantifying the size of the solution space of the associated linear equation systems. Additionally, we propose a mechanism for imposing relational constraints on geometric knowledge graph embedding models by setting “traps” near relational optimal solutions, which enables a model to converge more reliably to the optimal solution. Empirically, we analyze and compare several typical knowledge graph embedding models built on different geometric algebras, revealing that some models have an insufficient solution space by design, which leads to performance weaknesses. We also demonstrate that the proposed relational constraint operations improve performance on certain relation patterns. Experimental results on public benchmarks and a relation-pattern-specific dataset are consistent with our theoretical analysis.
A Fine-Grained Network for Joint Multimodal Entity-Relation Extraction
Pub Date: 2024-10-25 | DOI: 10.1109/TKDE.2024.3485107
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 1, pp. 1-14
Li Yuan;Yi Cai;Jingyu Xu;Qing Li;Tao Wang
Joint multimodal entity-relation extraction (JMERE) is a challenging task that involves two joint subtasks, i.e., named entity recognition and relation extraction, from multimodal data such as text sentences with associated images. Previous JMERE methods have primarily employed 1) pipeline models, which apply pre-trained unimodal models separately and ignore the interaction between tasks, or 2) word-pair relation tagging methods, which neglect neighboring word pairs. To address these limitations, we propose a fine-grained network for JMERE. Specifically, we introduce a fine-grained alignment module that utilizes phrase-patch alignment to establish connections between text phrases and visual objects. This module can learn consistent multimodal representations from multimodal data. Furthermore, we address the issue of task-irrelevant image information by proposing a gate fusion module, which mitigates the impact of image noise and ensures a balanced representation between image objects and text representations. Finally, we design a multi-word decoder that enables ensemble prediction of tags for each word pair. This approach leverages the predicted results of neighboring word pairs, improving the ability to extract multi-word entities. Evaluation results from a series of experiments demonstrate the superiority of our proposed model over state-of-the-art models in JMERE.
{"title":"A Fine-Grained Network for Joint Multimodal Entity-Relation Extraction","authors":"Li Yuan;Yi Cai;Jingyu Xu;Qing Li;Tao Wang","doi":"10.1109/TKDE.2024.3485107","DOIUrl":"https://doi.org/10.1109/TKDE.2024.3485107","url":null,"abstract":"Joint multimodal entity-relation extraction (JMERE) is a challenging task that involves two joint subtasks, i.e., named entity recognition and relation extraction, from multimodal data such as text sentences with associated images. Previous JMERE methods have primarily employed 1) pipeline models, which apply pre-trained unimodal models separately and ignore the interaction between tasks, or 2) word-pair relation tagging methods, which neglect neighboring word pairs. To address these limitations, we propose a fine-grained network for JMERE. Specifically, we introduce a fine-grained alignment module that utilizes a phrase-patch to establish connections between text phrases and visual objects. This module can learn consistent multimodal representations from multimodal data. Furthermore, we address the task-irrelevant image information issue by proposing a gate fusion module, which mitigates the impact of image noise and ensures a balanced representation between image objects and text representations. Furthermore, we design a multi-word decoder that enables ensemble prediction of tags for each word pair. This approach leverages the predicted results of neighboring word pairs, improving the ability to extract multi-word entities. Evaluation results from a series of experiments demonstrate the superiority of our proposed model over state-of-the-art models in JMERE.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 1","pages":"1-14"},"PeriodicalIF":8.9,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142797912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-25 | DOI: 10.1109/TKDE.2024.3486530
Guangjie Zeng;Hao Peng;Angsheng Li;Jia Wu;Chunyang Liu;Philip S. Yu
Semi-supervised clustering leverages prior information in the form of constraints to achieve higher-quality clustering outcomes. However, most existing methods struggle with large-scale datasets owing to their high time and space complexity. Moreover, they encounter the challenge of seamlessly integrating various constraints, thereby limiting their applicability. In this paper, we present S