Jhih-Chen Liu, Chiao-Ting Chen, Chi Lee, Szu-Hao Huang
The growing number of publications in the field of artificial intelligence highlights the need for researchers to enhance their efficiency in searching for relevant articles. Most paper recommendation models either rely on simplistic citation relationships among papers or focus on content-based approaches, both of which overlook interactions within academic networks. To address the aforementioned problem, knowledge graph embedding (KGE) methods have been used for citation recommendations because recent research proving that graph representations can effectively improve recommendation model accuracy. However, academic networks are dynamic, leading to changes in the representations of users and items over time. The majority of KGE-based citation recommendations are primarily designed for static graphs, thus failing to capture the evolution of dynamic knowledge graph (DKG) structures. To address these challenges, we introduced the evolving knowledge graph embedding (EKGE) method. In this methodology, evolving knowledge graphs are input into time-series models to learn the patterns of structural evolution. The model has the capability to generate embeddings for each entity at various time points, thereby overcoming limitation of static models that require retraining to acquire embeddings at each specific time point. To enhance the efficiency of feature extraction, we employed a multiple attention strategy. This helped the model find recommendation lists that are closely related to a user’s needs, leading to improved recommendation accuracy. Various experiments conducted on a citation recommendation dataset revealed that the EKGE model exhibits a 1.13% increase in prediction accuracy compared to other KGE methods. Moreover, the model’s accuracy can be further increased by an additional 0.84% through the incorporation of an attention mechanism.
{"title":"Evolving Knowledge Graph Representation Learning with Multiple Attention Strategies for Citation Recommendation System","authors":"Jhih-Chen Liu, Chiao-Ting Chen, Chi Lee, Szu-Hao Huang","doi":"10.1145/3635273","DOIUrl":"https://doi.org/10.1145/3635273","url":null,"abstract":"<p>The growing number of publications in the field of artificial intelligence highlights the need for researchers to enhance their efficiency in searching for relevant articles. Most paper recommendation models either rely on simplistic citation relationships among papers or focus on content-based approaches, both of which overlook interactions within academic networks. To address the aforementioned problem, knowledge graph embedding (KGE) methods have been used for citation recommendations because recent research proving that graph representations can effectively improve recommendation model accuracy. However, academic networks are dynamic, leading to changes in the representations of users and items over time. The majority of KGE-based citation recommendations are primarily designed for static graphs, thus failing to capture the evolution of dynamic knowledge graph (DKG) structures. To address these challenges, we introduced the evolving knowledge graph embedding (EKGE) method. In this methodology, evolving knowledge graphs are input into time-series models to learn the patterns of structural evolution. The model has the capability to generate embeddings for each entity at various time points, thereby overcoming limitation of static models that require retraining to acquire embeddings at each specific time point. To enhance the efficiency of feature extraction, we employed a multiple attention strategy. This helped the model find recommendation lists that are closely related to a user’s needs, leading to improved recommendation accuracy. Various experiments conducted on a citation recommendation dataset revealed that the EKGE model exhibits a 1.13% increase in prediction accuracy compared to other KGE methods. Moreover, the model’s accuracy can be further increased by an additional 0.84% through the incorporation of an attention mechanism.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"2 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139464262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional deep learning models.
{"title":"Explainability for Large Language Models: A Survey","authors":"Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du","doi":"10.1145/3639372","DOIUrl":"https://doi.org/10.1145/3639372","url":null,"abstract":"<p>Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional deep learning models.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"8 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139084073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The performance of machine learning algorithms can be considerably improved when trained over larger datasets. In many domains, such as medicine and finance, larger datasets can be obtained if several parties, each having access to limited amounts of data, collaborate and share their data. However, such data sharing introduces significant privacy challenges. While multiple recent studies have investigated methods for private collaborative machine learning, the fairness of such collaborative algorithms was overlooked. In this work we suggest a feasible privacy-preserving pre-process mechanism for enhancing fairness of collaborative machine learning algorithms. An extensive evaluation of the proposed method shows that it is able to enhance fairness considerably with only a minor compromise in accuracy.
{"title":"Fairness-Driven Private Collaborative Machine Learning","authors":"Dana Pessach, Tamir Tassa, Erez Shmueli","doi":"10.1145/3639368","DOIUrl":"https://doi.org/10.1145/3639368","url":null,"abstract":"<p>The performance of machine learning algorithms can be considerably improved when trained over larger datasets. In many domains, such as medicine and finance, larger datasets can be obtained if several parties, each having access to limited amounts of data, collaborate and share their data. However, such data sharing introduces significant privacy challenges. While multiple recent studies have investigated methods for private collaborative machine learning, the fairness of such collaborative algorithms was overlooked. In this work we suggest a feasible privacy-preserving pre-process mechanism for enhancing fairness of collaborative machine learning algorithms. An extensive evaluation of the proposed method shows that it is able to enhance fairness considerably with only a minor compromise in accuracy.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"46 8 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139084127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Quyang Pan, Junbo Zhang, Zeju Li, Qingxiang Liu
Federated learning (FL) is a privacy-preserving machine learning paradigm in which the server periodically aggregates local model parameters from clients without assembling their private data. Constrained communication and personalization requirements pose severe challenges to FL. Federated distillation (FD) is proposed to simultaneously address the above two problems, which exchanges knowledge between the server and clients, supporting heterogeneous local models while significantly reducing communication overhead. However, most existing FD methods require a proxy dataset, which is often unavailable in reality. A few recent proxy-data-free FD approaches can eliminate the need for additional public data, but suffer from remarkable discrepancy among local knowledge due to client-side model heterogeneity, leading to ambiguous representation on the server and inevitable accuracy degradation. To tackle this issue, we propose a proxy-data-free FD algorithm based on distributed knowledge congruence (FedDKC). FedDKC leverages well-designed refinement strategies to narrow local knowledge differences into an acceptable upper bound, so as to mitigate the negative effects of knowledge incongruence. Specifically, from perspectives of peak probability and Shannon entropy of local knowledge, we design kernel-based knowledge refinement (KKR) and searching-based knowledge refinement (SKR) respectively, and theoretically guarantee that the refined-local knowledge can satisfy an approximately-similar distribution and be regarded as congruent. Extensive experiments conducted on three common datasets demonstrate that our proposed FedDKC significantly outperforms the state-of-the-art on various heterogeneous settings while evidently improving the convergence speed.
{"title":"Exploring the Distributed Knowledge Congruence in Proxy-data-free Federated Distillation","authors":"Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Quyang Pan, Junbo Zhang, Zeju Li, Qingxiang Liu","doi":"10.1145/3639369","DOIUrl":"https://doi.org/10.1145/3639369","url":null,"abstract":"<p>Federated learning (FL) is a privacy-preserving machine learning paradigm in which the server periodically aggregates local model parameters from clients without assembling their private data. Constrained communication and personalization requirements pose severe challenges to FL. Federated distillation (FD) is proposed to simultaneously address the above two problems, which exchanges knowledge between the server and clients, supporting heterogeneous local models while significantly reducing communication overhead. However, most existing FD methods require a proxy dataset, which is often unavailable in reality. A few recent proxy-data-free FD approaches can eliminate the need for additional public data, but suffer from remarkable discrepancy among local knowledge due to client-side model heterogeneity, leading to ambiguous representation on the server and inevitable accuracy degradation. To tackle this issue, we propose a proxy-data-free FD algorithm based on distributed knowledge congruence (FedDKC). FedDKC leverages well-designed refinement strategies to narrow local knowledge differences into an acceptable upper bound, so as to mitigate the negative effects of knowledge incongruence. Specifically, from perspectives of peak probability and Shannon entropy of local knowledge, we design kernel-based knowledge refinement (KKR) and searching-based knowledge refinement (SKR) respectively, and theoretically guarantee that the refined-local knowledge can satisfy an approximately-similar distribution and be regarded as congruent. Extensive experiments conducted on three common datasets demonstrate that our proposed FedDKC significantly outperforms the state-of-the-art on various heterogeneous settings while evidently improving the convergence speed.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"45 2 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139070962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meng Xu, Xinhong Chen, Yechao She, Yang Jin, Guanyi Zhao, Jianping Wang
Multi-agent reinforcement learning (MARL) has proven effective in training multi-robot confrontation, such as StarCraft and robot soccer games. However, the current joint action policies utilized in MARL have been unsuccessful in recognizing and preventing actions that often lead to failures on our side. This exacerbates the cooperation dilemma, ultimately resulting in our agents acting independently and being defeated individually by their opponents. To tackle this challenge, we propose a novel joint action policy, referred to as the consensus action policy (CAP). Specifically, CAP records the number of times each joint action has caused our side to fail in the past and computes a cooperation tendency, which is integrated with each agent’s Q-value and Nash bargaining solution to determine a joint action. The cooperation tendency promotes team cooperation by selecting joint actions that have a high tendency of cooperation and avoiding actions that may lead to team failure. Moreover, the proposed CAP policy can be extended to partially observable scenarios by combining it with Deep Q network (DQN) or actor-critic-based methods. We conducted extensive experiments to compare the proposed method with seven existing joint action policies, including four commonly used methods and three state-of-the-art (SOTA) methods, in terms of episode rewards, winning rates, and other metrics. Our results demonstrate that this approach holds great promise for multi-robot confrontation scenarios.
事实证明,多代理强化学习(MARL)在训练多机器人对抗(如《星际争霸》和机器人足球比赛)方面非常有效。然而,目前 MARL 中使用的联合行动策略无法成功识别和防止经常导致我方失败的行动。这加剧了合作困境,最终导致我们的代理各自为政,被对手击败。为了应对这一挑战,我们提出了一种新颖的联合行动策略,即共识行动策略(CAP)。具体来说,CAP 记录了每个联合行动在过去导致我方失败的次数,并计算出合作倾向,将其与每个代理的 Q 值和纳什讨价还价方案相结合,确定联合行动。合作倾向通过选择合作倾向高的联合行动,避免可能导致团队失败的行动,从而促进团队合作。此外,通过与深度 Q 网络(DQN)或基于行动者批判的方法相结合,所提出的 CAP 策略还可以扩展到部分可观测场景。我们进行了广泛的实验,将所提出的方法与现有的七种联合行动策略(包括四种常用方法和三种最先进的(SOTA)方法)在情节奖励、胜率和其他指标方面进行了比较。我们的结果表明,这种方法在多机器人对抗场景中大有可为。
{"title":"Strengthening Cooperative Consensus in Multi-Robot Confrontation","authors":"Meng Xu, Xinhong Chen, Yechao She, Yang Jin, Guanyi Zhao, Jianping Wang","doi":"10.1145/3639371","DOIUrl":"https://doi.org/10.1145/3639371","url":null,"abstract":"<p>Multi-agent reinforcement learning (MARL) has proven effective in training multi-robot confrontation, such as StarCraft and robot soccer games. However, the current joint action policies utilized in MARL have been unsuccessful in recognizing and preventing actions that often lead to failures on our side. This exacerbates the cooperation dilemma, ultimately resulting in our agents acting independently and being defeated individually by their opponents. To tackle this challenge, we propose a novel joint action policy, referred to as the consensus action policy (CAP). Specifically, CAP records the number of times each joint action has caused our side to fail in the past and computes a cooperation tendency, which is integrated with each agent’s Q-value and Nash bargaining solution to determine a joint action. The cooperation tendency promotes team cooperation by selecting joint actions that have a high tendency of cooperation and avoiding actions that may lead to team failure. Moreover, the proposed CAP policy can be extended to partially observable scenarios by combining it with Deep Q network (DQN) or actor-critic-based methods. We conducted extensive experiments to compare the proposed method with seven existing joint action policies, including four commonly used methods and three state-of-the-art (SOTA) methods, in terms of episode rewards, winning rates, and other metrics. Our results demonstrate that this approach holds great promise for multi-robot confrontation scenarios.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"194 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139070955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shengyu Chen, Tianshu Bao, Peyman Givi, Can Zheng, Xiaowei Jia
Accurate simulation of turbulent flows is of crucial importance in many branches of science and engineering. Direct numerical simulation (DNS) provides the highest fidelity means of capturing all intricate physics of turbulent transport. However, the method is computationally expensive because of the wide range of turbulence scales that must be accounted for in such simulations. Large eddy simulation (LES) provides an alternative. In such simulations, the large scales of the flow are resolved and the effects of small scales are modelled. Reconstruction of the DNS field from the low-resolution LES is needed for a wide variety of applications. Thus the construction of super-resolution (SR) methodologies that can provide this reconstruction has become an area of active research. In this work, a new physics-guided neural network is developed for such a reconstruction. The method leverages the partial differential equation that underlies the flow dynamics in the design of spatio-temporal model architecture. A degradation-based refinement method is also developed to enforce physical constraints and to further reduce the accumulated reconstruction errors over long periods. Detailed DNS data on two turbulent flow configurations are used to assess the performance of the model.
湍流的精确模拟在许多科学和工程领域都至关重要。直接数值模拟(DNS)是捕捉湍流传输所有复杂物理现象的保真度最高的方法。然而,由于这种模拟必须考虑广泛的湍流尺度,因此计算成本高昂。大涡模拟(LES)提供了一种替代方法。在这种模拟中,流动的大尺度被解析,小尺度的影响被模拟。各种应用都需要从低分辨率 LES 中重建 DNS 场。因此,构建能够提供这种重构的超分辨率(SR)方法已成为一个活跃的研究领域。在这项工作中,为这种重建开发了一种新的物理引导神经网络。该方法在设计时空模型结构时利用了作为水流动力学基础的偏微分方程。此外,还开发了一种基于退化的细化方法,以执行物理约束并进一步减少长期累积的重建误差。两种湍流配置的详细 DNS 数据用于评估模型的性能。
{"title":"Reconstructing Turbulent Flows Using Spatio-Temporal Physical Dynamics","authors":"Shengyu Chen, Tianshu Bao, Peyman Givi, Can Zheng, Xiaowei Jia","doi":"10.1145/3637491","DOIUrl":"https://doi.org/10.1145/3637491","url":null,"abstract":"<p>Accurate simulation of turbulent flows is of crucial importance in many branches of science and engineering. Direct numerical simulation (DNS) provides the highest fidelity means of capturing all intricate physics of turbulent transport. However, the method is computationally expensive because of the wide range of turbulence scales that must be accounted for in such simulations. Large eddy simulation (LES) provides an alternative. In such simulations, the large scales of the flow are resolved and the effects of small scales are modelled. Reconstruction of the DNS field from the low-resolution LES is needed for a wide variety of applications. Thus the construction of super-resolution (SR) methodologies that can provide this reconstruction has become an area of active research. In this work, a new physics-guided neural network is developed for such a reconstruction. The method leverages the partial differential equation that underlies the flow dynamics in the design of spatio-temporal model architecture. A degradation-based refinement method is also developed to enforce physical constraints and to further reduce the accumulated reconstruction errors over long periods. Detailed DNS data on two turbulent flow configurations are used to assess the performance of the model.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"36 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138688694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuan Yuan, Jingtao Ding, Huandong Wang, Depeng Jin
Daily activity data recording individuals’ various activities in daily life are widely used in many applications such as activity scheduling, activity recommendation, and policymaking. Though with high value, its accessibility is limited due to high collection costs and potential privacy issues. Therefore, simulating human activities to produce massive high-quality data is of great importance. However, existing solutions, including rule-based methods with simplified behavior assumptions and data-driven methods directly fitting real-world data, both cannot fully qualify for matching reality. In this paper, motivated by the classic psychological theory, Maslow’s need theory describing human motivation, we propose a knowledge-driven simulation framework based on generative adversarial imitation learning. Our core idea is to model the evolution of human needs as the underlying mechanism that drives activity generation in the simulation model. Specifically, a hierarchical model structure that disentangles different need levels and the use of neural stochastic differential equations successfully capture the piecewise-continuous characteristics of need dynamics. Extensive experiments demonstrate that our framework outperforms the state-of-the-art baselines regarding data fidelity and utility. We also present the insightful interpretability of the need modeling. Moreover, privacy preservation evaluations validate that the generated data does not leak individual privacy. The code is available at https://github.com/tsinghua-fib-lab/Activity-Simulation-SAND.
{"title":"Generating Daily Activities with Need Dynamics","authors":"Yuan Yuan, Jingtao Ding, Huandong Wang, Depeng Jin","doi":"10.1145/3637493","DOIUrl":"https://doi.org/10.1145/3637493","url":null,"abstract":"<p>Daily activity data recording individuals’ various activities in daily life are widely used in many applications such as activity scheduling, activity recommendation, and policymaking. Though with high value, its accessibility is limited due to high collection costs and potential privacy issues. Therefore, simulating human activities to produce massive high-quality data is of great importance. However, existing solutions, including rule-based methods with simplified behavior assumptions and data-driven methods directly fitting real-world data, both cannot fully qualify for matching reality. In this paper, motivated by the classic psychological theory, Maslow’s need theory describing human motivation, we propose a knowledge-driven simulation framework based on generative adversarial imitation learning. Our core idea is to model the evolution of human needs as the underlying mechanism that drives activity generation in the simulation model. Specifically, a hierarchical model structure that disentangles different need levels and the use of neural stochastic differential equations successfully capture the piecewise-continuous characteristics of need dynamics. Extensive experiments demonstrate that our framework outperforms the state-of-the-art baselines regarding data fidelity and utility. We also present the insightful interpretability of the need modeling. Moreover, privacy preservation evaluations validate that the generated data does not leak individual privacy. The code is available at https://github.com/tsinghua-fib-lab/Activity-Simulation-SAND.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"33 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138630639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fernando Terroso-Saenz, Juan Morales-García, Andres Muñoz
Nowadays, air pollution is one of the most relevant environmental problems in most urban settings. Due to the utility in operational terms of anticipating certain pollution levels, several predictors based on Graph Neural Networks (GNN) have been proposed for the last years. Most of these solutions usually encode the relationships among stations in terms of their spatial distance, but they fail when it comes to capture other spatial and feature-based contextual factors. Besides, they assume a homogeneous setting where all the stations are able to capture the same pollutants. However, large-scale settings frequently comprise different types of stations, each one with different measurement capabilities. For that reason, the present paper introduces a novel GNN framework able to capture the similarities among stations related to the land use of their locations and their primary source of pollution. Furthermore, we define a methodology to deal with heterogeneous settings on the top of the GNN architecture. Finally, the proposal has been tested with a nation-wide Spanish air-pollution dataset with very promising results.
{"title":"Nationwide Air Pollution Forecasting with Heterogeneous Graph Neural Networks","authors":"Fernando Terroso-Saenz, Juan Morales-García, Andres Muñoz","doi":"10.1145/3637492","DOIUrl":"https://doi.org/10.1145/3637492","url":null,"abstract":"<p>Nowadays, air pollution is one of the most relevant environmental problems in most urban settings. Due to the utility in operational terms of anticipating certain pollution levels, several predictors based on Graph Neural Networks (GNN) have been proposed for the last years. Most of these solutions usually encode the relationships among stations in terms of their spatial distance, but they fail when it comes to capture other spatial and feature-based contextual factors. Besides, they assume a <i>homogeneous</i> setting where all the stations are able to capture the same pollutants. However, large-scale settings frequently comprise different types of stations, each one with different measurement capabilities. For that reason, the present paper introduces a novel GNN framework able to capture the similarities among stations related to the land use of their locations and their primary source of pollution. Furthermore, we define a methodology to deal with heterogeneous settings on the top of the GNN architecture. Finally, the proposal has been tested with a nation-wide Spanish air-pollution dataset with very promising results.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"4 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138630644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional recommender systems estimate user preference on items purely based on historical interaction records, thus failing to capture fine-grained yet dynamic user interests and letting users receive recommendation only passively. Recent conversational recommender systems (CRSs) tackle those limitations by enabling recommender systems to interact with the user to obtain her/his current preference through a sequence of clarifying questions. Recently, there has been a rise of using knowledge graphs (KGs) for CRSs, where the core motivation is to incorporate the abundant side information carried by a KG into both the recommendation and conversation processes. However, existing KG-based CRSs are subject to two defects: (1) there is a semantic gap between the learned representations of utterances and KG entities, hindering the retrieval of relevant KG information; (2) the reasoning over KG is mostly performed with the implicitly learned user interests, overlooking the explicit signals from the entities actually mentioned in the conversation.
To address these drawbacks, we propose a new CRS framework, namely the Knowledge Enhanced Conversational Reasoning (KECR) model. As a user can reflect her/his preferences via both attribute- and item-level expressions, KECR jointly embeds the structured knowledge from two levels in the KG. A mutual information maximization constraint is further proposed for semantic alignment between the embedding spaces of utterances and KG entities. Meanwhile, KECR utilizes the connectivity within the KG to conduct explicit reasoning of the user demand, making the model less dependent on the user’s feedback to clarifying questions. As such, the semantic alignment and explicit KG reasoning can jointly facilitate accurate recommendation and quality dialogue generation. By comparing with strong baselines on two real-world datasets, we demonstrate that KECR obtains state-of-the-art recommendation effectiveness, as well as competitive dialogue generation performance.
传统的推荐系统纯粹根据历史交互记录来估算用户对项目的偏好,因此无法捕捉细粒度但动态的用户兴趣,用户只能被动地接受推荐。最近的对话式推荐系统(CRS)解决了这些局限性,它使推荐系统能够与用户互动,通过一连串的澄清问题获得用户当前的偏好。最近,在会话推荐系统中使用知识图谱(KGs)的做法开始兴起,其核心动机是将知识图谱所携带的丰富侧边信息纳入推荐和会话过程。然而,现有的基于知识图谱的 CRS 有两个缺陷:(1) 语篇的学习表示与知识图谱实体之间存在语义鸿沟,阻碍了相关知识图谱信息的检索;(2) 对知识图谱的推理大多是通过隐式学习的用户兴趣来进行的,忽略了对话中实际提及的实体的显式信号。为了解决这些问题,我们提出了一种新的会话推理框架,即知识增强会话推理(KECR)模型。由于用户可以通过属性级和项目级表达来反映自己的偏好,因此 KECR 将两个级别的结构化知识共同嵌入到 KG 中。为实现语篇嵌入空间与 KG 实体之间的语义对齐,进一步提出了互信息最大化约束。同时,KECR 利用 KG 内部的连通性对用户需求进行显式推理,使模型不再依赖于用户对澄清问题的反馈。因此,语义对齐和明确的 KG 推理可以共同促进准确的推荐和高质量的对话生成。通过在两个真实世界数据集上与强大的基线进行比较,我们证明了 KECR 获得了最先进的推荐效果以及具有竞争力的对话生成性能。
{"title":"Explicit Knowledge Graph Reasoning for Conversational Recommendation","authors":"Xuhui Ren, Tong Chen, Quoc Viet Hung Nguyen, Lizhen Cui, Zi Huang, Hongzhi Yin","doi":"10.1145/3637216","DOIUrl":"https://doi.org/10.1145/3637216","url":null,"abstract":"<p>Traditional recommender systems estimate user preference on items purely based on historical interaction records, thus failing to capture fine-grained yet dynamic user interests and letting users receive recommendation only passively. Recent conversational recommender systems (CRSs) tackle those limitations by enabling recommender systems to interact with the user to obtain her/his current preference through a sequence of clarifying questions. Recently, there has been a rise of using knowledge graphs (KGs) for CRSs, where the core motivation is to incorporate the abundant side information carried by a KG into both the recommendation and conversation processes. However, existing KG-based CRSs are subject to two defects: (1) there is a semantic gap between the learned representations of utterances and KG entities, hindering the retrieval of relevant KG information; (2) the reasoning over KG is mostly performed with the implicitly learned user interests, overlooking the explicit signals from the entities actually mentioned in the conversation. </p><p>To address these drawbacks, we propose a new CRS framework, namely the <b>K</b>nowledge <b>E</b>nhanced <b>C</b>onversational <b>R</b>easoning (KECR) model. As a user can reflect her/his preferences via both attribute- and item-level expressions, KECR jointly embeds the structured knowledge from two levels in the KG. A mutual information maximization constraint is further proposed for semantic alignment between the embedding spaces of utterances and KG entities. Meanwhile, KECR utilizes the connectivity within the KG to conduct explicit reasoning of the user demand, making the model less dependent on the user’s feedback to clarifying questions. As such, the semantic alignment and explicit KG reasoning can jointly facilitate accurate recommendation and quality dialogue generation. By comparing with strong baselines on two real-world datasets, we demonstrate that KECR obtains state-of-the-art recommendation effectiveness, as well as competitive dialogue generation performance.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"7 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138572359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fashion recommendation has become a prominent focus in the realm of online shopping, with various tasks being explored to enhance the customer experience. Recent research has particularly emphasized fashion recommendation based on body shapes, yet a critical aspect of incorporating multimodal data relevance has been overlooked. In this paper, we present the Contrastive Multimodal Cross-Attention Network, a novel approach specifically designed for fashion recommendation catering to diverse body shapes. By incorporating multimodal representation learning and leveraging contrastive learning techniques, our method effectively captures both inter- and intra-sample relationships, resulting in improved accuracy in fashion recommendations tailored to individual body types. Additionally, we propose a locality-aware cross-attention module to align and understand the local preferences between body shapes and clothing items, thus enhancing the matching process. Experimental results conducted on a diverse dataset demonstrate the state-of-the-art performance achieved by our approach, reinforcing its potential to significantly enhance the personalized online shopping experience for consumers with varying body shapes and preferences.
{"title":"Personalized Fashion Recommendations for Diverse Body Shapes and Local Preferences with Contrastive Multimodal Cross-Attention Network","authors":"Jianghong Ma, Huiyue Sun, Dezhao Yang, Haijun Zhang","doi":"10.1145/3637217","DOIUrl":"https://doi.org/10.1145/3637217","url":null,"abstract":"<p>Fashion recommendation has become a prominent focus in the realm of online shopping, with various tasks being explored to enhance the customer experience. Recent research has particularly emphasized fashion recommendation based on body shapes, yet a critical aspect of incorporating multimodal data relevance has been overlooked. In this paper, we present the Contrastive Multimodal Cross-Attention Network, a novel approach specifically designed for fashion recommendation catering to diverse body shapes. By incorporating multimodal representation learning and leveraging contrastive learning techniques, our method effectively captures both inter- and intra-sample relationships, resulting in improved accuracy in fashion recommendations tailored to individual body types. Additionally, we propose a locality-aware cross-attention module to align and understand the local preferences between body shapes and clothing items, thus enhancing the matching process. Experimental results conducted on a diverse dataset demonstrate the state-of-the-art performance achieved by our approach, reinforcing its potential to significantly enhance the personalized online shopping experience for consumers with varying body shapes and preferences.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"28 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138572071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}