Sichun Luo, Yuanzhang Xiao, Xinyi Zhang, Yang Liu, Wenbo Ding, Linqi Song
Federated recommender systems employ federated learning techniques to safeguard user privacy by transmitting model parameters, rather than raw user data, between user devices and the central server. Nevertheless, current federated recommender systems face three significant challenges: (1) data heterogeneity: the heterogeneity of users’ attributes and local data calls for personalized models to improve federated recommendation performance; (2) model performance degradation: privacy-preserving mechanisms in federated recommendation, such as pseudo-item labeling and differential privacy, degrade model performance; (3) communication bottleneck: the standard federated recommendation algorithm can incur high communication overhead. Previous studies have attempted to address these issues, but none has solved them all simultaneously.
In this paper, we propose a novel framework, named PerFedRec++, to enhance personalized federated recommendation with self-supervised pre-training. Specifically, we utilize the privacy-preserving mechanism of federated recommender systems to generate two augmented graph views, which are used as contrastive tasks in self-supervised graph learning to pre-train the model. Pre-training enhances the performance of federated models by improving the uniformity of the learned representations. It also provides a better initial state for federated training, so the overall training converges faster, alleviating the heavy communication burden. We then construct a collaborative graph and learn client representations through a federated graph neural network. Based on these learned representations, we cluster users into groups and learn a personalized model for each cluster. Each user obtains a personalized model by combining the global federated model, the cluster-level federated model, and its own fine-tuned local model. Experiments on three real-world datasets show that our proposed method achieves superior performance over existing methods.
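As an illustration of the final personalization step, the sketch below combines the three models into one per-user parameter set. The convex-combination form and the weights alpha and beta are assumptions made for illustration, not the paper's exact aggregation rule.

```python
# Minimal sketch (not the authors' implementation): personalizing a client's model by
# mixing the global federated model, its cluster-level model, and its own fine-tuned
# local model. The mixing weights (alpha, beta) are illustrative assumptions.
import numpy as np

def personalize(global_params, cluster_params, local_params, alpha=0.4, beta=0.3):
    """Return a per-user parameter set as a convex combination of the three models."""
    gamma = 1.0 - alpha - beta  # weight left for the user's own fine-tuned model
    personalized = {}
    for name in global_params:
        personalized[name] = (alpha * global_params[name]
                              + beta * cluster_params[name]
                              + gamma * local_params[name])
    return personalized

# Toy usage with a single embedding matrix shared by all three models.
shape = (8, 4)
g = {"item_emb": np.random.randn(*shape)}
c = {"item_emb": np.random.randn(*shape)}
l = {"item_emb": np.random.randn(*shape)}
user_model = personalize(g, c, l)
print(user_model["item_emb"].shape)  # (8, 4)
```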
{"title":"PerFedRec++: Enhancing Personalized Federated Recommendation with Self-Supervised Pre-Training","authors":"Sichun Luo, Yuanzhang Xiao, Xinyi Zhang, Yang Liu, Wenbo Ding, Linqi Song","doi":"10.1145/3664927","DOIUrl":"https://doi.org/10.1145/3664927","url":null,"abstract":"<p>Federated recommendation systems employ federated learning techniques to safeguard user privacy by transmitting model parameters instead of raw user data between user devices and the central server. Nevertheless, the current federated recommender system faces three significant challenges: (1) <i>data heterogeneity:</i> the heterogeneity of users’ attributes and local data necessitates the acquisition of personalized models to improve the performance of federated recommendation; (2) <i>model performance degradation:</i> the privacy-preserving protocol design in the federated recommendation, such as pseudo item labeling and differential privacy, would deteriorate the model performance; (3) <i>communication bottleneck:</i> the standard federated recommendation algorithm can have a high communication overhead. Previous studies have attempted to address these issues, but none have been able to solve them simultaneously.</p><p>In this paper, we propose a novel framework, named <monospace>PerFedRec++</monospace>, to enhance the personalized federated recommendation with self-supervised pre-training. Specifically, we utilize the privacy-preserving mechanism of federated recommender systems to generate two augmented graph views, which are used as contrastive tasks in self-supervised graph learning to pre-train the model. Pre-training enhances the performance of federated models by improving the uniformity of representation learning. Also, by providing a better initial state for federated training, pre-training makes the overall training converge faster, thus alleviating the heavy communication burden. We then construct a collaborative graph to learn the client representation through a federated graph neural network. Based on these learned representations, we cluster users into different user groups and learn personalized models for each cluster. Each user learns a personalized model by combining the global federated model, the cluster-level federated model, and its own fine-tuned local model. Experiments on three real-world datasets show that our proposed method achieves superior performance over existing methods.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"35 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Miaomiao Cai, Min Hou, Lei Chen, Le Wu, Haoyue Bai, Yong Li, Meng Wang
Collaborative Filtering (CF) plays a crucial role in modern recommender systems, leveraging historical user-item interactions to provide personalized suggestions. However, CF-based methods often suffer from biases caused by imbalances in the training data: they tend to prioritize recommending popular items and to perform unsatisfactorily for inactive users. Existing works address this issue by rebalancing training samples, reranking recommendation results, or making the modeling process robust to the bias. Despite their effectiveness, these approaches can compromise accuracy or be sensitive to weighting strategies, making them challenging to train. Mitigating these biases therefore remains an urgent need.
In this paper, we analyze the causes and effects of these biases in depth and propose a framework that alleviates recommendation bias from the perspective of representation distribution, namely Group-Alignment and Global-Uniformity Enhanced Representation Learning for Debiasing Recommendation (AURL). Specifically, we identify two significant problems in the representation distribution of users and items, namely group-discrepancy and global-collapse, which directly lead to biases in the recommendation results. To this end, we propose two simple but effective regularizers in the representation space, named group-alignment and global-uniformity, respectively. Group-alignment brings the representation distribution of long-tail entities closer to that of popular entities, while global-uniformity preserves as much entity information as possible by distributing representations evenly. Our method directly optimizes both regularization terms to mitigate recommendation biases. Note that AURL applies to arbitrary CF-based recommendation backbones. Extensive experiments on three real-world datasets and various recommendation backbones verify the superiority of our proposed framework: AURL not only outperforms existing debiasing models in mitigating biases but also improves recommendation performance to some extent.
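To make the two regularizers concrete, the sketch below computes a group-alignment term as the distance between group mean embeddings and a global-uniformity term as the standard log-mean-exp Gaussian potential over normalized embeddings. The exact functional forms and the 0.1 weights are assumptions, not necessarily the losses used in AURL.

```python
# Hedged sketch of the two regularizers described above, not the paper's exact losses.
# Group-alignment: pull the embedding distribution of long-tail entities toward that of
# popular entities (here via the distance between group means). Global-uniformity: spread
# normalized embeddings over the hypersphere (here a log-mean-exp Gaussian potential).
import numpy as np

def group_alignment(pop_emb, tail_emb):
    """Squared L2 distance between the mean embeddings of the two groups."""
    return float(np.sum((pop_emb.mean(axis=0) - tail_emb.mean(axis=0)) ** 2))

def global_uniformity(emb, t=2.0):
    """log E exp(-t * ||z_i - z_j||^2) over pairs of L2-normalized embeddings."""
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(z), k=1)  # unique pairs only
    return float(np.log(np.mean(np.exp(-t * sq_dists[iu]))))

rng = np.random.default_rng(0)
popular, long_tail = rng.normal(size=(64, 16)), rng.normal(size=(32, 16))
reg = 0.1 * group_alignment(popular, long_tail) \
    + 0.1 * global_uniformity(np.vstack([popular, long_tail]))
print(reg)  # added to the backbone's recommendation loss in this sketch
```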
{"title":"Mitigating Recommendation Biases via Group-Alignment and Global-Uniformity in Representation Learning","authors":"Miaomiao Cai, Min Hou, Lei Chen, Le Wu, Haoyue Bai, Yong Li, Meng Wang","doi":"10.1145/3664931","DOIUrl":"https://doi.org/10.1145/3664931","url":null,"abstract":"<p>Collaborative Filtering (CF) plays a crucial role in modern recommender systems, leveraging historical user-item interactions to provide personalized suggestions. However, CF-based methods often encounter biases due to imbalances in training data. This phenomenon makes CF-based methods tend to prioritize recommending popular items and performing unsatisfactorily on inactive users. Existing works address this issue by rebalancing training samples, reranking recommendation results, or making the modeling process robust to the bias. Despite their effectiveness, these approaches can compromise accuracy or be sensitive to weighting strategies, making them challenging to train. Therefore, exploring how to mitigate these biases remains in urgent demand.</p><p>In this paper, we deeply analyze the causes and effects of the biases and propose a framework to alleviate biases in recommendation from the perspective of representation distribution, namely <b><i>Group-<underline>A</underline>lignment and Global-<underline>U</underline>niformity Enhanced <underline>R</underline>epresentation <underline>L</underline>earning for Debiasing Recommendation (AURL)</i></b>. Specifically, we identify two significant problems in the representation distribution of users and items, namely group-discrepancy and global-collapse. These two problems directly lead to biases in the recommendation results. To this end, we propose two simple but effective regularizers in the representation space, respectively named group-alignment and global-uniformity. The goal of group-alignment is to bring the representation distribution of long-tail entities closer to that of popular entities, while global-uniformity aims to preserve the information of entities as much as possible by evenly distributing representations. Our method directly optimizes both the group-alignment and global-uniformity regularization terms to mitigate recommendation biases. Please note that <i>AURL</i> applies to arbitrary CF-based recommendation backbones. Extensive experiments on three real datasets and various recommendation backbones verify the superiority of our proposed framework. The results show that <i>AURL</i> not only outperforms existing debiasing models in mitigating biases but also improves recommendation performance to some extent.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"24 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The goal of recommender systems is to provide users with suggestions that match their interests, with the ultimate aim of increasing their satisfaction, as measured by the number of transactions (clicks, purchases, etc.). Often, this leads to recommendations of a particular type. In some contexts (e.g., browsing videos for information) this may be undesirable, as it can reinforce the creation of filter bubbles, a consequence of underlying bias in the input data of prior user actions.
Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis has recently received significant attention. In this paper, we consider the densest subgraph and the k-clustering problems, two primitives used by some recommender systems. We are given a coloring on the nodes (respectively, the points) and aim to compute a fair solution S, consisting of a subgraph or a clustering, such that none of the colors is disparately impacted by the solution.
Unfortunately, imposing fairness typically makes these problems substantially more difficult. Unlike the unconstrained densest subgraph problem, which is solvable in polynomial time, the fair densest subgraph problem is NP-hard even to approximate. For k-clustering, the fairness constraints make the problem very similar to capacitated clustering, which is notoriously hard even to approximate.
Despite these negative premises, we provide positive results in important use cases. In particular, we prove that a suitable spectral embedding allows the recovery of an almost optimal, fair, dense subgraph hidden in the input data whenever one is present, a result further supported by experimental evidence.
We also give a polynomial-time 2-approximation algorithm for the fair densest subgraph problem when there are only two colors and both occur equally often in the graph; this guarantee is optimal assuming the small set expansion hypothesis. For fair k-clustering, we show that we can recover high-quality fair clusterings effectively and efficiently. For the special cases of k-median and k-center, we offer additional fast and simple approximation algorithms as well as new hardness results.
The above theoretical findings drive the design of heuristics, which we evaluate experimentally in a scenario based on real data, where we aim to strike a good balance between diversity and highly correlated items on Amazon co-purchasing graphs and Facebook contacts. We additionally evaluate our algorithmic solutions for the fair k-median problem through experiments on various real-world datasets.
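To make the fair densest subgraph objective concrete for the two-color, equal-frequency case, the brute-force sketch below searches a toy graph for the densest color-balanced subgraph. It illustrates the objective only; it is not the spectral or approximation algorithms discussed above, and the toy graph is an invented example.

```python
# Illustrative brute-force sketch (an assumption, not the paper's algorithm) of the fair
# densest subgraph objective with two colors: maximize |E(S)| / |S| over subsets S that
# contain the same number of nodes of each color.
from itertools import combinations

edges = {(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5), (1, 4)}
color = {0: "red", 1: "blue", 2: "red", 3: "blue", 4: "red", 5: "blue"}
nodes = sorted(color)

def density(S):
    internal = sum(1 for u, v in edges if u in S and v in S)
    return internal / len(S)

best = None
for size in range(2, len(nodes) + 1, 2):      # even sizes allow a balanced split
    for S in combinations(nodes, size):
        reds = sum(color[v] == "red" for v in S)
        if reds * 2 != len(S):                # fairness: equal color counts in S
            continue
        if best is None or density(S) > density(best):
            best = S

print(best, density(best))  # densest subgraph among the color-balanced candidates
```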
{"title":"Fair Projections as a Means Towards Balanced Recommendations","authors":"Aris Anagnostopoulos, Luca Becchetti, Matteo Böhm, Adriano Fazzone, Stefano Leonardi, Cristina Menghini, Chris Schwiegelshohn","doi":"10.1145/3664929","DOIUrl":"https://doi.org/10.1145/3664929","url":null,"abstract":"<p>The goal of recommender systems is to provide to users suggestions that match their interests, with the eventual goal of increasing their satisfaction, as measured by the number of transactions (clicks, purchases, etc.). Often, this leads to providing recommendations that are of a particular type. For some contexts (e.g., browsing videos for information) this may be undesirable, as it may enforce the creation of filter bubbles. This is because of the existence of underlying bias in the input data of prior user actions.</p><p>Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis has recently received significant attention. In this paper, we consider both the densest subgraph and the (k)-clustering problem, two primitives that are being used by some recommender systems. We are given a coloring on the nodes, respectively the points, and aim to compute a <i>fair</i> solution (S), consisting of a subgraph or a clustering, such that none of the colors is disparately impacted by the solution.</p><p>Unfortunately, introducing fair solutions typically makes these problems substantially more difficult. Unlike the unconstrained densest subgraph problem, which is solvable in polynomial time, the fair densest subgraph problem is NP-hard even to approximate. For (k)-clustering, the fairness constraints make the problem very similar to capacitated clustering, which is a notoriously hard problem to even approximate.</p><p>Despite such negative premises, we are able to provide positive results in important use cases. In particular, we are able to prove that a suitable spectral embedding allows recovery of an almost optimal, fair, dense subgraph hidden in the input data, whenever one is present, a result that is further supported by experimental evidence.</p><p>We also show a polynomial-time, (2)-approximation algorithm to the problem of fair densest subgraph, assuming that there exist only two colors and both colors occur equally often in the graph. This result turns out to be optimal assuming the small set expansion hypothesis. For fair (k)-clustering, we show that we can recover high quality fair clusterings effectively and efficiently. For the special case of (k)-median and (k)-center, we offer additional, fast and simple approximation algorithms as well as new hardness results.</p><p>The above theoretical findings drive the design of heuristics, which we experimentally evaluate on a scenario based on real data, in which our aim is to strike a good balance between diversity and highly correlated items from Amazon co-purchasing graphs and facebook contacts. 
We additionally evaluated our algorithmic solutions for the fair (k)-median problem through experiments on various real-world datasets.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"34 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large language models (LLMs), such as OpenAI’s Generative Pre-trained Transformer (GPT), are a class of language models that have demonstrated outstanding performance across a range of natural language processing (NLP) tasks. LLMs have become a highly sought-after research topic because of their ability to generate human-like language and their potential to revolutionize science and technology. In this study, we conduct bibliometric and discourse analyses of the scholarly literature on LLMs. Synthesizing over 5,000 publications, this paper serves as a roadmap for researchers, practitioners, and policymakers navigating the current landscape of LLMs research. We present research trends from 2017 to early 2023, identifying patterns in research paradigms and collaborations. We begin by analyzing the core algorithm developments and NLP tasks that are fundamental to LLMs research, and then investigate the applications of LLMs in various fields and domains, including medicine, engineering, social science, and humanities. Our review also reveals the dynamic, fast-paced evolution of LLMs research. Overall, this paper offers valuable insights into the current state, impact, and potential of LLMs research and its applications.
{"title":"A Bibliometric Review of Large Language Models Research from 2017 to 2023","authors":"Lizhou Fan, Lingyao Li, Zihui Ma, Sanggyu Lee, Huizi Yu, Libby Hemphill","doi":"10.1145/3664930","DOIUrl":"https://doi.org/10.1145/3664930","url":null,"abstract":"<p>Large language models (LLMs), such as OpenAI’s Generative Pre-trained Transformer (GPT), are a class of language models that have demonstrated outstanding performance across a range of natural language processing (NLP) tasks. LLMs have become a highly sought-after research area because of their ability to generate human-like language and their potential to revolutionize science and technology. In this study, we conduct bibliometric and discourse analyses of scholarly literature on LLMs. Synthesizing over 5,000 publications, this paper serves as a roadmap for researchers, practitioners, and policymakers to navigate the current landscape of LLMs research. We present the research trends from 2017 to early 2023, identifying patterns in research paradigms and collaborations. We start with analyzing the core algorithm developments and NLP tasks that are fundamental in LLMs research. We then investigate the applications of LLMs in various fields and domains, including medicine, engineering, social science, and humanities. Our review also reveals the dynamic, fast-paced evolution of LLMs research. Overall, this paper offers valuable insights into the current state, impact, and potential of LLMs research and its applications.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"75 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The neural boom that has sparked natural language processing (NLP) research throughout the last decade has similarly led to significant innovations in data-to-text generation (D2T). This survey offers a consolidated view of the neural D2T paradigm, with a structured examination of approaches, benchmark datasets, and evaluation protocols. It draws boundaries separating D2T from the rest of the natural language generation (NLG) landscape, provides an up-to-date synthesis of the literature, and highlights the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for D2T research that focus not only on the design of linguistically capable systems but also on systems that exhibit fairness and accountability.
{"title":"Neural Methods for Data-to-text Generation","authors":"Mandar Sharma, Ajay Kumar Gogineni, Naren Ramakrishnan","doi":"10.1145/3660639","DOIUrl":"https://doi.org/10.1145/3660639","url":null,"abstract":"<p>The neural boom that has sparked natural language processing (NLP) research throughout the last decade has similarly led to significant innovations in data-to-text generation (D2T). This survey offers a consolidated view into the neural D2T paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating D2T from the rest of the natural language generation (NLG) landscape, encompassing an up-to-date synthesis of the literature, and highlighting the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for D2T research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"124 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Juan Morales-García, Antonio Llanes, Francisco Arcas-Túnez, Fernando Terroso-Sáenz
Generative Large Language Models (GLLMs) have made a significant impact in the field of Artificial Intelligence (AI). One domain extensively explored for these models is their ability to generate functional source code for software projects. Nevertheless, their potential as assistants for writing the code needed to define and build Machine Learning (ML) or Deep Learning (DL) architectures has not been fully explored to date. For this reason, this work evaluates the extent to which different GLLM-based tools, such as ChatGPT or Copilot, can correctly produce the source code necessary to generate viable predictive models. The defined use case is forecasting a time series that reports the indoor temperature of a greenhouse. The results indicate that, while simple predictive models generated by GLLMs can achieve good accuracy metrics, GLLM-composed predictive models with complex architectures are still far from matching the accuracy of models built by human data scientists.
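For concreteness, the sketch below shows the kind of simple lag-feature forecasting model a GLLM assistant typically produces for such a task. The synthetic temperature series and the lag and horizon choices are assumptions for illustration and do not reproduce the paper's experimental setup.

```python
# Hedged illustration of a "simple predictive model" for indoor temperature forecasting:
# lag features plus a linear regressor. Data is synthetic, not the paper's greenhouse series.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
t = np.arange(2000)
temp = 22 + 5 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 0.5, t.size)  # fake indoor temp

lags, horizon = 12, 1
X = np.column_stack([temp[i:len(temp) - lags + i] for i in range(lags)])  # last 12 readings
y = temp[lags + horizon - 1:]                                             # next reading
X = X[: len(y)]

split = int(0.8 * len(y))
model = LinearRegression().fit(X[:split], y[:split])
pred = model.predict(X[split:])
mae = np.mean(np.abs(pred - y[split:]))
print(f"MAE on held-out tail: {mae:.3f} °C")
```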
{"title":"Developing Time Series Forecasting Models with Generative Large Language Models","authors":"Juan Morales-García, Antonio Llanes, Francisco Arcas-Túnez, Fernando Terroso-Sáenz","doi":"10.1145/3663485","DOIUrl":"https://doi.org/10.1145/3663485","url":null,"abstract":"<p>Nowadays, Generative Large Language Models (GLLMs) have made a significant impact in the field of Artificial Intelligence (AI). One of the domains extensively explored for these models is their ability as generators of functional source code for software projects. Nevertheless, their potential as assistants to write the code needed to generate and model Machine Learning (ML) or Deep Learning (DL) architectures has not been fully explored to date. For this reason, this work focuses on evaluating the extent to which different tools based on GLLMs, such as ChatGPT or Copilot, are able to correctly define the source code necessary to generate viable predictive models. The use case defined is the forecasting of a time series that reports the indoor temperature of a greenhouse. The results indicate that, while it is possible to achieve good accuracy metrics with simple predictive models generated by GLLMs, the composition of predictive models with complex architectures using GLLMs is still far from improving the accuracy of predictive models generated by human data scientists.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"25 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuansheng Wu, Hanqin Wan, Qiaoyu Tan, Wenlin Yao, Ninghao Liu
Recommending products to users with intuitive explanations helps improve system transparency, persuasiveness, and user satisfaction. Existing interpretation techniques include post-hoc methods and interpretable modeling. The former can quantitatively analyze input contributions to model predictions but has limited interpretation faithfulness, while the latter can explain a model's internal mechanisms but may not directly attribute model predictions to input features. In this study, we propose a novel Dual Interpretable Recommendation model called DIRECT, which integrates ideas from both interpretation categories to inherit their advantages and avoid their limitations. Specifically, DIRECT uses item descriptions as explainable evidence for recommendation. First, similar to post-hoc interpretation, DIRECT attributes the prediction of a user preference score to the words of the item descriptions. The attribution of each word depends on its sentiment polarity and its importance, where a word is important if it corresponds to an item aspect that the user is interested in. Second, to improve the interpretability of the embedding space, we propose to extract high-level concepts from embeddings, where each concept corresponds to an item aspect. To learn discriminative concepts, we employ a concept-bottleneck layer and maximize the coding rate reduction on word-aspect embeddings by leveraging a word-word affinity graph extracted from a pre-trained language model. In this way, DIRECT simultaneously achieves faithful attribution and a usable interpretation of the embedding space. We also show that DIRECT achieves inference time linear in the length of item reviews. We conduct experiments, including ablation studies, on five real-world datasets. Quantitative analysis, visualizations, and case studies verify the interpretability of DIRECT. Our code is available at: https://github.com/JacksonWuxs/DIRECT.
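The toy sketch below illustrates the word-attribution idea: each description word contributes the product of its sentiment polarity and its aspect importance, so the preference score decomposes additively over words. The example words and scores are invented for illustration and are not taken from the released DIRECT code.

```python
# Toy sketch (an assumption, not the released DIRECT code at the repository above) of
# word-level attribution: contribution = sentiment_polarity * aspect_importance, and the
# predicted preference score is the sum of per-word contributions.
import numpy as np

words      = ["battery", "lasts", "long", "but", "screen", "scratches", "easily"]
polarity   = np.array([0.0,  0.6,  0.7,  0.0,  0.0,  -0.8,  -0.5])  # word sentiment
importance = np.array([0.9,  0.4,  0.5,  0.0,  0.7,   0.6,   0.3])  # user cares about aspect

contribution = polarity * importance          # per-word attribution
score = contribution.sum()                    # predicted preference score

for w, c in sorted(zip(words, contribution), key=lambda x: -abs(x[1])):
    print(f"{w:10s} {c:+.2f}")
print(f"predicted preference: {score:+.2f}")
```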
{"title":"DIRECT: Dual Interpretable Recommendation with Multi-aspect Word Attribution","authors":"Xuansheng Wu, Hanqin Wan, Qiaoyu Tan, Wenlin Yao, Ninghao Liu","doi":"10.1145/3663483","DOIUrl":"https://doi.org/10.1145/3663483","url":null,"abstract":"<p>Recommending products to users with intuitive explanations helps improve the system in transparency, persuasiveness, and satisfaction. Existing interpretation techniques include post-hoc methods and interpretable modeling. The former category could quantitatively analyze input contribution to model prediction but has limited interpretation faithfulness, while the latter could explain model internal mechanisms but may not directly attribute model predictions to input features. In this study, we propose a novel <underline>D</underline>ual <underline>I</underline>nterpretable <underline>Rec</underline>ommenda<underline>t</underline>ion model called DIRECT, which integrates ideas of the two interpretation categories to inherit their advantages and avoid limitations. Specifically, DIRECT makes use of item descriptions as explainable evidence for recommendation. First, similar to the post-hoc interpretation, DIRECT could attribute the prediction of a user preference score to textual words of the item descriptions. The attribution of each word is related to its sentiment polarity and word importance, where a word is important if it corresponds to an item aspect that the user is interested in. Second, to improve the interpretability of embedding space, we propose to extract high-level concepts from embeddings, where each concept corresponds to an item aspect. To learn discriminative concepts, we employ a concept-bottleneck layer, and maximize the coding rate reduction on word-aspect embeddings by leveraging a word-word affinity graph extracted from a pre-trained language model. In this way, DIRECT simultaneously achieves faithful attribution and usable interpretation of embedding space. We also show that DIRECT achieves linear inference time complexity regarding the length of item reviews. We conduct experiments including ablation studies on five real-world datasets. Quantitative analysis, visualizations, and case studies verify the interpretability of DIRECT. Our code is available at: https://github.com/JacksonWuxs/DIRECT.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"18 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140887025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ekaterina Gilman, Francesca Bugiotti, Ahmed Khalid, Hassan Mehmood, Panos Kostakos, Lauri Tuovinen, Johanna Ylipulli, Xiang Su, Denzil Ferreira
Cities serve as vital hubs of economic activity and of knowledge generation and dissemination. As such, cities bear a significant responsibility to uphold environmental protection measures while promoting the welfare and living comfort of their residents. There are diverse views on the development of smart cities, from integrating Information and Communication Technologies into urban environments for better operational decisions to supporting the sustainability, wealth, and comfort of people. In all these cases, however, data is the key ingredient and enabler for the vision and realization of smart cities. This article explores the challenges associated with smart city data. We start by examining the concept of a smart city, how to measure whether a city is smart, and what architectures and platforms exist to develop one. We then investigate the challenges associated with city data, including availability, heterogeneity, management, analysis, privacy, and security. Finally, we discuss ethical issues. This article aims to serve as a “one-stop shop” covering data-related issues of smart cities, with references for diving deeper into particular topics of interest.
{"title":"Addressing Data Challenges to Drive the Transformation of Smart Cities","authors":"Ekaterina Gilman, Francesca Bugiotti, Ahmed Khalid, Hassan Mehmood, Panos Kostakos, Lauri Tuovinen, Johanna Ylipulli, Xiang Su, Denzil Ferreira","doi":"10.1145/3663482","DOIUrl":"https://doi.org/10.1145/3663482","url":null,"abstract":"<p>Cities serve as vital hubs of economic activity and knowledge generation and dissemination. As such, cities bear a significant responsibility to uphold environmental protection measures while promoting the welfare and living comfort of their residents. There are diverse views on the development of smart cities, from integrating Information and Communication Technologies into urban environments for better operational decisions to supporting sustainability, wealth, and comfort of people. However, for all these cases, data is the key ingredient and enabler for the vision and realization of smart cities. This article explores the challenges associated with smart city data. We start with gaining an understanding of the concept of a smart city, how to measure that the city is a smart one, and what architectures and platforms exist to develop one. Afterwards, we research the challenges associated with the data of the cities, including availability, heterogeneity, management, analysis, privacy, and security. Finally, we discuss ethical issues. This article aims to serve as a “one-stop shop” covering data-related issues of smart cities with references for diving deeper into particular topics of interest.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"28 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140834479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Janet Layne, Qudrat E Alahy Ratul, Edoardo Serra, Sushil Jajodia
The coronavirus pandemic has fostered an explosion of misinformation about the disease, including about the risks and effectiveness of vaccination. AI tools for automatic Scientific Claim Verification (SCV) can be crucial to defeating misinformation campaigns spreading through social media channels. However, over the past years, many concerns have been raised about the robustness of AI to adversarial attacks, and the field of automatic scientific claim verification is not exempt. The risk is that such SCV tools may reinforce and legitimize the spread of fake scientific claims rather than refute them. This paper investigates the problem of generating adversarial attacks against SCV tools and shows that it is far more difficult than the generic NLP adversarial attack problem. Current NLP adversarial attack generators, when applied to SCV, often generate modified claims whose meaning differs entirely from the original. Even when the meaning is preserved, the modification of the generated claim is too simplistic (only a single word is changed), leaving many weaknesses of SCV tools undiscovered. We propose T5-ParEvo, an iterative evolutionary attack generator that produces more complex and creative attacks while better preserving the semantics of the original claim. Through detailed quantitative and qualitative analysis, we demonstrate the efficacy of T5-ParEvo in comparison with existing attack generators.
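The skeleton below sketches an iterative evolutionary rephrasing attack of the kind analyzed here. The functions paraphrase, semantic_similarity, and verifier_confidence are stand-in stubs for a paraphrase generator (e.g., a T5 model), a semantic-similarity scorer, and the target SCV model; none of them is the authors' actual component.

```python
# Hedged skeleton of an evolutionary rephrasing attack; all three helpers are stubs.
import random

def paraphrase(claim: str) -> str:
    # Stub: a real attack would sample a rephrasing from a seq2seq paraphraser.
    words = claim.split()
    random.shuffle(words)
    return " ".join(words)

def semantic_similarity(a: str, b: str) -> float:
    # Stub: a real attack would use a sentence-embedding similarity in [0, 1].
    shared = len(set(a.split()) & set(b.split()))
    return shared / max(len(set(a.split())), 1)

def verifier_confidence(claim: str) -> float:
    # Stub: confidence of the target SCV model that the claim is SUPPORTED.
    return random.random()

def evolve_attack(claim, generations=5, pop_size=8, sim_threshold=0.8):
    population = [claim]
    for _ in range(generations):
        candidates = [paraphrase(random.choice(population)) for _ in range(pop_size)]
        # Keep only rephrasings that preserve the original meaning closely enough...
        candidates = [c for c in candidates if semantic_similarity(c, claim) >= sim_threshold]
        # ...then keep the candidates that most reduce the verifier's confidence.
        population = sorted(population + candidates, key=verifier_confidence)[:pop_size]
    return population[0]

print(evolve_attack("the vaccine reduces the risk of severe disease"))
```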
{"title":"Analyzing Robustness of Automatic Scientific Claim Verification Tools against Adversarial Rephrasing Attacks","authors":"Janet Layne, Qudrat E Alahy Ratul, Edoardo Serra, Sushil Jajodia","doi":"10.1145/3663481","DOIUrl":"https://doi.org/10.1145/3663481","url":null,"abstract":"<p>The coronavirus pandemic has fostered an explosion of misinformation about the disease, including the risk and effectiveness of vaccination. AI tools for automatic Scientific Claim Verification (SCV) can be crucial to defeat misinformation campaigns spreading through social media channels. However, over the past years, many concerns have been raised about the robustness of AI to adversarial attacks, and the field of automatic scientific claim verification is not exempt. The risk is that such SCV tools may reinforce and legitimize the spread of fake scientific claims rather than refute them. This paper investigates the problem of generating adversarial attacks for SCV tools and shows that it is far more difficult than the generic NLP adversarial attack problem. The current NLP adversarial attack generators, when applied to SCV, often generate modified claims with entirely different meaning from the original. Even when the meaning is preserved, the modification of the generated claim is too simplistic (only a single word is changed), leaving many weaknesses of the SCV tools undiscovered. We propose T5-ParEvo, an iterative evolutionary attack generator, that is able to generate more complex and creative attacks while better preserving the semantics of the original claim. Using detailed quantitative and qualitative analysis, we demonstrate the efficacy of T5-ParEvo in comparison with existing attack generators.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"25 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140842102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lo Pang-Yun Ting, Rong Chao, Chai-Shi Chang, Kun-Ta Chuang
With the rise of Internet-of-Things devices, the analysis of sensor-generated energy time series data has become increasingly important. This is especially crucial for detecting rare events, such as unusual electricity usage or water leakages in residential and commercial buildings, which is essential for optimizing energy efficiency and reducing costs. However, existing detection methods on large-scale data may fail to correctly detect rare events when those events do not behave significantly differently from standard events or when their attributes are non-stationary. Additionally, analyzing all of the time series data generated by an ever-growing number of sensors strains available computational resources. This situation creates an urgent demand for a workload-bounded strategy. To ensure both effectiveness and efficiency in detecting rare events in massive energy time series, we propose a heuristic-based framework called HALE. The framework uses an explore-exploit selection process specifically designed to recognize potential features of rare events in energy time series. HALE constructs an attribute-aware graph to preserve the attribute information of rare events. A heuristic-based random walk is then derived from the partial labels received at each time period to capture the non-stationarity of rare events. Potential rare-event data is selected from the attribute-aware graph, and existing detection models are applied for final confirmation. Our study, conducted on three actual energy datasets, demonstrates that the HALE framework is both effective and efficient in its detection capabilities, underscoring its practicality for delivering cost-effective energy monitoring services.
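As a rough illustration of workload-bounded explore-exploit selection, the epsilon-greedy sketch below inspects only a fixed budget of sensor streams per period, mixing exploitation of high-scoring streams with random exploration. The scoring and selection rule are illustrative assumptions, not HALE's heuristic random walk over the attribute-aware graph.

```python
# Minimal epsilon-greedy sketch of workload-bounded stream selection; the scores stand in
# for whatever evidence of rare events has been gathered from partial labels so far.
import random

def select_streams(scores, budget, epsilon=0.2):
    """Pick `budget` stream ids: most by highest score, a few uniformly at random."""
    n_explore = max(1, int(epsilon * budget))
    ranked = sorted(scores, key=scores.get, reverse=True)
    exploit = ranked[: budget - n_explore]
    remaining = [s for s in scores if s not in exploit]
    explore = random.sample(remaining, min(n_explore, len(remaining)))
    return exploit + explore

# Toy usage: 10 sensor streams, inspect 4 per period under the workload bound.
scores = {f"sensor_{i}": random.random() for i in range(10)}
chosen = select_streams(scores, budget=4)
print(chosen)
# After inspection, scores would be updated from the newly received partial labels.
```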
{"title":"An Explore-Exploit Workload-bounded Strategy for Rare Event Detection in Massive Energy Sensor Time Series","authors":"Lo Pang-Yun Ting, Rong Chao, Chai-Shi Chang, Kun-Ta Chuang","doi":"10.1145/3657641","DOIUrl":"https://doi.org/10.1145/3657641","url":null,"abstract":"<p>With the rise of Internet-of-Things devices, the analysis of sensor-generated energy time series data has become increasingly important. This is especially crucial for detecting rare events like unusual electricity usage or water leakages in residential and commercial buildings, which is essential for optimizing energy efficiency and reducing costs. However, existing detection methods on large-scale data may fail to correctly detect rare events when they do not behave significantly differently from standard events or when their attributes are non-stationary. Additionally, the capacity of computational resources to analyze all time series data generated by an increasing number of sensors becomes a challenge. This situation creates an emergent demand for a workload-bounded strategy. To ensure both effectiveness and efficiency in detecting rare events in massive energy time series, we propose a heuristic-based framework called <i>HALE</i>. This framework utilizes an explore-exploit selection process that is specifically designed to recognize potential features of rare events in energy time series. <i>HALE</i> involves constructing an attribute-aware graph to preserve the attribute information of rare events. A heuristic-based random walk is then derived based on partial labels received at each time period to discover the non-stationarity of rare events. Potential rare event data is selected from the attribute-aware graph, and existing detection models are applied for final confirmation. Our study, which was conducted on three actual energy datasets, demonstrates that the <i>HALE</i> framework is both effective and efficient in its detection capabilities. This underscores its practicality in delivering cost-effective energy monitoring services.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"16 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140615375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}