An important aspect of responsible recommendation systems is the transparency of the prediction mechanisms. This is a general challenge for deep-learning-based systems such as the currently predominant neural news recommender architectures, which are optimized to predict clicks by matching candidate news items against users’ reading histories. Such systems achieve state-of-the-art click-prediction performance, but the rationale for their decisions is difficult to assess. At the same time, the economic and societal impact of these systems makes such insights highly desirable.
In this paper, we ask to what extent the recommendations of current news recommender systems are actually based on content-related evidence from reading histories. We approach this question from an explainability perspective. Building on the concept of integrated gradients, we present a neural news recommender that can accurately attribute individual recommendations to news items and words in input reading histories while maintaining top-scoring click-prediction performance.
Using our method as a diagnostic tool, we find that (a) a substantial number of users’ clicks on news are not explainable from reading histories, and many history-explainable items are actually skipped; and (b) while many recommendations are based on content-related evidence in histories, for others the model does not attend to reasonable evidence, and recommendations stem from a spurious bias in user representations. Our code is publicly available.
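The attribution technique this abstract builds on, integrated gradients, can be sketched independently of the authors’ recommender. The following is a generic, dependency-light illustration (not the paper’s implementation); the quadratic toy function and the finite-difference gradient are assumptions made purely for the demo:

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        d = np.zeros_like(x, dtype=float)
        d.flat[i] = eps
        g.flat[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return g

def integrated_gradients(f, x, baseline, steps=200):
    """Midpoint-rule approximation of integrated gradients along the
    straight path from the baseline to the input x."""
    avg = np.zeros_like(x, dtype=float)
    for k in range(1, steps + 1):
        point = baseline + ((k - 0.5) / steps) * (x - baseline)
        avg += numerical_grad(f, point)
    avg /= steps
    # Attribution: path-averaged gradient scaled by the input offset.
    return (x - baseline) * avg

# Completeness axiom: attributions sum to f(x) - f(baseline).
f = lambda v: float((v ** 2).sum())
x = np.array([1.0, 2.0])
b = np.zeros(2)
attr = integrated_gradients(f, x, b)
print(abs(attr.sum() - (f(x) - f(b))) < 1e-4)
```

The completeness check at the end is what makes integrated gradients usable as a diagnostic: the attribution mass over history items accounts for the full score difference against the baseline.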
Lucas Möller, Sebastian Padó. “Explaining Neural News Recommendation with Attributions onto Reading Histories.” ACM Transactions on Intelligent Systems and Technology, June 18, 2024. DOI: 10.1145/3673233.
Qin Ni, Yangze Yu, Yiming Ma, Xin Lin, Ciping Deng, Tingjiang Wei, Mo Xuan
Large language models (LLMs) have shown impressive abilities in reasoning tasks, and theory of mind (ToM) has been tested in many studies as part of such tasks, but social learning, which is closely related to theory of mind, still lacks investigation. Moreover, existing test methods and materials make the test results unconvincing. We propose a dynamic gamified assessment (DGA) and a hierarchical social learning measurement to test the ToM and social learning capacities of LLMs. The ToM test consists of five parts. First, we extract ToM tasks from ToM experiments and then design game rules that satisfy the task requirements. After that, we design ToM questions that match the game rules and use them to generate test materials. Finally, we follow the above steps to test the model. To assess social learning ability, we introduce a novel set of social rules (three in total). Experimental results demonstrate that, except for GPT-4, LLMs performed poorly on the ToM test but showed a certain level of social learning ability in the social learning measurement.
Qin Ni, Yangze Yu, Yiming Ma, Xin Lin, Ciping Deng, Tingjiang Wei, Mo Xuan. “The Social Cognition Ability Evaluation of LLMs: A Dynamic Gamified Assessment and Hierarchical Social Learning Measurement Approach.” ACM Transactions on Intelligent Systems and Technology, June 18, 2024. DOI: 10.1145/3673238.
The Artificial Intelligence of Things (AIoT) is an emerging technology that enables numerous AIoT devices to participate in big data analytics and machine learning (ML) model training, providing various customized intelligent services for industrial manufacturing. Federated Learning (FL) empowers AIoT applications with privacy-preserving distributed model training without sharing raw data. However, due to IoT devices’ limited computing and memory resources, existing FL approaches for AIoT applications cannot support efficient large-scale model training. Federated synergy learning (FSyL) is a promising collaborative paradigm that alleviates the computation and communication overhead on resource-constrained AIoT devices by offloading part of the ML model to the edge server for end-to-edge collaborative training. Existing FSyL works neither efficiently address inter-round device selection to improve model diversity nor determine intra-round edge association to reduce training cost, which hinders the application of FSyL-enabled AIoT. Motivated by this issue, this paper first investigates the bottlenecks of executing FSyL in AIoT. It builds an optimization model of joint inter-round device selection and intra-round edge association for balancing model diversity and training cost. To tackle this intractable coupled problem, we present a framework named Online DEvice SelectIon and EdGe AssociatioN for Cost-Diversity Trade-offs FSyL (DESIGN). First, the edge association subproblem is extracted from the original problem, and game theory determines the optimal association decision for an arbitrary device selection. Then, based on the optimal association decision, device selection is modeled as a combinatorial multi-armed bandit (CMAB) problem. Finally, we propose an online mechanism to obtain joint device selection and edge association decisions. The performance of DESIGN is theoretically analyzed and experimentally evaluated on real-world datasets. The results show that DESIGN achieves up to 84.3% cost savings with an accuracy improvement of 23.6% compared with the state-of-the-art.
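The combinatorial multi-armed bandit formulation mentioned in the abstract can be illustrated with a generic combinatorial UCB loop: each round, select the k devices with the highest upper-confidence index. This is a sketch of the standard CUCB idea, not the DESIGN mechanism itself; the Bernoulli rewards and the exploration constant are assumptions for the demo:

```python
import math
import random

def cucb_select(counts, means, t, k):
    """Pick the k arms with the largest UCB index (unplayed arms first)."""
    def ucb(i):
        if counts[i] == 0:
            return float("inf")
        return means[i] + math.sqrt(1.5 * math.log(t) / counts[i])
    return sorted(range(len(counts)), key=ucb, reverse=True)[:k]

def run(true_means, k, rounds, seed=0):
    """Play k Bernoulli arms per round, tracking empirical means."""
    rng = random.Random(seed)
    n = len(true_means)
    counts, means = [0] * n, [0.0] * n
    for t in range(1, rounds + 1):
        for i in cucb_select(counts, means, t, k):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # running average
    return counts

# Arms 0 and 1 (the "good devices") should dominate the play counts.
counts = run([0.9, 0.8, 0.3, 0.2, 0.1], k=2, rounds=2000)
print(counts[0] > counts[2] and counts[1] > counts[2])
```

In the paper’s setting the per-round reward would additionally fold in the edge-association cost fixed by the game-theoretic subproblem.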
Shucun Fu, Fang Dong, Dian Shen, Runze Chen, Jiangshan Hao. “DESIGN: Online Device Selection and Edge Association for Federated Synergy Learning-enabled AIoT.” ACM Transactions on Intelligent Systems and Technology, June 15, 2024. DOI: 10.1145/3673237.
Given trajectories with gaps (i.e., missing data), we investigate algorithms to identify abnormal gaps in trajectories, which occur when a given moving object did not report its location while other moving objects in the same geographic region periodically did. The problem is important due to its societal applications, such as improving maritime safety and regulatory enforcement for global security concerns such as illegal fishing, illegal oil transfers, and trans-shipments. The problem is challenging due to the difficulty of bounding the possible locations of the moving object during a trajectory gap and the very high computational cost of detecting gaps in such a large volume of location data. The current literature on anomalous trajectory detection assumes linear interpolation within gaps, which may fail to detect abnormal gaps, since objects within a given region may have traveled away from their shortest path. In preliminary work, we introduced an abnormal gap measure that uses a classical space-time prism model to bound an object’s possible movement during the trajectory gap, and we provided a scalable memoized gap detection algorithm (Memo-AGD). In this paper, we propose a Space Time-Aware Gap Detection (STAGD) approach that leverages space-time indexing and merging of trajectory gaps. We also incorporate a Dynamic Region Merge-based (DRM) approach to efficiently compute gap abnormality scores. We provide theoretical proofs that both algorithms are correct and complete, and we analyze their asymptotic time complexity. Experimental results on synthetic and real-world maritime trajectory data show that the proposed approach substantially improves computation time over the baseline technique.
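The space-time prism bound referenced above has a simple spatial footprint: a location can be visited during a gap only if the detour through it fits in the time budget at the object’s maximum speed (an ellipse with the gap endpoints as foci). A minimal sketch of that membership test — the ship coordinates, times, and speed below are invented for illustration:

```python
import math

def in_prism_footprint(p, start, end, t_start, t_end, v_max):
    """True iff point p lies in the spatial projection of the
    space-time prism: the detour start -> p -> end is feasible
    within the gap duration at maximum speed v_max."""
    detour = math.dist(start, p) + math.dist(p, end)
    return detour <= v_max * (t_end - t_start)

# A vessel disappears at (0, 0) and reappears at (10, 0) six hours
# later, with a top speed of 2 units/hour -> 12 units of travel budget.
print(in_prism_footprint((5.0, 1.0), (0.0, 0.0), (10.0, 0.0), 0.0, 6.0, 2.0))
print(in_prism_footprint((5.0, 8.0), (0.0, 0.0), (10.0, 0.0), 0.0, 6.0, 2.0))
```

Points failing this test are provably unreachable, which is what lets a gap measure flag detours (e.g., an unreported rendezvous) that linear interpolation would miss.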
Arun Sharma, Subhankar Ghosh, Shashi Shekhar. “Physics-based Abnormal Trajectory Gap Detection.” ACM Transactions on Intelligent Systems and Technology, June 15, 2024. DOI: 10.1145/3673235.
Lina Yao, Julian McAuley, Xianzhi Wang, D. Jannach
yet promising field of responsible recommender systems. They represent some of the most recent progress in advancing responsible recommender systems research in the four directions below.
Lina Yao, Julian McAuley, Xianzhi Wang, D. Jannach. “Special Issue on Responsible Recommender Systems Part 1.” ACM Transactions on Intelligent Systems and Technology, June 15, 2024. DOI: 10.1145/3663528.
Monika Choudhary, S. Chouhan, Santosh Singh Rathore
User-Generated Content (UGC) is becoming increasingly prevalent on various digital platforms. The content generated on social media, review forums, and question-answer platforms reaches a large audience and influences their political, social, and other cognitive abilities. Traditional credibility assessment mechanisms involve assessing the credibility of the source and the text. However, with the growth in the ways user content can be generated and shared (audio, video, images), multimodal representation of user-generated content has become increasingly popular. This paper reviews the credibility assessment of UGC in various domains, particularly the identification of fake news, suspicious profiles, and fake reviews and testimonials, focusing on both the textual content and the source of the content creator. Next, the concept of multimodal credibility assessment is presented, which includes audio, video, and images in addition to text. The paper then presents a systematic review and comprehensive analysis of work on the credibility assessment of UGC that considers multimodal features. Additionally, the paper provides extensive details on the publicly available multimodal datasets for the credibility assessment of UGC. Finally, the research gaps, challenges, and future directions in assessing the credibility of multimodal user-generated content are presented.
Monika Choudhary, S. Chouhan, Santosh Singh Rathore. “Beyond Text: Multimodal Credibility Assessment Approaches for Online User-Generated Content.” ACM Transactions on Intelligent Systems and Technology, June 14, 2024. DOI: 10.1145/3673236.
Fudan Yu, Guozhen Zhang, Haotian Wang, Depeng Jin, Yong Li
Recovering the fine-grained working process of couriers is becoming one of the essential problems in improving express delivery systems, because knowing in detail how couriers accomplish their daily work facilitates analyzing, understanding, and optimizing the working procedure. Although coarse-grained courier trajectories and waybill delivery time data can be collected, this problem is still challenging due to noisy data with spatio-temporal biases, the lack of ground truth for couriers’ fine-grained behaviors, and complex correlations between behaviors. Existing works typically focus on a single dimension of the process, such as inferring the delivery time, and can only yield results of low spatio-temporal resolution, which cannot address the problem well. To bridge the gap, we propose a digital-twin-based iterative calibration system (DTRec) for fine-grained courier working process recovery. We first propose a spatio-temporal bias correction algorithm, which systematically improves existing methods for correcting waybill addresses and trajectory stay points. Second, to model the complex correlations among behaviors and the inherent physical constraints, we propose an agent-based model to build the digital twin of couriers. Third, to further improve recovery performance, we design a digital-twin-based iterative calibration framework, which leverages the inconsistency between the deduction results of the digital twin and the recovery results from real-world data to improve both the agent-based model and the recovery results. Experiments show that DTRec outperforms state-of-the-art baselines by 10.8% in terms of fine-grained accuracy on real-world datasets. The system is deployed in industrial practice at JD Logistics with promising applications. The code is available at https://github.com/tsinghua-fib-lab/Courier-DTRec.
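The iterative-calibration idea — nudge the digital twin’s parameters until its deductions agree with what is recovered from real data — can be sketched on a single scalar parameter. The linear "twin" and the step size below are assumptions for the demo, not DTRec’s agent-based model:

```python
def calibrate(simulate, observed, theta, lr=0.05, iters=30):
    """Iteratively adjust parameter theta so that the twin's
    simulated output matches the observed value."""
    for _ in range(iters):
        residual = observed - simulate(theta)  # twin-vs-data inconsistency
        theta += lr * residual                  # shrink the inconsistency
    return theta

# Toy twin: total route duration = 2.0 units of driving plus
# 10 stops at theta time units of service each.
simulate = lambda theta: 2.0 + 10 * theta

# Observed duration 12.0 implies a true per-stop time of 1.0.
theta = calibrate(simulate, 12.0, theta=0.1)
print(round(theta, 3))
```

The same loop structure scales up conceptually: DTRec’s framework closes the loop between the twin’s deductions and the data-driven recovery, improving both sides each iteration.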
Fudan Yu, Guozhen Zhang, Haotian Wang, Depeng Jin, Yong Li. “Fine-grained Courier Delivery Behavior Recovery with a Digital Twin Based Iterative Calibration Framework.” ACM Transactions on Intelligent Systems and Technology, June 13, 2024. DOI: 10.1145/3663484.
Peter Carragher, Evan M. Williams, Kathleen M. Carley
The proliferation of unreliable news domains on the internet has had wide-reaching negative impacts on society. We introduce and evaluate interventions aimed at reducing traffic to unreliable news domains from search engines while maintaining traffic to reliable domains. We build these interventions on the principles of fairness (penalize sites for what is in their control), generality (label- and fact-check-agnostic), targetedness (increase the cost of adversarial behavior), and scalability (works at web scale). We refine our methods on small-scale web data as a testbed and then generalize the interventions to a large-scale webgraph containing 93.9M domains and 1.6B edges. We demonstrate that our methods penalize unreliable domains far more than reliable domains in both settings, and we explore multiple avenues to mitigate unintended effects in both the small-scale and large-scale webgraph experiments. These results indicate the potential of our approach to reduce the spread of misinformation and foster a more reliable online information ecosystem. This research contributes to the development of targeted strategies to enhance the trustworthiness and quality of search engine results, ultimately benefiting users and the broader digital community.
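One generic webgraph-based intervention in this spirit (illustrative only, not necessarily the authors’ method) is to down-weight flagged domains in the teleport vector of PageRank, which suppresses both their rank and the rank they pass along their links. The three-node graph and its weights are invented for the demo:

```python
def pagerank(graph, teleport, damping=0.85, iters=50):
    """Power iteration with a non-uniform teleport vector.
    Assumes every node has at least one out-link (no dangling mass)."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new[m] += share
        rank = new
    return rank

graph = {"reliable": ["hub"], "hub": ["reliable", "junk"], "junk": ["hub"]}
uniform = {n: 1 / 3 for n in graph}
# Intervention: shift teleport mass away from the flagged domain.
penalized = {"reliable": 0.45, "hub": 0.45, "junk": 0.10}
r0 = pagerank(graph, uniform)
r1 = pagerank(graph, penalized)
print(r1["junk"] < r0["junk"])
```

Because the penalty acts on the teleport distribution rather than on individual labels, it stays label-agnostic downstream: any scoring of domains can feed the teleport weights.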
Peter Carragher, Evan M. Williams, Kathleen M. Carley. “Misinformation Resilient Search Rankings with Webgraph-based Interventions.” ACM Transactions on Intelligent Systems and Technology, June 6, 2024. DOI: 10.1145/3670410.
As online social networks (OSNs) become more prevalent, a new paradigm for problem-solving through crowd-sourcing has emerged. By leveraging the OSN platforms, users can post a problem to be solved and then form a team to collaborate and solve the problem. A common concern in OSNs is how to form effective collaborative teams, as various tasks are completed through online collaborative networks. A team’s diversity in expertise has received high attention to producing high team performance in developing team formation (TF) algorithms. However, the effect of team diversity on performance under different types of tasks has not been extensively studied. Another important issue is how to balance the need to preserve individuals’ privacy with the need to maximize performance through active collaboration, as these two goals may conflict with each other. This research has not been actively studied in the literature. In this work, we develop a team formation (TF) algorithm in the context of OSNs that can maximize team performance and preserve team members’ privacy under different types of tasks. Our proposed PRivAcy-Diversity-Aware Team Formation framework, called PRADA-TF, is based on trust relationships between users in OSNs where trust is measured based on a user’s expertise and privacy preference levels. The PRADA-TF algorithm considers the team members’ domain expertise, privacy preferences, and the team’s expertise diversity in the process of team formation. Our approach employs game-theoretic principles Mechanism Design to motivate self-interested individuals within a team formation context, positioning the mechanism designer as the pivotal team leader responsible for assembling the team. We use two real-world datasets (i.e., Netscience and IMDb) to generate different semi-synthetic datasets for constructing trust networks using a belief model (i.e., Subjective Logic) and identifying trustworthy users as candidate team members. 
We evaluate the effectiveness of our proposed PRADA-TF scheme in four variants against three baseline methods in the literature. Our analysis focuses on three performance metrics for studying OSNs: social welfare, privacy loss, and team diversity.
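The core trade-off described above, rewarding expertise diversity and trust while penalizing privacy loss during team assembly, can be sketched as a simple greedy selection. This is a minimal illustration only, not the PRADA-TF algorithm itself: the scoring formula, weights, and candidate fields (`expertise`, `trust`, `privacy_pref`) are assumptions for demonstration.

```python
def team_score(team, w_div=1.0, w_priv=1.0):
    """Score a candidate team: reward distinct expertise domains and average
    trust, penalize accumulated privacy loss. Each member is a dict with
    'expertise' (set of domains), 'trust' (0..1), and 'privacy_pref'
    (0..1, higher = more privacy-sensitive)."""
    domains = set().union(*(m["expertise"] for m in team))
    diversity = len(domains)                       # distinct domains covered
    avg_trust = sum(m["trust"] for m in team) / len(team)
    privacy_loss = sum(m["privacy_pref"] for m in team)
    return w_div * diversity + avg_trust - w_priv * privacy_loss

def greedy_team(candidates, k):
    """Greedily add the candidate that most improves the team score."""
    team, pool = [], list(candidates)
    while len(team) < k and pool:
        best = max(pool, key=lambda m: team_score(team + [m]))
        team.append(best)
        pool.remove(best)
    return team

# Hypothetical candidate pool for illustration.
candidates = [
    {"name": "a", "expertise": {"ml"},     "trust": 0.9,  "privacy_pref": 0.2},
    {"name": "b", "expertise": {"ml"},     "trust": 0.7,  "privacy_pref": 0.1},
    {"name": "c", "expertise": {"nlp"},    "trust": 0.7,  "privacy_pref": 0.3},
    {"name": "d", "expertise": {"vision"}, "trust": 0.6,  "privacy_pref": 0.9},
]
team = greedy_team(candidates, 2)
print(sorted(m["name"] for m in team))  # a diverse, trusted, low-privacy-cost pair
```

Note how the second pick prefers a member covering a new domain over a redundant expert with slightly higher trust; this is the kind of diversity-aware behavior the abstract describes, though the actual framework additionally models tasks, trust networks, and mechanism-design incentives.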
{"title":"Privacy-Preserving and Diversity-Aware Trust-based Team Formation in Online Social Networks","authors":"Yash Mahajan, Jin-Hee Cho, Ing-Ray Chen","doi":"10.1145/3670411","DOIUrl":"https://doi.org/10.1145/3670411","url":null,"abstract":"<p>As online social networks (OSNs) become more prevalent, a new paradigm for problem-solving through crowd-sourcing has emerged. By leveraging the OSN platforms, users can post a problem to be solved and then form a team to collaborate and solve the problem. A common concern in OSNs is how to form effective collaborative teams, as various tasks are completed through online collaborative networks. A team’s diversity in expertise has received high attention to producing high team performance in developing team formation (TF) algorithms. However, the effect of team diversity on performance under different types of tasks has not been extensively studied. Another important issue is how to balance the need to preserve individuals’ privacy with the need to maximize performance through active collaboration, as these two goals may conflict with each other. This research has not been actively studied in the literature. In this work, we develop a team formation (TF) algorithm in the context of OSNs that can maximize team performance and preserve team members’ privacy under different types of tasks. Our proposed <underline>PR</underline>iv<underline>A</underline>cy-<underline>D</underline>iversity-<underline>A</underline>ware <underline>T</underline>eam <underline>F</underline>ormation framework, called <monospace>PRADA-TF</monospace>, is based on trust relationships between users in OSNs where trust is measured based on a user’s expertise and privacy preference levels. The PRADA-TF algorithm considers the team members’ domain expertise, privacy preferences, and the team’s expertise diversity in the process of team formation. 
Our approach employs game-theoretic principles <i>Mechanism Design</i> to motivate self-interested individuals within a team formation context, positioning the mechanism designer as the pivotal team leader responsible for assembling the team. We use two real-world datasets (i.e., Netscience and IMDb) to generate different semi-synthetic datasets for constructing trust networks using a belief model (i.e., Subjective Logic) and identifying trustworthy users as candidate team members. We evaluate the effectiveness of our proposed <monospace>PRADA-TF</monospace> scheme in four variants against three baseline methods in the literature. Our analysis focuses on three performance metrics for studying OSNs: social welfare, privacy loss, and team diversity.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141256078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Moshe Levy, Guy Amit, Yuval Elovici, Yisroel Mirsky
Adversarial transferability in blackbox scenarios presents a unique challenge: while attackers can employ surrogate models to craft adversarial examples, they lack assurance on whether these examples will successfully compromise the target model. Until now, the prevalent method to ascertain success has been trial and error—testing crafted samples directly on the victim model. This approach, however, risks detection with every attempt, forcing attackers to either perfect their first try or face exposure.
Our paper introduces a ranking strategy that refines the transfer attack process, enabling the attacker to estimate the likelihood of success without repeated trials on the victim’s system. By leveraging a set of diverse surrogate models, our method can predict the transferability of adversarial examples. This strategy can be used either to select the best sample to use in an attack or to choose the best perturbation to apply to a specific sample.
Using our strategy, we were able to raise the transferability of adversarial examples from a mere 20%—akin to random selection—up to near upper-bound levels, with some scenarios even witnessing a 100% success rate. This substantial improvement not only sheds light on the shared susceptibilities across diverse architectures but also demonstrates that attackers can forgo detectable trial-and-error tactics, thereby increasing the threat of surrogate-based attacks.
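The ranking idea can be sketched as scoring each adversarial candidate by the fraction of surrogate models it fools and ranking candidates by that score. This is a minimal stand-in for the paper’s strategy, not its actual implementation; the voting scheme, the `surrogate_vote_rank` name, and the toy threshold surrogates are assumptions for demonstration.

```python
def surrogate_vote_rank(examples, true_labels, surrogates):
    """Score each adversarial candidate by the fraction of surrogate models
    it fools (prediction differs from the true label), then return candidate
    indices ranked best-first. Candidates that fool more surrogates are
    assumed more likely to transfer to the unseen victim model."""
    scores = []
    for x, y in zip(examples, true_labels):
        fooled = sum(1 for model in surrogates if model(x) != y)
        scores.append(fooled / len(surrogates))
    order = sorted(range(len(examples)), key=lambda i: -scores[i])
    return order, scores

# Toy surrogates: 1-D threshold classifiers with slightly different boundaries,
# mimicking diverse architectures that share a decision region.
surrogates = [lambda x, t=t: int(x > t) for t in (0.4, 0.5, 0.6)]
examples = [0.45, 0.65, 0.30]       # candidate adversarial inputs
true_labels = [0, 0, 0]             # all originally classified as class 0

order, scores = surrogate_vote_rank(examples, true_labels, surrogates)
print(order, scores)                # best candidate first: fools all surrogates
```

The attacker would then submit only the top-ranked candidate to the victim, replacing repeated trial-and-error queries with a single high-confidence attempt.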
{"title":"Ranking the Transferability of Adversarial Examples","authors":"Moshe Levy, Guy Amit, Yuval Elovici, Yisroel Mirsky","doi":"10.1145/3670409","DOIUrl":"https://doi.org/10.1145/3670409","url":null,"abstract":"<p>Adversarial transferability in blackbox scenarios presents a unique challenge: while attackers can employ surrogate models to craft adversarial examples, they lack assurance on whether these examples will successfully compromise the target model. Until now, the prevalent method to ascertain success has been trial and error—testing crafted samples directly on the victim model. This approach, however, risks detection with every attempt, forcing attackers to either perfect their first try or face exposure.</p><p>Our paper introduces a ranking strategy that refines the transfer attack process, enabling the attacker to estimate the likelihood of success without repeated trials on the victim’s system. By leveraging a set of diverse surrogate models, our method can predict transferability of adversarial examples. This strategy can be used to either select the best sample to use in an attack or the best perturbation to apply to a specific sample.</p><p>Using our strategy, we were able to raise the transferability of adversarial examples from a mere 20%—akin to random selection—up to near upper-bound levels, with some scenarios even witnessing a 100% success rate. 
This substantial improvement not only sheds light on the shared susceptibilities across diverse architectures but also demonstrates that attackers can forgo detectable trial-and-error tactics, thereby increasing the threat of surrogate-based attacks.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141256351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}