Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval: Latest Articles

Enhancing Graph Neural Networks for Recommender Systems

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401456

Siwei Liu

Recommender systems lie at the heart of many online services such as E-commerce, social media platforms and advertising. To keep users engaged and satisfied with the displayed items, recommender systems usually use the users' historical interactions containing their interests and purchase habits to make personalised recommendations. Recently, Graph Neural Networks (GNNs) have emerged as a technique that can effectively learn representations from structured graph data. By treating the traditional user-item interaction matrix as a bipartite graph, many existing graph-based recommender systems (GBRS) have been shown to achieve state-of-the-art performance when employing GNNs. However, the existing GBRS approaches still have several limitations, which prevent the GNNs from achieving their full potential. In this work, we propose to enhance the performance of the GBRS approaches along several research directions, namely leveraging additional items and users' side information, extending the existing undirected graphs to account for social influence among users, and enhancing their underlying optimisation criterion. In the following, we describe these proposed research directions.
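The idea of treating the user-item interaction matrix as a bipartite graph can be sketched as follows. This is only an illustrative sketch, not the method proposed in the work: the node names (`u0`, `i2`), the helper names `build_bipartite_graph` and `propagate`, and the use of plain mean aggregation as a stand-in for a GNN layer are all assumptions made for the example.

```python
from typing import Dict, List


def build_bipartite_graph(interactions: List[List[int]]) -> Dict[str, List[str]]:
    """Turn a binary user-item matrix into an undirected bipartite adjacency list.

    Rows are users and columns are items; a non-zero entry means the user
    interacted with the item. Node names are hypothetical labels.
    """
    n_users = len(interactions)
    n_items = len(interactions[0]) if interactions else 0
    graph: Dict[str, List[str]] = {f"u{u}": [] for u in range(n_users)}
    graph.update({f"i{i}": [] for i in range(n_items)})
    for u, row in enumerate(interactions):
        for i, value in enumerate(row):
            if value:
                # Each interaction becomes one edge, stored in both directions.
                graph[f"u{u}"].append(f"i{i}")
                graph[f"i{i}"].append(f"u{u}")
    return graph


def propagate(graph: Dict[str, List[str]],
              embeddings: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """One round of mean-neighbour aggregation (a simplified stand-in for a GNN layer)."""
    updated: Dict[str, List[float]] = {}
    for node, neighbours in graph.items():
        if not neighbours:
            # Isolated nodes keep their current embedding.
            updated[node] = list(embeddings[node])
            continue
        dim = len(embeddings[node])
        updated[node] = [
            sum(embeddings[n][d] for n in neighbours) / len(neighbours)
            for d in range(dim)
        ]
    return updated


# Two users, three items: user 0 interacted with items 0 and 2, user 1 with items 1 and 2.
interactions = [[1, 0, 1],
                [0, 1, 1]]
graph = build_bipartite_graph(interactions)
```

After one call to `propagate`, each user embedding is the average of the embeddings of the items that user interacted with, and vice versa, which is the basic message-passing intuition behind graph-based recommenders.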


Citations: 2

Copula Guided Neural Topic Modelling for Short Texts

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401245

Lihui Lin, Hongyu Jiang, Yanghui Rao

Extracting the topical information from documents is important for public opinion analysis, text classification, and information retrieval tasks. Compared with identifying a wide variety of topics from long documents, it is challenging to generate a concentrated topic distribution for each short message. Although this problem can be tackled by adjusting the hyper-parameters in traditional topic models such as Latent Dirichlet Allocation, it remains an open problem in neural topic modelling. In this paper, we focus on adapting the popular Auto-Encoding Variational Bayes based neural topic models to short texts, by exploring the Archimedean copulas to guide the estimated topic distributions derived from linear projected samples of re-parameterized posterior distributions. Experimental results show the superiority of our method when compared with existing neural topic models in terms of perplexity, topic coherence, and classification accuracy.

Citations: 14

Efficient Graph Query Processing over Geo-Distributed Datacenters

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401157

Ye Yuan, Delong Ma, Z. Wen, Yuliang Ma, Guoren Wang, Lei Chen

Graph queries have emerged as one of the fundamental techniques to support modern search services, such as PageRank web search, social networking search and knowledge graph search. As such graphs are maintained globally and are very large (e.g., billions of nodes), we need to efficiently process graph queries across multiple geographically distributed datacenters, that is, to run geo-distributed graph queries. Existing graph computing frameworks may not work well for geographically distributed datacenters, because they implement a Bulk Synchronous Parallel model that requires excessive inter-datacenter transfers, thereby introducing extremely large latency for query processing. In this paper, we propose GeoGraph, a universal framework to support efficient geo-distributed graph query processing based on clustering datacenters and meta-graph, while reducing the inter-datacenter communication. Our new framework can be applied to many types of graph algorithms without any modification. The framework is developed on top of Apache Giraph. The experiments were conducted by applying four important graph queries, i.e., shortest path, graph keyword search, subgraph isomorphism and PageRank. The evaluation results show that our proposed framework can achieve up to 82% faster convergence, 42% lower WAN bandwidth usage, and 45% less total monetary cost for the four graph queries, with input graphs stored across ten geo-distributed datacenters.
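To make the Bulk Synchronous Parallel pattern mentioned above concrete, here is a minimal single-machine sketch of PageRank written as a sequence of supersteps. It is not GeoGraph and does not use the Giraph API; the function name `pagerank`, the damping value and the fixed superstep count are assumptions chosen for illustration. In every superstep each vertex sends a message along each out-edge, and only after all messages are collected (the barrier) are the ranks updated. In a geo-distributed deployment, any such message whose endpoints live in different datacenters would cross the WAN, which is the cost the abstract refers to.

```python
from typing import Dict, List


def pagerank(graph: Dict[str, List[str]],
             damping: float = 0.85,
             supersteps: int = 50) -> Dict[str, float]:
    """PageRank in a superstep style: send messages, wait at a barrier, update.

    `graph` maps every vertex to its list of out-neighbours; every neighbour
    must also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(supersteps):
        # Message phase: every vertex splits its rank evenly over its out-edges.
        incoming = {v: 0.0 for v in nodes}
        dangling = 0.0
        for v in nodes:
            out_edges = graph[v]
            if out_edges:
                share = rank[v] / len(out_edges)
                for w in out_edges:
                    incoming[w] += share
            else:
                # Vertices without out-edges spread their rank over all vertices.
                dangling += rank[v]
        # Barrier reached: all messages are delivered, so ranks can be updated.
        rank = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in nodes
        }
    return rank


# Hypothetical toy graph: "c" links to "a", while "a" and "b" link to each other.
toy_graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
toy_ranks = pagerank(toy_graph)
```

Because every superstep ends with a global barrier, the slowest cross-datacenter transfer determines the duration of the whole step, which is why reducing inter-datacenter messages matters for latency.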


Citations: 6

Predicting Perceptual Speed from Search Behaviour

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401210

Olivia Foulds, Alessandro Suglia, L. Azzopardi, Martin Halvey

Perceptual Speed (PS) is a cognitive ability that is known to affect multiple factors in Information Retrieval (IR) such as a user's search performance and subjective experience. However, PS tests are difficult to administer, which limits the design of user-adaptive systems that can automatically infer PS to appropriately accommodate low-PS users. Consequently, this paper evaluated whether PS can be automatically classified from search behaviour using several machine learning models trained on features extracted from TREC Common Core search task logs. Our results are encouraging: given a user's interactions from one query, a Decision Tree was able to predict a user's PS as low or high with 86% accuracy. Additionally, we identified different behavioural components for specific PS tests, implying that each PS test measures different aspects of a person's cognitive ability. These findings motivate further work on how best to design search systems that can adapt to individual differences.

Citations: 0

Web Table Retrieval using Multimodal Deep Learning

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401120

Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim

We address the web table retrieval task, aiming to retrieve and rank web tables as whole answers to a given information need. To this end, we formally define web tables as multimodal objects. We then suggest a neural ranking model, termed MTR, which makes novel use of Gated Multimodal Units (GMUs) to learn a joint representation of the query and the different table modalities. We further enhance this model with a co-learning approach which utilizes automatically learned query-independent and query-dependent "helper" labels. We evaluate the proposed solution using both ad hoc queries (WikiTables) and natural language questions (GNQtables). Overall, we demonstrate that our approach surpasses the performance of previously studied state-of-the-art baselines.
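As a rough illustration of what a Gated Multimodal Unit computes, the sketch below implements the commonly described two-modality form: each modality is projected through a tanh layer, and a sigmoid gate computed from the concatenated inputs decides how much of each projection enters the fused vector. How MTR actually wires its GMUs across the query and table modalities is not stated in the abstract, so the function name, the weight matrices `W_a`, `W_b`, `W_z` and the toy inputs are all assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_multimodal_unit(x_a: Vector, x_b: Vector,
                          W_a: Matrix, W_b: Matrix, W_z: Matrix) -> Vector:
    """Fuse two modality vectors into one hidden vector.

    h_a = tanh(W_a x_a), h_b = tanh(W_b x_b), z = sigmoid(W_z [x_a; x_b]),
    output = z * h_a + (1 - z) * h_b (element-wise).
    """
    h_a = [math.tanh(v) for v in matvec(W_a, x_a)]
    h_b = [math.tanh(v) for v in matvec(W_b, x_b)]
    # The gate sees both raw inputs, concatenated into one vector.
    z = [sigmoid(v) for v in matvec(W_z, x_a + x_b)]
    return [g * ha + (1.0 - g) * hb for g, ha, hb in zip(z, h_a, h_b)]


# Toy example with hypothetical weights: a zero gate matrix gives z = 0.5,
# so the output is the plain average of the two projections.
fused = gated_multimodal_unit(
    x_a=[1.0], x_b=[0.0, 0.0],
    W_a=[[1.0]], W_b=[[1.0, 1.0]], W_z=[[0.0, 0.0, 0.0]],
)
```

The gate is learned per output dimension, so the unit can rely on one modality for some features and on the other modality for the rest.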

Citations: 26

A Heterogeneous Graph Neural Model for Cold-start Recommendation

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401252

Siwei Liu, I. Ounis, C. Macdonald, Zaiqiao Meng

The users' historical interactions usually contain their interests and purchase habits, based on which personalised recommendations can be made. However, such user interactions are often sparse, leading to the well-known cold-start problem when a user has no or very few interactions. In this paper, we propose a new recommendation model, named Heterogeneous Graph Neural Recommender (HGNR), to tackle the cold-start problem while ensuring effective recommendations for all users. Our HGNR model learns user and item embeddings by using the Graph Convolutional Network based on a heterogeneous graph, which is constructed from user-item interactions, social links and semantic links predicted from the social network and textual reviews. Our extensive empirical experiments on three public datasets demonstrate that HGNR significantly outperforms competitive baselines in terms of the Normalised Discounted Cumulative Gain and Hit Ratio measures.
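The two evaluation measures named above can be sketched for a single user as follows. The abstract does not specify the exact variants or cut-offs used in the paper, so this sketch assumes binary relevance, a cut-off `k`, and the usual logarithmic discount; the function names are hypothetical.

```python
import math
from typing import Sequence, Set


def hit_ratio_at_k(ranked_items: Sequence[str], relevant_items: Set[str], k: int) -> float:
    """Return 1.0 if any relevant item appears in the top-k list, else 0.0."""
    return 1.0 if any(item in relevant_items for item in ranked_items[:k]) else 0.0


def ndcg_at_k(ranked_items: Sequence[str], relevant_items: Set[str], k: int) -> float:
    """Normalised Discounted Cumulative Gain with binary relevance.

    A relevant item at 0-based position p contributes 1 / log2(p + 2);
    the sum is divided by the best achievable value for this user.
    """
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for one user whose held-out item is "b".
ranking = ["a", "b", "c", "d"]
hr = hit_ratio_at_k(ranking, {"b"}, k=3)
ndcg = ndcg_at_k(ranking, {"b"}, k=3)
```

Hit Ratio only records whether the held-out item was retrieved at all, whereas NDCG also rewards placing it nearer the top; reported scores are typically averages of these per-user values.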

Citations: 70

Legal Intelligence: Algorithmic, Data, and Social Challenges

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401466

Changlong Sun, Yating Zhang, Xiaozhong Liu, Fei Wu

In the digital era, information retrieval, text/knowledge mining, and NLP techniques are playing increasingly vital roles in the legal domain. While open datasets and innovative deep learning methodologies offer great potential, efforts are needed in the legal domain to transfer theoretical/algorithmic models into real applications that assist users, lawyers, judges and other legal professionals in solving real problems. The objective of this workshop is to aggregate studies/applications of text mining/retrieval and NLP automation in the context of classical/novel legal tasks, which address algorithmic, data and social challenges of legal intelligence. Keynote and invited presentations from industry and academia will help fill the gap between ambition and execution in the legal domain.

Citations: 5

Regional Relation Modeling for Visual Place Recognition

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401176

Yingying Zhu, Biao Li, Jiong Wang, Zhou Zhao

In the process of visual perception, humans perceive not only the appearance of objects existing in a place but also their relationships (e.g. spatial layout). However, the dominant works on visual place recognition are always based on the assumption that two images depict the same place if they contain enough similar objects, while the relation information is neglected. In this paper, we propose a regional relation module which models the regional relationships and converts the convolutional feature maps to the relational feature maps. We further design a cascaded pooling method to get discriminative relation descriptors by preventing the influence of confusing relations and preserving as much useful information as possible. Extensive experiments on two place recognition benchmarks demonstrate that training with the proposed regional relation module improves the appearance descriptors and the relation descriptors are complementary to appearance descriptors. When these two kinds of descriptors are concatenated together, the resulting combined descriptors outperform the state-of-the-art methods.


Citations: 4

Knowledge Graph-based Event Embedding Framework for Financial Quantitative Investments

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401427

Dawei Cheng, Fangzhou Yang, Xiaoyang Wang, Ying Zhang, Liqing Zhang

Event representative learning aims to embed news events into continuous space vectors for capturing syntactic and semantic information from a text corpus, which benefits event-driven quantitative investments. However, the financial market reaction to events is also influenced by the lead-lag effect, which is driven by internal relationships. Therefore, in this paper, we present a knowledge graph-based event embedding framework for quantitative investments. In particular, we first extract structured events from raw texts, and construct the knowledge graph with the mentioned entities and relations simultaneously. Then, we leverage a joint model to merge the knowledge graph information into the objective function of an event embedding learning model. The learned representations are fed as inputs to downstream quantitative trading methods. Extensive experiments on a real-world dataset demonstrate the effectiveness of the event embeddings learned from financial news and knowledge graphs. We also deploy the framework for quantitative algorithmic trading. The accumulated portfolio return contributed by our method significantly outperforms other baselines.


Citations: 46

Multi-grouping Robust Fair Ranking

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pub Date : 2020-07-25 DOI: 10.1145/3397271.3401292

Thibaut Thonet, J. Renders

Rankings are at the core of countless modern applications and thus play a major role in various decision-making scenarios. When such rankings are produced by data-informed, machine learning-based algorithms, the potentially harmful biases contained in the data and algorithms are likely to be reproduced and even exacerbated. This motivated recent research to investigate a methodology for fair ranking, as a way to correct the aforementioned biases. Current approaches to fair ranking consider that the protected groups, i.e., the partition of the population potentially impacted by the biases, are known. However, in a realistic scenario, this assumption might not hold, as different biases may lead to different partitioning into protected groups. Only accounting for one such partition (i.e., grouping) would still lead to potential unfairness with respect to the other possible groupings. Therefore, in this paper, we study the problem of designing fair ranking algorithms without knowing in advance the groupings that will be used later to assess their fairness. The approach that we follow is to rely on a carefully chosen set of groupings when deriving the ranked lists, and we empirically investigate which selection strategies are the most effective. An efficient two-step greedy brute-force method is also proposed to embed our strategy. As a benchmark for this study, we adopted the dataset and setting of the TREC 2019 Fair Ranking track.


Citations: 3
