Recommender systems lie at the heart of many online services such as E-commerce, social media platforms and advertising. To keep users engaged and satisfied with the displayed items, recommender systems usually use the users' historical interactions containing their interests and purchase habits to make personalised recommendations. Recently, Graph Neural Networks (GNNs) have emerged as a technique that can effectively learn representations from structured graph data. By treating the traditional user-item interaction matrix as a bipartite graph, many existing graph-based recommender systems (GBRS) have been shown to achieve state-of-the-art performance when employing GNNs. However, the existing GBRS approaches still have several limitations, which prevent the GNNs from achieving their full potential. In this work, we propose to enhance the performance of the GBRS approaches along several research directions, namely leveraging additional items and users' side information, extending the existing undirected graphs to account for social influence among users, and enhancing their underlying optimisation criterion. In the following, we describe these proposed research directions.
{"title":"Enhancing Graph Neural Networks for Recommender Systems","authors":"Siwei Liu","doi":"10.1145/3397271.3401456","DOIUrl":"https://doi.org/10.1145/3397271.3401456","url":null,"abstract":"Recommender systems lie at the heart of many online services such as E-commerce, social media platforms and advertising. To keep users engaged and satisfied with the displayed items, recommender systems usually use the users' historical interactions containing their interests and purchase habits to make personalised recommendations. Recently, Graph Neural Networks (GNNs) have emerged as a technique that can effectively learn representations from structured graph data. By treating the traditional user-item interaction matrix as a bipartite graph, many existing graph-based recommender systems (GBRS) have been shown to achieve state-of-the-art performance when employing GNNs. However, the existing GBRS approaches still have several limitations, which prevent the GNNs from achieving their full potential. In this work, we propose to enhance the performance of the GBRS approaches along several research directions, namely leveraging additional items and users' side information, extending the existing undirected graphs to account for social influence among users, and enhancing their underlying optimisation criterion. In the following, we describe these proposed research directions.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133208495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extracting the topical information from documents is important for public opinion analysis, text classification, and information retrieval tasks. Compared with identifying a wide variety of topics from long documents, it is challenging to generate a concentrated topic distribution for each short message. Although this problem can be tackled by adjusting the hyper-parameters in traditional topic models such as Latent Dirichlet Allocation, it remains an open problem in neural topic modelling. In this paper, we focus on adapting the popular Auto-Encoding Variational Bayes based neural topic models to short texts, by exploring the Archimedean copulas to guide the estimated topic distributions derived from linear projected samples of re-parameterized posterior distributions. Experimental results show the superiority of our method when compared with existing neural topic models in terms of perplexity, topic coherence, and classification accuracy.
{"title":"Copula Guided Neural Topic Modelling for Short Texts","authors":"Lihui Lin, Hongyu Jiang, Yanghui Rao","doi":"10.1145/3397271.3401245","DOIUrl":"https://doi.org/10.1145/3397271.3401245","url":null,"abstract":"Extracting the topical information from documents is important for public opinion analysis, text classification, and information retrieval tasks. Compared with identifying a wide variety of topics from long documents, it is challenging to generate a concentrated topic distribution for each short message. Although this problem can be tackled by adjusting the hyper-parameters in traditional topic models such as Latent Dirichlet Allocation, it remains an open problem in neural topic modelling. In this paper, we focus on adapting the popular Auto-Encoding Variational Bayes based neural topic models to short texts, by exploring the Archimedean copulas to guide the estimated topic distributions derived from linear projected samples of re-parameterized posterior distributions. Experimental results show the superiority of our method when compared with existing neural topic models in terms of perplexity, topic coherence, and classification accuracy.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128881265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ye Yuan, Delong Ma, Z. Wen, Yuliang Ma, Guoren Wang, Lei Chen
Graph queries have emerged as one of the fundamental techniques to support modern search services, such as PageRank web search, social networking search and knowledge graph search. As such graphs are maintained globally and very huge (e.g., billions of nodes), we need to efficiently process graph queries across multiple geographically distributed datacenters, running geo-distributed graph queries. Existing graph computing frameworks may not work well for geographically distributed datacenters, because they implement a Bulk Synchronous Parallel model that requires excessive inter-datacenter transfers, thereby introducing extremely large latency for query processing. In this paper, we propose GeoGraph --a universal framework to support efficient geo-distributed graph query processing based on clustering datacenters and meta-graph, while reducing the inter-datacenter communication. Our new framework can be applied to many types of graph algorithms without any modification. The framework is developed on the top of Apache Giraph. The experiments were conducted by applying four important graph queries, i.e., shortest path, graph keyword search, subgraph isomorphism and PageRank. The evaluation results show that our proposed framework can achieve up to 82% faster convergence, 42% lower WAN bandwidth usage, and 45% less total monetary cost for the four graph queries, with input graphs stored across ten geo-distributed datacenters.
{"title":"Efficient Graph Query Processing over Geo-Distributed Datacenters","authors":"Ye Yuan, Delong Ma, Z. Wen, Yuliang Ma, Guoren Wang, Lei Chen","doi":"10.1145/3397271.3401157","DOIUrl":"https://doi.org/10.1145/3397271.3401157","url":null,"abstract":"Graph queries have emerged as one of the fundamental techniques to support modern search services, such as PageRank web search, social networking search and knowledge graph search. As such graphs are maintained globally and very huge (e.g., billions of nodes), we need to efficiently process graph queries across multiple geographically distributed datacenters, running geo-distributed graph queries. Existing graph computing frameworks may not work well for geographically distributed datacenters, because they implement a Bulk Synchronous Parallel model that requires excessive inter-datacenter transfers, thereby introducing extremely large latency for query processing. In this paper, we propose GeoGraph --a universal framework to support efficient geo-distributed graph query processing based on clustering datacenters and meta-graph, while reducing the inter-datacenter communication. Our new framework can be applied to many types of graph algorithms without any modification. The framework is developed on the top of Apache Giraph. The experiments were conducted by applying four important graph queries, i.e., shortest path, graph keyword search, subgraph isomorphism and PageRank. The evaluation results show that our proposed framework can achieve up to 82% faster convergence, 42% lower WAN bandwidth usage, and 45% less total monetary cost for the four graph queries, with input graphs stored across ten geo-distributed datacenters.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131848017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Olivia Foulds, Alessandro Suglia, L. Azzopardi, Martin Halvey
Perceptual Speed (PS) is a cognitive ability that is known to affect multiple factors in Information Retrieval (IR) such as a user's search performance and subjective experience. However PS tests are difficult to administer which limits the design of user-adaptive systems that can automatically infer PS to appropriately accommodate low PS users. Consequently, this paper evaluated whether PS can be automatically classified from search behaviour using several machine learning models trained on features extracted from TREC Common Core search task logs. Our results are encouraging: given a user's interactions from one query, a Decision Tree was able to predict a user's PS as low or high with 86% accuracy. Additionally, we identified different behavioural components for specific PS tests, implying that each PS test measures different aspects of a person's cognitive ability. These findings motivate further work for how best to design search systems that can adapt to individual differences.
感知速度(Perceptual Speed, PS)是一种认知能力,它会影响信息检索(Information Retrieval, IR)中的多个因素,如用户的搜索性能和主观体验。然而,PS测试很难管理,这限制了用户自适应系统的设计,这些系统可以自动推断PS,以适当地适应低PS用户。因此,本文使用从TREC Common Core搜索任务日志中提取的特征训练的几个机器学习模型来评估PS是否可以从搜索行为中自动分类。我们的结果是令人鼓舞的:给定用户与一个查询的交互,决策树能够以86%的准确率预测用户的PS为低或高。此外,我们为特定的PS测试确定了不同的行为成分,这意味着每个PS测试测量的是一个人认知能力的不同方面。这些发现激发了人们进一步研究如何更好地设计能够适应个体差异的搜索系统。
{"title":"Predicting Perceptual Speed from Search Behaviour","authors":"Olivia Foulds, Alessandro Suglia, L. Azzopardi, Martin Halvey","doi":"10.1145/3397271.3401210","DOIUrl":"https://doi.org/10.1145/3397271.3401210","url":null,"abstract":"Perceptual Speed (PS) is a cognitive ability that is known to affect multiple factors in Information Retrieval (IR) such as a user's search performance and subjective experience. However PS tests are difficult to administer which limits the design of user-adaptive systems that can automatically infer PS to appropriately accommodate low PS users. Consequently, this paper evaluated whether PS can be automatically classified from search behaviour using several machine learning models trained on features extracted from TREC Common Core search task logs. Our results are encouraging: given a user's interactions from one query, a Decision Tree was able to predict a user's PS as low or high with 86% accuracy. Additionally, we identified different behavioural components for specific PS tests, implying that each PS test measures different aspects of a person's cognitive ability. These findings motivate further work for how best to design search systems that can adapt to individual differences.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127404190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim
We address the web table retrieval task, aiming to retrieve and rank web tables as whole answers to a given information need. To this end, we formally define web tables as multimodal objects. We then suggest a neural ranking model, termed MTR, which makes a novel use of Gated Multimodal Units (GMUs) to learn a joint-representation of the query and the different table modalities. We further enhance this model with a co-learning approach which utilizes automatically learned query-independent and query-dependent "helper'' labels. We evaluate the proposed solution using both ad hoc queries (WikiTables) and natural language questions (GNQtables). Overall, we demonstrate that our approach surpasses the performance of previously studied state-of-the-art baselines.
{"title":"Web Table Retrieval using Multimodal Deep Learning","authors":"Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Canim","doi":"10.1145/3397271.3401120","DOIUrl":"https://doi.org/10.1145/3397271.3401120","url":null,"abstract":"We address the web table retrieval task, aiming to retrieve and rank web tables as whole answers to a given information need. To this end, we formally define web tables as multimodal objects. We then suggest a neural ranking model, termed MTR, which makes a novel use of Gated Multimodal Units (GMUs) to learn a joint-representation of the query and the different table modalities. We further enhance this model with a co-learning approach which utilizes automatically learned query-independent and query-dependent \"helper'' labels. We evaluate the proposed solution using both ad hoc queries (WikiTables) and natural language questions (GNQtables). Overall, we demonstrate that our approach surpasses the performance of previously studied state-of-the-art baselines.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114865593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The users' historical interactions usually contain their interests and purchase habits based on which personalised recommendations can be made. However, such user interactions are often sparse, leading to the well-known cold-start problem when a user has no or very few interactions. In this paper, we propose a new recommendation model, named Heterogeneous Graph Neural Recommender (HGNR), to tackle the cold-start problem while ensuring effective recommendations for all users. Our HGNR model learns users and items' embeddings by using the Graph Convolutional Network based on a heterogeneous graph, which is constructed from user-item interactions, social links and semantic links predicted from the social network and textual reviews. Our extensive empirical experiments on three public datasets demonstrate that HGNR significantly outperforms competitive baselines in terms of the Normalised Discounted Cumulative Gain and Hit Ratio measures.
{"title":"A Heterogeneous Graph Neural Model for Cold-start Recommendation","authors":"Siwei Liu, I. Ounis, C. Macdonald, Zaiqiao Meng","doi":"10.1145/3397271.3401252","DOIUrl":"https://doi.org/10.1145/3397271.3401252","url":null,"abstract":"The users' historical interactions usually contain their interests and purchase habits based on which personalised recommendations can be made. However, such user interactions are often sparse, leading to the well-known cold-start problem when a user has no or very few interactions. In this paper, we propose a new recommendation model, named Heterogeneous Graph Neural Recommender (HGNR), to tackle the cold-start problem while ensuring effective recommendations for all users. Our HGNR model learns users and items' embeddings by using the Graph Convolutional Network based on a heterogeneous graph, which is constructed from user-item interactions, social links and semantic links predicted from the social network and textual reviews. Our extensive empirical experiments on three public datasets demonstrate that HGNR significantly outperforms competitive baselines in terms of the Normalised Discounted Cumulative Gain and Hit Ratio measures.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114636858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the digital era, information retrieval, text/knowledge mining, and NLP techniques are playing increasingly vital roles in legal domain. While the open datasets and innovative deep learning methodologies provide critical potentials, in the legal-domain, efforts need to be made to transfer the theoretical/algorithmic models into the real applications to assist users, lawyers, judges and the legal professions to solve the real problems. The objective of this workshop is to aggregate studies/applications of text mining/retrieval and NLP automation in the context of classical/novel legal tasks, which address algorithmic, data and social challenges of legal intelligence. Keynote and invited presentations from industry and academic will be able to fill the gap between ambition and execution in the legal domain.
{"title":"Legal Intelligence: Algorithmic, Data, and Social Challenges","authors":"Changlong Sun, Yating Zhang, Xiaozhong Liu, Fei Wu","doi":"10.1145/3397271.3401466","DOIUrl":"https://doi.org/10.1145/3397271.3401466","url":null,"abstract":"In the digital era, information retrieval, text/knowledge mining, and NLP techniques are playing increasingly vital roles in legal domain. While the open datasets and innovative deep learning methodologies provide critical potentials, in the legal-domain, efforts need to be made to transfer the theoretical/algorithmic models into the real applications to assist users, lawyers, judges and the legal professions to solve the real problems. The objective of this workshop is to aggregate studies/applications of text mining/retrieval and NLP automation in the context of classical/novel legal tasks, which address algorithmic, data and social challenges of legal intelligence. Keynote and invited presentations from industry and academic will be able to fill the gap between ambition and execution in the legal domain.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117016069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the process of visual perception, humans perceive not only the appearance of objects existing in a place but also their relationships (e.g. spatial layout). However, the dominant works on visual place recognition are always based on the assumption that two images depict the same place if they contain enough similar objects, while the relation information is neglected. In this paper, we propose a regional relation module which models the regional relationships and converts the convolutional feature maps to the relational feature maps. We further design a cascaded pooling method to get discriminative relation descriptors by preventing the influence of confusing relations and preserving as much useful information as possible. Extensive experiments on two place recognition benchmarks demonstrate that training with the proposed regional relation module improves the appearance descriptors and the relation descriptors are complementary to appearance descriptors. When these two kinds of descriptors are concatenated together, the resulting combined descriptors outperform the state-of-the-art methods.
{"title":"Regional Relation Modeling for Visual Place Recognition","authors":"Yingying Zhu, Biao Li, Jiong Wang, Zhou Zhao","doi":"10.1145/3397271.3401176","DOIUrl":"https://doi.org/10.1145/3397271.3401176","url":null,"abstract":"In the process of visual perception, humans perceive not only the appearance of objects existing in a place but also their relationships (e.g. spatial layout). However, the dominant works on visual place recognition are always based on the assumption that two images depict the same place if they contain enough similar objects, while the relation information is neglected. In this paper, we propose a regional relation module which models the regional relationships and converts the convolutional feature maps to the relational feature maps. We further design a cascaded pooling method to get discriminative relation descriptors by preventing the influence of confusing relations and preserving as much useful information as possible. Extensive experiments on two place recognition benchmarks demonstrate that training with the proposed regional relation module improves the appearance descriptors and the relation descriptors are complementary to appearance descriptors. When these two kinds of descriptors are concatenated together, the resulting combined descriptors outperform the state-of-the-art methods.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117157636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Event representative learning aims to embed news events into continuous space vectors for capturing syntactic and semantic information from text corpus, which is benefit to event-driven quantitative investments. However, the financial market reaction of events is also influenced by the lead-lag effect, which is driven by internal relationships. Therefore, in this paper, we present a knowledge graph-based event embedding framework for quantitative investments. In particular, we first extract structured events from raw texts, and construct the knowledge graph with the mentioned entities and relations simultaneously. Then, we leverage a joint model to merge the knowledge graph information into the objective function of an event embedding learning model. The learned representations are fed as inputs of downstream quantitative trading methods. Extensive experiments on real-world dataset demonstrate the effectiveness of the event embeddings learned from financial news and knowledge graphs. We also deploy the framework for quantitative algorithm trading. The accumulated portfolio return contributed by our method significantly outperforms other baselines.
{"title":"Knowledge Graph-based Event Embedding Framework for Financial Quantitative Investments","authors":"Dawei Cheng, Fangzhou Yang, Xiaoyang Wang, Ying Zhang, Liqing Zhang","doi":"10.1145/3397271.3401427","DOIUrl":"https://doi.org/10.1145/3397271.3401427","url":null,"abstract":"Event representative learning aims to embed news events into continuous space vectors for capturing syntactic and semantic information from text corpus, which is benefit to event-driven quantitative investments. However, the financial market reaction of events is also influenced by the lead-lag effect, which is driven by internal relationships. Therefore, in this paper, we present a knowledge graph-based event embedding framework for quantitative investments. In particular, we first extract structured events from raw texts, and construct the knowledge graph with the mentioned entities and relations simultaneously. Then, we leverage a joint model to merge the knowledge graph information into the objective function of an event embedding learning model. The learned representations are fed as inputs of downstream quantitative trading methods. Extensive experiments on real-world dataset demonstrate the effectiveness of the event embeddings learned from financial news and knowledge graphs. We also deploy the framework for quantitative algorithm trading. The accumulated portfolio return contributed by our method significantly outperforms other baselines.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116456515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rankings are at the core of countless modern applications and thus play a major role in various decision making scenarios. When such rankings are produced by data-informed, machine learning-based algorithms, the potentially harmful biases contained in the data and algorithms are likely to be reproduced and even exacerbated. This motivated recent research to investigate a methodology for fair ranking, as a way to correct the aforementioned biases. Current approaches to fair ranking consider that the protected groups, i.e., the partition of the population potentially impacted by the biases, are known. However, in a realistic scenario, this assumption might not hold as different biases may lead to different partitioning into protected groups. Only accounting for one such partition (i.e., grouping) would still lead to potential unfairness with respect to the other possible groupings. Therefore, in this paper, we study the problem of designing fair ranking algorithms without knowing in advance the groupings that will be used later to assess their fairness. The approach that we follow is to rely on a carefully chosen set of groupings when deriving the ranked lists, and we empirically investigate which selection strategies are the most effective. An efficient two-step greedy brute-force method is also proposed to embed our strategy. As benchmark for this study, we adopted the dataset and setting composing the TREC 2019 Fair Ranking track.
{"title":"Multi-grouping Robust Fair Ranking","authors":"Thibaut Thonet, J. Renders","doi":"10.1145/3397271.3401292","DOIUrl":"https://doi.org/10.1145/3397271.3401292","url":null,"abstract":"Rankings are at the core of countless modern applications and thus play a major role in various decision making scenarios. When such rankings are produced by data-informed, machine learning-based algorithms, the potentially harmful biases contained in the data and algorithms are likely to be reproduced and even exacerbated. This motivated recent research to investigate a methodology for fair ranking, as a way to correct the aforementioned biases. Current approaches to fair ranking consider that the protected groups, i.e., the partition of the population potentially impacted by the biases, are known. However, in a realistic scenario, this assumption might not hold as different biases may lead to different partitioning into protected groups. Only accounting for one such partition (i.e., grouping) would still lead to potential unfairness with respect to the other possible groupings. Therefore, in this paper, we study the problem of designing fair ranking algorithms without knowing in advance the groupings that will be used later to assess their fairness. The approach that we follow is to rely on a carefully chosen set of groupings when deriving the ranked lists, and we empirically investigate which selection strategies are the most effective. An efficient two-step greedy brute-force method is also proposed to embed our strategy. As benchmark for this study, we adopted the dataset and setting composing the TREC 2019 Fair Ranking track.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123663595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}