Yufan Wang, Zijing Wang, Kai Ming Ting, Yuanyi Shang
This paper aims to solve two enduring challenges in existing trajectory similarity measures: computational inefficiency and the absence of the ‘uniqueness’ property that a distance function should guarantee: dist(X, Y) = 0 if and only if X = Y, where X and Y are two trajectories. In this work, we present a novel approach that uses a distributional kernel for trajectory representation and similarity measurement, based on the kernel mean embedding framework. This is the first time a distributional kernel has been used for trajectory representation and similarity measurement. Our method does not rely on the point-to-point distances used in most existing trajectory distances. Unlike prevalent learning and deep learning approaches, our method requires no learning. We show the generality of this new approach in anomalous trajectory and sub-trajectory detection. We find that the distributional kernel has (i) a data-dependent property and the ‘uniqueness’ property, which are the key factors behind its superior task-specific performance, and (ii) a runtime orders of magnitude faster than existing distance measures.
{"title":"A Principled Distributional Approach to Trajectory Similarity Measurement and its Application to Anomaly Detection","authors":"Yufan Wang, Zijing Wang, Kai Ming Ting, Yuanyi Shang","doi":"10.1613/jair.1.15849","DOIUrl":"https://doi.org/10.1613/jair.1.15849","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140247305","RegionNum":3,"RegionCategory":"Computer Science"}
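The kernel mean embedding idea named in the abstract above can be illustrated with a minimal sketch. It treats each trajectory as a set of points, compares two trajectories by averaging a kernel over all point pairs, and derives a squared distance from those averages. The Gaussian kernel, the `bandwidth` parameter, and all function names here are assumptions chosen for illustration; the abstract does not say which distributional kernel the authors use, and this generic version does evaluate point-to-point kernel values, so it does not reproduce their efficiency claims.

```python
import math


def gaussian_kernel(p, q, bandwidth=1.0):
    """Gaussian kernel between two points given as coordinate tuples (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_embedding_similarity(X, Y, bandwidth=1.0):
    """Inner product of the empirical kernel mean embeddings of point sets X and Y."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in X for y in Y)
    return total / (len(X) * len(Y))


def squared_embedding_distance(X, Y, bandwidth=1.0):
    """Squared distance between the two mean embeddings (squared MMD)."""
    return (
        mean_embedding_similarity(X, X, bandwidth)
        + mean_embedding_similarity(Y, Y, bandwidth)
        - 2.0 * mean_embedding_similarity(X, Y, bandwidth)
    )


# Two toy trajectories as lists of 2-D points (hypothetical data).
traj_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
traj_b = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]

same = squared_embedding_distance(traj_a, traj_a)
different = squared_embedding_distance(traj_a, traj_b)
```

With a characteristic kernel such as the Gaussian, this distance is zero only when the two point sets induce the same empirical distribution, which is the sense of the ‘uniqueness’ property in this sketch. Point order along the trajectory is ignored here.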
This paper examines the income inequality among rideshare drivers resulting from discriminatory cancellations by riders, considering the impact of demographic factors such as gender, age, and race. We investigate the tradeoff between income inequality, referred to as the fairness objective, and system efficiency, known as the profit objective. To address this issue, we propose an online bipartite-matching model that captures the sequential arrival of riders according to a known distribution. The model incorporates the notion of acceptance rates between driver-rider types, which are defined based on demographic characteristics. Specifically, we analyze the probabilities of riders accepting or canceling their assigned drivers, reflecting the level of acceptance between different rider and driver types. We construct a bi-objective linear program as a valid benchmark and propose two LP-based parameterized online algorithms. Rigorous analysis of online competitive ratios is conducted to illustrate the flexibility and efficiency of our algorithms in achieving a balance between fairness and profit. Furthermore, we present experimental results based on real-world and synthetic datasets, validating the theoretical predictions put forth in our study.
{"title":"Exploring the Tradeoff Between System Profit and Income Equality Among Ride-hailing Drivers","authors":"Evan Yifan Xu, Pan Xu","doi":"10.1613/jair.1.15170","DOIUrl":"https://doi.org/10.1613/jair.1.15170","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139841314","RegionNum":3,"RegionCategory":"Computer Science"}
In recent years there has been an increasing interest in studying proof systems stronger than Resolution, with the aim of building more efficient SAT solvers based on them. In defining these proof systems, we try to find a balance between the power of the proof system (the size of the proofs required to refute a formula) and the difficulty of finding the proofs. In this paper we consider the proof systems circular Resolution, Sherali-Adams, Nullstellensatz and Weighted Resolution, and we study their relative power from a theoretical perspective. We prove that circular Resolution, Sherali-Adams and Weighted Resolution are polynomially equivalent proof systems. We also prove that Nullstellensatz is polynomially equivalent to a restricted version of Weighted Resolution. The equivalences also carry over to versions of the systems where the coefficients/weights are expressed in unary. The practical interest in these systems comes from the fact that they admit efficient algorithms to find proofs when these have small width/degree.
{"title":"Weighted, Circular and Semi-Algebraic Proofs","authors":"Ilario Bonacina, Maria Luisa Bonet, Jordi Levy","doi":"10.1613/jair.1.15075","DOIUrl":"https://doi.org/10.1613/jair.1.15075","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-02-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139785807","RegionNum":3,"RegionCategory":"Computer Science"}
M. Belaid, Nassim Belmecheri, Arnaud Gotlieb, Nadjib Lazaar, Helge Spieker
Many planning, scheduling or multi-dimensional packing problems involve the design of subtle logical combinations of temporal or spatial constraints. Recently, we introduced GEQCA-I, which stands for Generic Qualitative Constraint Acquisition, as a new active constraint acquisition method for learning qualitative constraints using qualitative queries. In this paper, we revise and extend GEQCA-I to GEQCA-II with a new type of query, universal query, for qualitative constraint acquisition, with a deeper query-driven acquisition algorithm. Our extended experimental evaluation shows the efficiency and usefulness of the concept of universal query in learning randomly-generated qualitative networks, including both temporal networks based on Allen’s algebra and spatial networks based on region connection calculus. We also show the effectiveness of GEQCA-II in learning the qualitative part of real scheduling problems.
{"title":"Query-driven Qualitative Constraint Acquisition","authors":"M. Belaid, Nassim Belmecheri, Arnaud Gotlieb, Nadjib Lazaar, Helge Spieker","doi":"10.1613/jair.1.14752","DOIUrl":"https://doi.org/10.1613/jair.1.14752","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139594129","RegionNum":3,"RegionCategory":"Computer Science"}
In recent years, several machine learning models have been proposed. They are trained with a language modelling objective on large-scale text-only data. With such pretraining, they can achieve impressive results on many Natural Language Understanding and Generation tasks. However, many facets of meaning cannot be learned by “listening to the radio” only. In the literature, many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality. In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field. We rely on Wittgenstein’s idea of ‘language games’ to categorise such tasks into 3 different families: 1) discriminative games, 2) generative games, and 3) interactive games. Our analysis of the literature provides evidence that future work should be focusing on interactive games where communication in Natural Language is important to resolve ambiguities about object referents and action plans and that physical embodiment is essential to understand the semantics of situations and events. Overall, these represent key requirements for developing grounded meanings in neural models.
{"title":"Visually Grounded Language Learning: A Review of Language Games, Datasets, Tasks, and Models","authors":"Alessandro Suglia, Ioannis Konstas, Oliver Lemon","doi":"10.1613/jair.1.15185","DOIUrl":"https://doi.org/10.1613/jair.1.15185","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139595218","RegionNum":3,"RegionCategory":"Computer Science"}
Charlie Street, Bruno Lacerda, Manuel Mühlig, N. Hawes
For many multi-robot problems, tasks are announced during execution, where task announcement times and locations are uncertain. To synthesise multi-robot behaviour that is robust to early announcements and unexpected delays, multi-robot task allocation methods must explicitly model the stochastic processes that govern task announcement. In this paper, we model task announcement using continuous-time Markov chains which predict when and where tasks will be announced. We then present a task allocation framework which uses the continuous-time Markov chains to allocate tasks proactively, such that robots are near or at the task location upon its announcement. Our method seeks to minimise the expected total waiting duration for each task, i.e. the duration between task announcement and a robot beginning to service the task. Our framework can be applied to any multi-robot task allocation problem where robots complete spatiotemporal tasks which are announced stochastically. We demonstrate the efficacy of our approach in simulation, where we outperform baselines which do not allocate tasks proactively, or do not fully exploit our task announcement models.
{"title":"Right Place, Right Time: Proactive Multi-Robot Task Allocation Under Spatiotemporal Uncertainty","authors":"Charlie Street, Bruno Lacerda, Manuel Mühlig, N. Hawes","doi":"10.1613/jair.1.15057","DOIUrl":"https://doi.org/10.1613/jair.1.15057","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139533892","RegionNum":3,"RegionCategory":"Computer Science"}
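As a rough illustration of how a continuous-time Markov chain can predict when and where a task is announced, the sketch below assumes the simplest possible case: each location announces its task after an independent exponentially distributed delay. The rates, location names, and function names are hypothetical, and the model in the paper may be considerably richer than this.

```python
import math


def prob_announced_by(rate, t):
    """Probability that a task with exponential announcement rate `rate` is announced by time t."""
    return 1.0 - math.exp(-rate * t)


def first_announcement_distribution(rates):
    """Probability that each location is the first to announce, for independent exponential delays."""
    total = sum(rates.values())
    return {location: rate / total for location, rate in rates.items()}


def expected_time_to_first_announcement(rates):
    """Expected time until any location announces: the minimum of exponentials has the summed rate."""
    return 1.0 / sum(rates.values())


# Hypothetical announcement rates (tasks per minute) for two locations.
rates = {"dock": 1.0, "lab": 3.0}

where = first_announcement_distribution(rates)
when = expected_time_to_first_announcement(rates)
```

Under these assumptions a proactive allocator would favour sending a robot towards `lab`, which is three times as likely as `dock` to announce first.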
Oskar van der Wal, Dominik Bachmann, Alina Leidinger, Leendert van Maanen, Willem Zuidema, Katrin Schulz
As Large Language Models and Natural Language Processing (NLP) technology rapidly develop and spread into daily life, it becomes crucial to anticipate how their use could harm people. One problem that has received a lot of attention in recent years is that this technology has displayed harmful biases, from generating derogatory stereotypes to producing disparate outcomes for different social groups. Although a lot of effort has been invested in assessing and mitigating these biases, our methods of measuring the biases of NLP models have serious problems and it is often unclear what they actually measure. In this paper, we provide an interdisciplinary approach to discussing the issue of NLP model bias by adopting the lens of psychometrics — a field specialized in the measurement of concepts like bias that are not directly observable. In particular, we will explore two central notions from psychometrics, the construct validity and the reliability of measurement tools, and discuss how they can be applied in the context of measuring model bias. Our goal is to provide NLP practitioners with methodological tools for designing better bias measures, and to inspire them more generally to explore tools from psychometrics when working on bias measurement tools. This article appears in the AI & Society track.
{"title":"Undesirable Biases in NLP: Addressing Challenges of Measurement","authors":"Oskar van der Wal, Dominik Bachmann, Alina Leidinger, Leendert van Maanen, Willem Zuidema, Katrin Schulz","doi":"10.1613/jair.1.15195","DOIUrl":"https://doi.org/10.1613/jair.1.15195","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139535112","RegionNum":3,"RegionCategory":"Computer Science"}
Wolfgang Dvořák, Matthias König, Markus Ulbricht, S. Woltran
Argumentation frameworks (AFs) are a key formalism in AI research. Their semantics have been investigated in terms of principles, which define characteristic properties in order to deliver guidance for analyzing established and developing new semantics. Because of the simple structure of AFs, many desired properties hold almost trivially, at the same time hiding interesting concepts behind syntactic notions. We extend the principle-based approach to argumentation frameworks with collective attacks (SETAFs) and provide a comprehensive overview of common principles for their semantics. Our analysis shows that investigating principles based on decomposing the given SETAF (e.g. directionality or SCC-recursiveness) poses additional challenges in comparison to usual AFs. We introduce the notion of the reduct as well as the modularization principle for SETAFs which will prove beneficial for this kind of investigation. We then demonstrate how our findings can be utilized for incremental computation of extensions and show how we can use graph properties of the frameworks to speed up these algorithms.
{"title":"Principles and their Computational Consequences for Argumentation Frameworks with Collective Attacks","authors":"Wolfgang Dvořák, Matthias König, Markus Ulbricht, S. Woltran","doi":"10.1613/jair.1.14879","DOIUrl":"https://doi.org/10.1613/jair.1.14879","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research"},"PeriodicalIF":5.0,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139534888","RegionNum":3,"RegionCategory":"Computer Science"}
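To make the notion of a collective attack concrete, here is a small sketch of a SETAF and a brute-force enumeration of its stable extensions, using the standard definitions: a set S is conflict-free if no attack (T, a) has T contained in S and a in S, and S is stable if it is conflict-free and attacks every argument outside it. The example framework and all names are hypothetical. The enumeration is exponential and is not the incremental computation the abstract refers to.

```python
from itertools import combinations


def is_conflict_free(S, attacks):
    """S is conflict-free if no collective attack (T, a) has T within S and target a in S."""
    return not any(T <= S and a in S for T, a in attacks)


def attacked_by(S, attacks):
    """Arguments attacked by S: targets of attacks whose whole attacking set lies in S."""
    return {a for T, a in attacks if T <= S}


def stable_extensions(arguments, attacks):
    """Enumerate all stable extensions by checking every subset of the arguments."""
    args = sorted(arguments)
    result = []
    for size in range(len(args) + 1):
        for combo in combinations(args, size):
            S = frozenset(combo)
            outside = set(args) - S
            if is_conflict_free(S, attacks) and attacked_by(S, attacks) >= outside:
                result.append(S)
    return result


# Hypothetical SETAF: a and b jointly attack c, and c alone attacks a.
arguments = {"a", "b", "c"}
attacks = {(frozenset({"a", "b"}), "c"), (frozenset({"c"}), "a")}

extensions = stable_extensions(arguments, attacks)
```

In this example neither `a` nor `b` attacks `c` on its own, which is what distinguishes a collective attack from two ordinary attacks in a usual AF.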