Artificial intelligence, machine learning, and algorithmic techniques in general provide two crucial abilities with the potential to improve decision-making in the allocation of scarce societal resources. They can flexibly and accurately model treatment response at the individual level, potentially allowing us to better match available resources to individuals, and they can reason simultaneously about the effects of matching sets of scarce resources to populations of individuals. In this work, we leverage these abilities to study algorithmic allocation of scarce societal resources in the context of homelessness. In communities throughout the United States, there is constant demand for an array of homeless services intended to address different levels of need. Allocations of housing services must match households to appropriate services that continuously fluctuate in availability, while inefficiencies in allocation can “waste” scarce resources: households remain in need and re-enter the homeless system, increasing the overall demand for homeless services. This complex allocation problem introduces novel technical and ethical challenges. Using administrative data from a regional homeless system, we formulate the problem of “optimal” allocation of resources given data on households with need for homeless services. The optimization problem aims to allocate available resources such that predicted probabilities of household re-entry are minimized. The key element of this work is its use of a counterfactual prediction approach that predicts household probabilities of re-entry into homeless services if assigned to each service. Through these counterfactual predictions, we find that this approach has the potential to improve the efficiency of the homeless system by reducing re-entry and, therefore, system-wide demand.
However, efficiency comes with trade-offs: a significant fraction of households are assigned to services that increase their probability of re-entry. To address this issue, as well as the fairness considerations inherent in any context where resources are insufficient to meet demand, we discuss the efficiency, equity, and fairness issues that arise in our work and consider potential implications for homeless policies.
“Fair and Efficient Allocation of Scarce Resources Based on Predicted Outcomes: Implications for Homeless Service Delivery,” by Amanda Kube, Sanmay Das, and P. Fowler. Journal of Artificial Intelligence Research, pp. 1219–1245, 2023-04-26. DOI: 10.1613/jair.1.12847.
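The optimization the abstract describes can be illustrated with a minimal sketch: given counterfactual predictions of re-entry probability for each household under each service, choose the assignment that minimizes total predicted re-entry. The numbers below are synthetic and the brute-force solver is ours for illustration; the paper's actual formulation, predictive model, and data differ.

```python
from itertools import permutations

# Toy counterfactual predictions: p[h][s] = predicted probability that
# household h re-enters the homeless system if assigned to service s.
# Synthetic numbers, one unit of each service assumed available.
p = [
    [0.30, 0.10, 0.50],  # household 0
    [0.20, 0.40, 0.10],  # household 1
    [0.60, 0.30, 0.20],  # household 2
]

def allocate(p):
    """Brute-force the one-to-one assignment of households to services
    that minimizes the total predicted probability of re-entry."""
    n = len(p)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):  # perm[h] = service for household h
        cost = sum(p[h][perm[h]] for h in range(n))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

assignment, total = allocate(p)
```

At realistic scale this assignment problem would be solved with a min-cost matching or integer program rather than enumeration; the sketch only shows the objective being minimized.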
With the rapid growth of location-based social networks, point-of-interest (POI) recommendation has attracted tremendous attention. Previous works on POI recommendation usually use matrix factorization (MF)-based methods, which achieve promising performance. However, existing MF-based methods suffer from two critical limitations: (1) Privacy issues: all users’ sensitive data are collected by a centralized server, where they may leak, either on the server side or during transmission. (2) Poor resource utilization and training efficiency: training on a centralized server with potentially huge low-rank matrices is computationally inefficient. In this paper, we propose a novel decentralized gradient-quantization based matrix factorization (DGMF) framework to address the above limitations in POI recommendation. In contrast to centralized MF methods, which store all sensitive data and low-rank matrices during model training, DGMF treats each user’s device (e.g., phone) as an independent learner and keeps the sensitive data on each user’s end. Furthermore, a privacy-preserving and communication-efficient mechanism based on gradient quantization is presented to train the proposed model, which addresses the privacy problem and reduces the communication cost in the decentralized setting. Theoretical guarantees and experimental studies on real-world datasets demonstrate the effectiveness of the proposed algorithm.
“Decentralized Gradient-Quantization Based Matrix Factorization for Fast Privacy-Preserving Point-of-Interest Recommendation,” by Xuebin Zhou, Zhibin Hu, Jin Huang, and Jing Chen. Journal of Artificial Intelligence Research, pp. 1019–1041, 2023-04-17. DOI: 10.1613/jair.1.14414.
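To make the communication-saving idea concrete, here is a sketch of stochastic gradient quantization in the QSGD style: each gradient entry is rounded stochastically to one of a few levels, so only a scale and small integers need to be transmitted between learners. This is a generic illustration of the technique, not DGMF's specific quantizer.

```python
import random

def quantize(grad, levels=4, rng=random.Random(0)):
    """Stochastically quantize a gradient vector to `levels` magnitude
    levels (QSGD-style). Rounding up vs. down is randomized so that the
    quantized vector is unbiased in expectation; only `scale` and an
    integer level per entry would need to be sent over the network."""
    scale = max(abs(g) for g in grad) or 1.0
    out = []
    for g in grad:
        r = abs(g) / scale * levels           # position in [0, levels]
        low = int(r)                          # floor level
        level = low + (1 if rng.random() < r - low else 0)
        out.append((1.0 if g >= 0 else -1.0) * scale * level / levels)
    return out

g = [0.8, -0.3, 0.05, 0.0]
q = quantize(g)
```

Each output entry is a signed multiple of scale/levels, which is what keeps the per-round communication cost low in the decentralized setting.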
This paper analyzes cooperative data-sharing between competitors vying to predict a consumer's tastes. We design optimal data-sharing schemes both for when they compete only with each other, and for when they additionally compete with an Amazon – a company with more, better data. We show that simple schemes – threshold rules that probabilistically induce either full data-sharing between competitors, or the full transfer of data from one competitor to another – are either optimal or approximately optimal, depending on properties of the information structure. We also provide conditions under which firms share more data when they face stronger outside competition, and describe situations in which this conclusion is reversed.
“Coopetition Against an Amazon,” by Ronen Gradwohl and Moshe Tennenholtz. Journal of Artificial Intelligence Research, 2023-04-17. DOI: 10.1613/jair.1.14074.
We consider an outsourcing problem where a software agent procures multiple services from providers with uncertain reliabilities to complete a computational task before a strict deadline. The service consumer’s goal is to design an outsourcing strategy (defining which services to procure and when) so as to maximize a specific objective function. This objective function can be different based on the consumer’s nature; a socially-focused consumer often aims to maximize social welfare, while a self-interested consumer often aims to maximize its own utility. However, in both cases, the objective function depends on the providers’ execution costs, which are privately held by the self-interested providers and hence may be misreported to influence the consumer’s decisions. For such settings, we develop a unified approach to design truthful procurement auctions that can be used by both socially-focused and, separately, self-interested consumers. This approach benefits from our proposed weighted threshold payment scheme which pays the provably minimum amount to make an auction with a monotone outsourcing strategy incentive compatible. This payment scheme can handle contingent outsourcing plans, where additional procurement happens gradually over time and only if the success probability of the already hired providers drops below a time-dependent threshold. Using a weighted threshold payment scheme, we design two procurement auctions that maximize, as well as two low-complexity heuristic-based auctions that approximately maximize, the consumer’s expected utility and expected social welfare, respectively. We demonstrate the effectiveness and strength of our proposed auctions through both game-theoretical and empirical analysis.
“Optimal and Efficient Auctions for the Gradual Procurement of Strategic Service Provider Agents,” by F. Farhadi, Maria Chli, and N. Jennings. Journal of Artificial Intelligence Research, pp. 959–1018, 2023-04-14. DOI: 10.1613/jair.1.14126.
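The contingent outsourcing plans described above can be sketched as follows: the consumer hires additional providers over time only when the success probability of the already-hired set drops below a time-dependent threshold. The provider probabilities, thresholds, and cheapest-first hiring order are our toy assumptions, not the paper's optimal policy.

```python
def success_prob(hired):
    """Probability that at least one hired provider completes the task,
    assuming providers succeed independently."""
    fail = 1.0
    for p in hired:
        fail *= 1.0 - p
    return 1.0 - fail

def contingent_hiring(candidates, thresholds):
    """At each step t, hire the next candidate (cheapest first, by
    assumption) only if the current success probability falls below
    the time-dependent threshold. Consumes the candidates list."""
    hired = []
    for thresh in thresholds:
        if success_prob(hired) < thresh and candidates:
            hired.append(candidates.pop(0))
    return hired

# Providers' success probabilities, sorted by ascending cost (assumed).
providers = [0.5, 0.6, 0.7]
plan = contingent_hiring(providers, thresholds=[0.4, 0.7, 0.75])
```

Here the first two providers are hired (pushing success probability to 0.8), and the third is never procured because the final threshold of 0.75 is already met: procurement happens gradually and only as needed.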
Influence maximization is the problem of finding a set of seed nodes in a network that maximizes the influence spread; it has become an important topic in social network analysis. Conventional influence maximization algorithms cause “unfair” influence spread among different groups in the population, which can lead to severe bias in public opinion dissemination and viral marketing. To address this issue, we formulate the fair influence maximization problem concerning the trade-off between influence maximization and group fairness. To solve the fair influence maximization problem efficiently in large-scale social networks, we propose a novel attribute-based reverse influence sampling (ABRIS) framework. The framework estimates influence within specific groups, with guarantees, through an attribute-based hypergraph, so that seed nodes can be selected strategically. Under the ABRIS framework, we design two node selection algorithms, ABRIS-G and ABRIS-T: ABRIS-G selects nodes in a greedy scheduling manner, while ABRIS-T adopts a two-phase node selection method. These algorithms run efficiently and achieve a good trade-off between influence maximization and group fairness. Extensive experiments on six real-world social networks show that our algorithms significantly outperform the state-of-the-art approaches. This article appears in the AI & Society track.
“Fair Influence Maximization in Large-scale Social Networks Based on Attribute-aware Reverse Influence Sampling,” by Mingkai Lin, Lintan Sun, Rui Yang, Xu-Sheng Liu, Yajuan Wang, Ding Li, Wenzhong Li, and Sanglu Lu. Journal of Artificial Intelligence Research, pp. 925–957, 2023-04-13. DOI: 10.1613/jair.1.14450.
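As an illustration of the reverse influence sampling (RIS) backbone that ABRIS builds on, here is a minimal sketch: sample reverse reachable (RR) sets under the independent cascade model, then greedily pick seeds covering the most RR sets. The toy graph, uniform edge probability, and function names are ours, and the attribute-aware fairness weighting that distinguishes ABRIS is omitted.

```python
import random

def reverse_reachable_set(in_edges, prob, target, rng):
    """Sample one reverse reachable set for `target`: BFS backwards over
    incoming edges, keeping each edge independently with probability `prob`."""
    rr, frontier = {target}, [target]
    while frontier:
        v = frontier.pop()
        for u in in_edges.get(v, []):     # u -> v in the original graph
            if u not in rr and rng.random() < prob:
                rr.add(u)
                frontier.append(u)
    return rr

def greedy_seeds(rr_sets, k):
    """Standard RIS greedy: repeatedly pick the node covering the most
    not-yet-covered RR sets."""
    covered, seeds = set(), []
    for _ in range(k):
        candidates = {u for i, rr in enumerate(rr_sets)
                      if i not in covered for u in rr}
        best, gain = None, -1
        for u in candidates:
            g = sum(1 for i, rr in enumerate(rr_sets)
                    if i not in covered and u in rr)
            if g > gain:
                best, gain = u, g
        seeds.append(best)
        covered |= {i for i, rr in enumerate(rr_sets) if best in rr}
    return seeds

# in_edges[v] lists the in-neighbors of v (toy graph).
in_edges = {"b": ["a"], "c": ["a", "b"], "d": ["c"]}
rng = random.Random(7)
rr_sets = [reverse_reachable_set(in_edges, 0.5, t, rng)
           for t in "abcd" for _ in range(5)]
seeds = greedy_seeds(rr_sets, 2)
```

ABRIS additionally tracks which group each RR set's root belongs to, which is what lets it trade coverage against group fairness during selection.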
Ricardo Guimarães, A. Ozaki, Cosimo Persia, B. Sertkaya
In Formal Concept Analysis, a base for a finite structure is a set of implications that characterizes all valid implications of the structure. This notion can be adapted to the context of Description Logic, where the base consists of a set of concept inclusions instead of implications. In this setting, concept expressions can be arbitrarily large. Thus, it is not clear whether a finite base exists and, if so, how large concept expressions may need to be. We first revisit results in the literature for mining ℰℒ⊥ bases from finite interpretations. These approaches mainly focus on finding a finite base or on fixing the role depth, potentially losing valid concept inclusions of higher role depth. We then present a new strategy for mining ℰℒ⊥ bases which is adaptable in the sense that it can bound the role depth of concepts depending on the local structure of the interpretation. Our strategy guarantees to capture all ℰℒ⊥ concept inclusions holding in the interpretation, not only those up to a fixed role depth. We also consider the case of confident ℰℒ⊥ bases, which requires that some proportion of the domain of the interpretation satisfies the base, instead of the whole domain. This case is useful to cope with noisy data.
“Mining ℰℒ⊥ Bases with Adaptable Role Depth,” by Ricardo Guimarães, A. Ozaki, Cosimo Persia, and B. Sertkaya. Journal of Artificial Intelligence Research, pp. 883–924, 2023-04-13. DOI: 10.1613/jair.1.13777.
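The notion of a valid implication from Formal Concept Analysis can be illustrated with a propositional toy example: an implication A → B holds in a formal context iff every object possessing all attributes in A also possesses all attributes in B. (This is only the FCA analogue; mining ℰℒ⊥ concept inclusions over interpretations, as in the paper, is substantially more involved.)

```python
# A toy formal context: each object maps to its set of attributes.
context = {
    "sparrow": {"bird", "flies"},
    "penguin": {"bird", "swims"},
    "trout":   {"swims"},
}

def holds(premise, conclusion, context):
    """An implication A -> B is valid in the context iff every object
    having all attributes in A also has all attributes in B."""
    return all(conclusion <= attrs
               for attrs in context.values() if premise <= attrs)

# {flies} -> {bird} holds; {bird} -> {flies} fails (the penguin).
```

A base would be a small set of such implications from which all other valid ones follow; the confident variant in the paper relaxes "every object" to "a given proportion of objects".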
The recent rise of machine learning (ML) has been leveraged by practitioners and researchers to provide new solutions to an ever-growing number of business problems. As with other ML applications, these solutions rely on model selection, which is typically achieved by evaluating certain metrics on models separately and selecting the model whose evaluations (i.e., accuracy-related loss and/or certain interpretability measures) are optimal. However, empirical evidence suggests that, in practice, multiple models often attain competitive results. Therefore, while models’ overall performance may be similar, they may operate quite differently. This results in an implicit tradeoff in models’ performance throughout the feature space, and resolving it requires new model selection tools. This paper explores methods for comparing predictive models in an interpretable manner to uncover the tradeoff and help resolve it. To this end, we propose various methods that synthesize ideas from supervised learning, unsupervised learning, dimensionality reduction, and visualization to demonstrate how they can be used to inform model developers about the model selection process.
Using various datasets and a simple Python interface, we demonstrate how practitioners and researchers could benefit from applying these approaches to better understand the broader impact of their model selection choices.
“Visualizing the Implicit Model Selection Tradeoff,” by Zezhen He and Yaron Shaposhnik. Journal of Artificial Intelligence Research, 2023-03-31. DOI: 10.1613/jair.1.13764.
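The implicit tradeoff the abstract describes can be shown with a minimal synthetic example: two classifiers with identical overall accuracy that err in different regions of the feature space, so that aggregate metrics alone cannot distinguish them. The data and models below are our toy constructions, not the paper's methods.

```python
# Synthetic labeled data: label is 1 iff x >= 5.
data = [(x, int(x >= 5)) for x in range(10)]

def model_a(x):                  # errs in the low region (x = 0, 1)
    return int(x >= 5 or x <= 1)

def model_b(x):                  # errs in the high region (x = 8, 9)
    return int(5 <= x <= 7)

def accuracy(model, pts):
    return sum(model(x) == y for x, y in pts) / len(pts)

low  = [(x, y) for x, y in data if x < 5]
high = [(x, y) for x, y in data if x >= 5]
```

Both models score 0.8 overall, yet model_a is perfect on the high region and weak on the low one, and vice versa; region-level comparison of this kind is what the paper's interpretable visualizations surface.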
We study fair allocation of indivisible goods among additive agents with feasibility constraints. In these settings, every agent is restricted to get a bundle among a specified set of feasible bundles. Such scenarios have been of great interest to the AI community due to their applicability to real-world problems. Following some impossibility results, we restrict attention to matroid feasibility constraints that capture natural scenarios, such as the allocation of shifts to medical doctors and the allocation of conference papers to referees. We focus on the common fairness notion of envy-freeness up to one good (EF1). Previous algorithms for finding EF1 allocations are either restricted to agents with identical feasibility constraints or allow free disposal of items. An open problem is the existence of EF1 complete allocations among agents who differ both in their valuations and in their feasibility constraints. In this work, we make progress on this problem by providing positive and negative results for several matroid and valuation types. Among other results, we devise polynomial-time algorithms for finding EF1 allocations in the following settings: (i) n agents with heterogeneous (non-identical) binary valuations and partition matroids with heterogeneous capacities; (ii) two agents with heterogeneous additive valuations and partition matroids with heterogeneous capacities; and (iii) three agents with heterogeneous binary valuations and identical base-orderable matroid constraints.
“On Fair Division under Heterogeneous Matroid Constraints,” by Amitay Dror, Michal Feldman, and Erel Segal-Halevi. Journal of Artificial Intelligence Research, 2023-03-11. DOI: 10.1613/jair.1.13779.
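The EF1 notion at the heart of this work can be illustrated with a small sketch: a round-robin allocation (which yields EF1 for additive valuations in the unconstrained case) together with an explicit EF1 check. The matroid capacity constraints that make the paper's problem hard are omitted here; the instance is synthetic.

```python
def round_robin(valuations, items):
    """Agents take turns picking their most-valued remaining item.
    For additive valuations without constraints this yields EF1."""
    remaining = set(items)
    bundles = [[] for _ in valuations]
    turn = 0
    while remaining:
        agent = turn % len(valuations)
        pick = max(remaining, key=lambda it: valuations[agent][it])
        bundles[agent].append(pick)
        remaining.remove(pick)
        turn += 1
    return bundles

def is_ef1(valuations, bundles):
    """Envy-free up to one good: each agent i values its own bundle at
    least as much as any other bundle minus that bundle's best item."""
    for i, vi in enumerate(valuations):
        mine = sum(vi[it] for it in bundles[i])
        for j, other in enumerate(bundles):
            if i == j or not other:
                continue
            theirs = sum(vi[it] for it in other)
            if mine < theirs - max(vi[it] for it in other):
                return False
    return True

valuations = [{"a": 3, "b": 1, "c": 2, "d": 1},
              {"a": 1, "b": 4, "c": 1, "d": 2}]
bundles = round_robin(valuations, "abcd")
```

Under the paper's heterogeneous matroid constraints, a pick may be infeasible for the agent whose turn it is, which is exactly why new algorithms are needed there.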
N. Ponomareva, Hussein Hazimeh, Alexey Kurakin, Zheng Xu, Carson E. Denison, H. B. McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta
Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and harder to reason about. At the same time, the community has started to realize the importance of protecting the privacy of the training data that goes into these models. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real world complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners, particularly with respect to the challenging task of hyperparameter tuning. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are “safe” to use with DP. In this survey paper, we attempt to create a self-contained guide that gives an in-depth overview of the field of DP ML. We aim to assemble information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We also include theory-focused sections that highlight important topics such as privacy accounting and convergence. For a practitioner, this survey provides a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. 
For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, so we propose a set of specific best practices for stating guarantees. With sufficient computation and a sufficiently large training set or supplemental non-private data, both good accuracy (that is, almost as good as that of a non-private model) and good privacy are often achievable. And even when computation and dataset size are limited, there are advantages to training with even a weak (but still finite) formal DP guarantee. Hence, we hope this work will facilitate more widespread deployments of DP ML models.
How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy. Journal of Artificial Intelligence Research, pp. 1113-1201, published 2023-03-01. DOI: 10.1613/jair.1.14649
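The survey's step-by-step guide centers on DP-SGD-style training: bound each example's influence by clipping its gradient, then add calibrated Gaussian noise before the update. Below is a minimal NumPy sketch of one such step for linear regression with squared loss; the function name and hyperparameters (`clip_norm`, `noise_multiplier`, `lr`) are illustrative stand-ins, not an API from the survey.

```python
import numpy as np

def dp_sgd_step(weights, X, y, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD-style step for linear regression (squared loss).

    Each per-example gradient is clipped to L2 norm `clip_norm`, the
    clipped gradients are summed, and Gaussian noise with standard
    deviation `noise_multiplier * clip_norm` is added before averaging.
    """
    n = len(X)
    grad_sum = np.zeros_like(weights)
    for xi, yi in zip(X, y):
        # Per-example gradient of 0.5 * (xi . w - yi)^2 with respect to w.
        g = (xi @ weights - yi) * xi
        # Clip so no single example contributes more than clip_norm.
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)
        grad_sum += g
    # Noise calibrated to the clipping bound masks any one example's influence.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    return weights - lr * (grad_sum + noise) / n
```

With `noise_multiplier = 0` this reduces to plain clipped SGD; raising it strengthens the privacy guarantee obtainable through accounting, at a cost in utility — the privacy-utility-computation trade-off the survey discusses.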
H. Du, N. Alechina, Amin Farjudian, B. Logan, Can Zhou, A. Cohn
We propose a logic of east and west (LEW) for points in 1D Euclidean space. It formalises primitive direction relations: east (E), west (W), and indeterminate east/west (Iew). It has a parameter τ, a natural number greater than 1, which is referred to as the level of indeterminacy in directions. For every such τ, we provide a sound and complete axiomatisation of LEW, and prove that its satisfiability problem is NP-complete. In addition, we show that the finite axiomatisability of LEW depends on τ: if τ = 2 or τ = 3, then there exists a finite sound and complete axiomatisation; if τ > 3, then the logic is not finitely axiomatisable. LEW can be easily extended to higher-dimensional Euclidean spaces. Extending LEW to 2D Euclidean space makes it suitable for reasoning about imperfectly aligned representations of the same spatial objects in different datasets, for example, in crowd-sourced digital maps.
A Logic of East and West. Journal of Artificial Intelligence Research, pp. 527-565, published 2023-02-28. DOI: 10.1613/jair.1.14113
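The paper gives the formal τ-based semantics of E, W, and Iew; as a rough toy illustration only, one can picture the indeterminate relation as a tolerance band around a reference point. The half-width `eps` below is an assumed stand-in for the paper's actual τ-parameterised definition.

```python
def lew_relation(x, ref, eps):
    """Toy 1D classifier into east ('E'), west ('W'), or indeterminate ('Iew').

    Points more than `eps` to the right of `ref` count as east, points
    more than `eps` to the left count as west, and anything inside the
    band is indeterminate. The three outcomes are jointly exhaustive and
    pairwise disjoint, mirroring LEW's primitive relations.
    """
    if x > ref + eps:
        return "E"
    if x < ref - eps:
        return "W"
    return "Iew"
```

The band models the imperfect alignment the abstract mentions: two datasets placing "the same" object at slightly different coordinates should fall into Iew rather than being forced into a definite E or W judgement.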