María Luque-Rodriguez, José Molina-Baena, Alfonso Jiménez-Vílchez, A. Arauzo-Azofra
Selecting the best features in a dataset improves the accuracy and efficiency of classifiers in a learning process. Datasets generally have more features than necessary, some of them being irrelevant or redundant with others. For this reason, numerous feature selection methods have been developed, applying different evaluation functions and measures. This paper proposes the systematic application of individual feature evaluation methods to initialize search-based feature subset selection methods. An exhaustive review of the initialization methods used by genetic algorithms from 2014 to 2020 was carried out, followed by an in-depth empirical study evaluating the proposal on different search-based feature selection methods (sequential forward and backward selection, Las Vegas filter and wrapper, simulated annealing, and genetic algorithms). Since computation time is reduced and classification accuracy with the selected features is improved, the initialization of feature selection proposed in this work is shown to be worth considering when designing any feature selection algorithm.
{"title":"Initialization of Feature Selection Search for Classification","authors":"María Luque-Rodriguez, José Molina-Baena, Alfonso Jiménez-Vílchez, A. Arauzo-Azofra","doi":"10.1613/jair.1.14015","DOIUrl":"https://doi.org/10.1613/jair.1.14015","url":null,"abstract":"Selecting the best features in a dataset improves accuracy and efficiency of classifiers in a learning process. Datasets generally have more features than necessary, some of them being irrelevant or redundant to others. For this reason, numerous feature selection methods have been developed, in which different evaluation functions and measures are applied. This paper proposes the systematic application of individual feature evaluation methods to initialize search-based feature subset selection methods. An exhaustive review of the starting methods used by genetic algorithms from 2014 to 2020 has been carried out. Subsequently, an in-depth empirical study has been carried out evaluating the proposal for different search-based feature selection methods (Sequential forward and backward selection, Las Vegas filter and wrapper, Simulated Annealing and Genetic Algorithms). Since the computation time is reduced and the classification accuracy with the selected features is improved, the initialization of feature selection proposed in this work is proved to be worth considering while designing any feature selection algorithms. ","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"99 1","pages":"953-983"},"PeriodicalIF":5.0,"publicationDate":"2022-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76603726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning models are being adopted in fields and disciplines that require a high level of accountability and transparency, for example, the healthcare sector. With the General Data Protection Regulation (GDPR), plausibility and verifiability of the predictions made by machine learning models have become essential. A widely used category of explanation techniques attempts to explain a model's predictions by quantifying an importance score for each input feature. However, summarizing such scores into human-interpretable explanations is challenging. Another category of explanation techniques focuses on learning a domain representation in terms of high-level, human-understandable concepts and then using it to explain predictions. These explanations are hampered by how the concepts are constructed, which is not intrinsically interpretable. To this end, we propose Concept-based Local Explanations with Feedback (CLEF), a novel local model-agnostic explanation framework for learning a set of high-level, transparent concept definitions in high-dimensional tabular data that relies on clinician-labeled concepts rather than raw features. CLEF maps the raw input features to high-level intuitive concepts and then decomposes the evidence for the prediction of the instance being explained into those concepts. In addition, the proposed framework generates counterfactual explanations, suggesting the minimum changes in the instance's concept-based explanation that would lead to a different prediction. We demonstrate the framework with simulated user feedback on predicting the risk of all-cause mortality. Such direct feedback is more effective than techniques that rely on hand-labelled or automatically extracted concepts in learning concepts that align with ground-truth concept definitions.
{"title":"Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality","authors":"Radwa El Shawi, M. Al-Mallah","doi":"10.1613/jair.1.14019","DOIUrl":"https://doi.org/10.1613/jair.1.14019","url":null,"abstract":"Machine learning models are incorporated in different fields and disciplines in which some of them require a high level of accountability and transparency, for example, the healthcare sector. With the General Data Protection Regulation (GDPR), the importance for plausibility and verifiability of the predictions made by machine learning models has become essential. A widely used category of explanation techniques attempts to explain models’ predictions by quantifying the importance score of each input feature. However, summarizing such scores to provide human-interpretable explanations is challenging. Another category of explanation techniques focuses on learning a domain representation in terms of high-level human-understandable concepts and then utilizing them to explain predictions. These explanations are hampered by how concepts are constructed, which is not intrinsically interpretable. To this end, we propose Concept-based Local Explanations with Feedback (CLEF), a novel local model agnostic explanation framework for learning a set of high-level transparent concept definitions in high-dimensional tabular data that uses clinician-labeled concepts rather than raw features. CLEF maps the raw input features to high-level intuitive concepts and then decompose the evidence of prediction of the instance being explained into concepts. In addition, the proposed framework generates counterfactual explanations, suggesting the minimum changes in the instance’s concept based explanation that will lead to a different prediction. We demonstrate with simulated user feedback on predicting the risk of mortality. Such direct feedback is more effective than other techniques, that rely on hand-labelled or automatically extracted concepts, in learning concepts that align with ground truth concept definitions.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"22 1","pages":"833-855"},"PeriodicalIF":5.0,"publicationDate":"2022-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76754936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shawn Skyler, Dor Atzmon, Tamir Yaffe, Ariel Felner
This paper solves the Watchman Route Problem (WRP) on a general discrete graph with heuristic search. Given a graph, a line-of-sight (LOS) function, and a start vertex, the task is to find (offline) a shortest path through the graph such that every vertex in the graph is seen by at least one vertex on the path. WRP is reminiscent of, but different from, graph covering and mapping problems, which are solved online on an unknown graph. We formalize WRP as a heuristic search problem and solve it optimally with an A*-based algorithm. We develop a series of admissible heuristics of increasing difficulty and accuracy. Our heuristics abstract the underlying graph into a disjoint line-of-sight graph (GDLS), which is based on disjoint clusters of vertices such that vertices within the same cluster have LOS to the same specific vertex. We use solutions to the Minimum Spanning Tree (MST) and the Traveling Salesman Problem (TSP) of GDLS as admissible heuristics for WRP. We investigate these heuristics theoretically and empirically. Then, we show how the optimal methods can be modified (by intelligently pruning away large sub-trees) to obtain various suboptimal solvers, with and without bound guarantees. These suboptimal solvers are much faster and expand fewer nodes than the optimal solver, with only a minor reduction in solution quality.
{"title":"Solving the Watchman Route Problem with Heuristic Search","authors":"Shawn Skyler, Dor Atzmon, Tamir Yaffe, Ariel Felner","doi":"10.1613/jair.1.13685","DOIUrl":"https://doi.org/10.1613/jair.1.13685","url":null,"abstract":"This paper solves the Watchman Route Problem (WRP) on a general discrete graph with Heuristic Search. Given a graph, a line-of-sight (LOS) function, and a start vertex, the task is to (offline) find a (shortest) path through the graph such that all vertices in the graph will be visually seen by at least one vertex on the path. WRP is reminiscent but different from graph covering and mapping problems, which are done online on an unknown graph. We formalize WRP as a heuristic search problem and solve it optimally with an A*-based algorithm. We develop a series of admissible heuristics with increasing difficulty and accuracy. Our heuristics abstract the underlying graph into a disjoint line-of-sight graph (GDLS) which is based on disjoint clusters of vertices such that vertices within the same cluster have LOS to the same specific vertex. We use solutions for the Minimum Spanning Tree (MST) and the Traveling Salesman Problem (TSP) of GDLS as admissible heuristics for WRP. We theoretically and empirically investigate these heuristics. Then, we show how the optimal methods can be modified (by intelligently pruning away large sub-trees) to obtain various suboptimal solvers with and without bound guarantees. These suboptimal solvers are much faster and expand fewer nodes than the optimal solver with only minor reduction in the quality of the solution.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"42 1","pages":"747-793"},"PeriodicalIF":5.0,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89619785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Transformer benefits from the high parallelization of attention networks during training, but it still suffers from slow decoding, partially due to the linear dependency O(m) of the decoder self-attention on previous target words at inference. In this paper, we propose a generalized average attention network (AAN+) aimed at speeding up decoding by reducing this dependency from O(m) to O(1). We find that the learned self-attention weights in the decoder follow patterns that can be approximated by a dynamic structure. Based on this insight, we develop AAN+, extending our previously proposed average attention (Zhang et al., 2018a, AAN) to support more general position- and content-based attention patterns. AAN+ only requires maintaining a small constant number of hidden states during decoding, ensuring its O(1) dependency. We apply AAN+ as a drop-in replacement for the decoder self-attention and conduct experiments on machine translation (with diverse language pairs), table-to-text generation, and document summarization. With masking tricks and dynamic programming, AAN+ enables the Transformer to decode sentences around 20% faster without largely compromising training speed or generation performance. Our results further reveal the importance of localness (neighboring words) in AAN+ and its capability in modeling long-range dependencies.
{"title":"AAN+: Generalized Average Attention Network for Accelerating Neural Transformer","authors":"Biao Zhang, Deyi Xiong, Yubin Ge, Junfeng Yao, Hao Yue, Jinsong Su","doi":"10.1613/jair.1.13896","DOIUrl":"https://doi.org/10.1613/jair.1.13896","url":null,"abstract":"Transformer benefits from the high parallelization of attention networks in fast training, but it still suffers from slow decoding partially due to the linear dependency O(m) of the decoder self-attention on previous target words at inference. In this paper, we propose a generalized average attention network (AAN+) aiming at speeding up decoding by reducing the dependency from O(m) to O(1). We find that the learned self-attention weights in the decoder follow some patterns which can be approximated via a dynamic structure. Based on this insight, we develop AAN+, extending our previously proposed average attention (Zhang et al., 2018a, AAN) to support more general position- and content-based attention patterns. AAN+ only requires to maintain a small constant number of hidden states during decoding, ensuring its O(1) dependency. We apply AAN+ as a drop-in replacement of the decoder selfattention and conduct experiments on machine translation (with diverse language pairs), table-to-text generation and document summarization. With masking tricks and dynamic programming, AAN+ enables Transformer to decode sentences around 20% faster without largely compromising in the training speed and the generation performance. Our results further reveal the importance of the localness (neighboring words) in AAN+ and its capability in modeling long-range dependency.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"26 1","pages":"677-708"},"PeriodicalIF":5.0,"publicationDate":"2022-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81541655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Fernández Llorca, V. Charisi, Ronan Hamon, Ignacio E. Sánchez, Emilia Gómez
New emerging technologies powered by Artificial Intelligence (AI) have the potential to disruptively transform our societies for the better. In particular, data-driven learning approaches (i.e., Machine Learning (ML)) have been a true revolution in the advancement of multiple technologies across application domains. But at the same time, there is growing concern about certain intrinsic characteristics of these methodologies that carry potential risks to both safety and fundamental rights. Although there are mechanisms in the adoption process to minimize these risks (e.g., safety regulations), they do not exclude the possibility of harm occurring, and if harm does occur, victims should be able to seek compensation. Liability regimes will therefore play a key role in ensuring basic protection for victims using or interacting with these systems. However, the same characteristics that make AI systems inherently risky, such as lack of causality, opacity, unpredictability, or their self- and continuous-learning capabilities, may lead to considerable difficulties when it comes to proving causation. This paper presents three case studies, as well as the methodology used to develop them, that illustrate these difficulties. Specifically, we address the cases of cleaning robots, delivery drones, and robots in education. The outcome of the proposed analysis suggests the need to revise liability regimes to alleviate the burden of proof on victims in cases involving AI technologies. This article appears in the AI & Society track.
{"title":"Liability regimes in the age of AI: a use-case driven analysis of the burden of proof","authors":"David Fern'andez Llorca, V. Charisi, Ronan Hamon, Ignacio E. S'anchez, Emilia G'omez","doi":"10.1613/jair.1.14565","DOIUrl":"https://doi.org/10.1613/jair.1.14565","url":null,"abstract":"New emerging technologies powered by Artificial Intelligence (AI) have the potential to disruptively transform our societies for the better. In particular, data-driven learning approaches (i.e., Machine Learning (ML)) have been a true revolution in the advancement of multiple technologies in various application domains. But at the same time there is growing concern about certain intrinsic characteristics of these methodologies that carry potential risks to both safety and fundamental rights. Although there are mechanisms in the adoption process to minimize these risks (e.g., safety regulations), these do not exclude the possibility of harm occurring, and if this happens, victims should be able to seek compensation. Liability regimes will therefore play a key role in ensuring basic protection for victims using or interacting with these systems. However, the same characteristics that make AI systems inherently risky, such as lack of causality, opacity, unpredictability or their self and continuous learning capabilities, may lead to considerable difficulties when it comes to proving causation. This paper presents three case studies, as well as the methodology to reach them, that illustrate these difficulties. Specifically, we address the cases of cleaning robots, delivery drones and robots in education. The outcome of the proposed analysis suggests the need to revise liability regimes to alleviate the burden of proof on victims in cases involving AI technologies.\u0000\u0000\u0000\u0000This article appears in the AI & Society track.\u0000\u0000\u0000","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"36 1","pages":"613-644"},"PeriodicalIF":5.0,"publicationDate":"2022-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91152517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most studies investigating models and algorithms for distributed constraint optimization problems (DCOPs) assume that messages arrive instantaneously and are never lost. Specifically, distributed local search DCOP algorithms have been designed as synchronous algorithms (i.e., they operate in synchronous iterations in which each agent exchanges messages with all of its neighbors), despite running in asynchronous environments. The same holds for the anytime mechanism that reports the best solution explored during the run of synchronous distributed local search algorithms. Thus, when the assumption of perfect communication is relaxed, the properties established for the state-of-the-art local search algorithms and the anytime mechanism may no longer apply. In this work, we address this limitation by: (1) proposing a Communication-Aware DCOP model (CA-DCOP) that can represent scenarios with different communication disturbances; (2) investigating the performance of existing local search DCOP algorithms, specifically the Distributed Stochastic Algorithm (DSA) and Maximum Gain Messages (MGM), in the presence of message latency and message loss; (3) proposing a latency-aware monotonic distributed local search DCOP algorithm; and (4) proposing an asynchronous anytime framework for reporting the best solution explored by non-monotonic asynchronous local search DCOP algorithms. Our empirical results demonstrate that imperfect communication has a positive effect on distributed local search algorithms due to increased exploration. Furthermore, the asynchronous anytime framework we propose makes it possible to benefit from algorithms with inherently explorative heuristics.
{"title":"Communication-Aware Local Search for Distributed Constraint Optimization","authors":"Ben Rachmut","doi":"10.1613/jair.1.13826","DOIUrl":"https://doi.org/10.1613/jair.1.13826","url":null,"abstract":"Most studies investigating models and algorithms for distributed constraint optimization problems (DCOPs) assume that messages arrive instantaneously and are never lost. Specifically, distributed local search DCOP algorithms, have been designed as synchronous algorithms (i.e., they perform in synchronous iterations in which each agent exchanges messages with all its neighbors), despite running in asynchronous environments. This is true also for an anytime mechanism that reports the best solution explored during the run of synchronous distributed local search algorithms. Thus, when the assumption of perfect communication is relaxed, the properties that were established for the state-of-the-art local search algorithms and the anytime mechanism may not necessarily apply.\u0000In this work, we address this limitation by: (1) Proposing a Communication-Aware DCOP model (CA-DCOP) that can represent scenarios with different communication disturbances; (2) Investigating the performance of existing local search DCOP algorithms, specifically Distributed Stochastic Algorithm (DSA) and Maximum Gain Messages (MGM), in the presence of message latency and message loss; (3) Proposing a latency-aware monotonic distributed local search DCOP algorithm; and (4) Proposing an asynchronous anytime framework for reporting the best solution explored by non-monotonic asynchronous local search DCOP algorithms. Our empirical results demonstrate that imperfect communication has a positive effect on distributed local search algorithms due to increased exploration. Furthermore, the asynchronous anytime framework we proposed allows one to benefit from algorithms with inherent explorative heuristics.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"1 1","pages":"637-675"},"PeriodicalIF":5.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83128999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bogdan Mazoure, T. Doan, Tianyu Li, V. Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau
We propose a general framework for policy representation in reinforcement learning tasks. The framework finds a low-dimensional embedding of the policy in a reproducing kernel Hilbert space (RKHS). The use of RKHS-based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are highly desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
{"title":"Low-Rank Representation of Reinforcement Learning Policies","authors":"Bogdan Mazoure, T. Doan, Tianyu Li, V. Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau","doi":"10.1613/jair.1.13854","DOIUrl":"https://doi.org/10.1613/jair.1.13854","url":null,"abstract":"We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"29 1","pages":"597-636"},"PeriodicalIF":5.0,"publicationDate":"2022-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72913510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a novel approach to epistemic planning called planning with perspectives (PWP) that is both more expressive and computationally more efficient than existing state-of-the-art epistemic planning tools. Epistemic planning (planning with knowledge and belief) is essential in many multi-agent and human-agent interaction domains. Most state-of-the-art epistemic planners solve epistemic planning problems either by compiling to propositional classical planning (for example, generating all possible knowledge atoms or compiling epistemic formulae to normal forms), or by explicitly encoding Kripke-based semantics. However, these methods become computationally infeasible as problem sizes grow. In this paper, we decompose epistemic planning by delegating reasoning about epistemic formulae to an external solver. We do this by modelling the problem in Functional STRIPS, which is more expressive than standard STRIPS and supports the use of external, black-box functions within action models. Building on recent work that demonstrates the relationship between what an agent 'sees' and what it knows, we define the perspective of each agent using an external function and build a solver for epistemic logic around it. Modellers can customise the perspective function of agents, allowing new epistemic logics to be defined without changing the planner. We ran evaluations on well-known epistemic planning benchmarks, comparing against an existing state-of-the-art planner, and on new scenarios that demonstrate the expressiveness of the PWP approach. The results show that our PWP planner scales significantly better than the state-of-the-art planner we compared against and can express problems more succinctly.
{"title":"Planning with Perspectives - Decomposing Epistemic Planning using Functional STRIPS","authors":"Guanghua Hu, Tim Miller, N. Lipovetzky","doi":"10.1613/jair.1.13446","DOIUrl":"https://doi.org/10.1613/jair.1.13446","url":null,"abstract":"In this paper, we present a novel approach to epistemic planning called planning with perspectives (PWP) that is both more expressive and computationally more efficient than existing state-of-the-art epistemic planning tools. Epistemic planning — planning with knowledge and belief — is essential in many multi-agent and human-agent interaction domains. Most state-of-the-art epistemic planners solve epistemic planning problems by either compiling to propositional classical planning (for example, generating all possible knowledge atoms or compiling epistemic formulae to normal forms); or explicitly encoding Kripke-based semantics. However, these methods become computationally infeasible as problem sizes grow. In this paper, we decompose epistemic planning by delegating reasoning about epistemic formulae to an external solver. We do this by modelling the problem using Functional STRIPS, which is more expressive than standard STRIPS and supports the use of external, black-box functions within action models. Building on recent work that demonstrates the relationship between what an agent ‘sees’ and what it knows, we define the perspective of each agent using an external function, and build a solver for epistemic logic around this. Modellers can customise the perspective function of agents, allowing new epistemic logics to be defined without changing the planner. We ran evaluations on well-known epistemic planning benchmarks to compare an existing state-of-the-art planner, and on new scenarios that demonstrate the expressiveness of the PWP approach. The results show that our PWP planner scales significantly better than the state-of-the-art planner that we compared against, and can express problems more succinctly.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"1 1","pages":"489-539"},"PeriodicalIF":5.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90419565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. H. Lim, Tyler J. Becker, Mykel J. Kochenderfer, C. Tomlin, Zachary Sunberg
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of O(C), where C is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.
{"title":"Optimality Guarantees for Particle Belief Approximation of POMDPs","authors":"M. H. Lim, Tyler J. Becker, Mykel J. Kochenderfer, C. Tomlin, Zachary Sunberg","doi":"10.1613/jair.1.14525","DOIUrl":"https://doi.org/10.1613/jair.1.14525","url":null,"abstract":"Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of O(C), where C is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"77 1","pages":"1591-1636"},"PeriodicalIF":5.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67392941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Roberto Asín Achá, Rodrigo López, Sebastián Hagedorn, Jorge A. Baier
Multi-agent pathfinding (MAPF) is an NP-hard problem. As such, dense maps may be very hard to solve optimally. In such scenarios, compilation-based approaches, via Boolean satisfiability (SAT) and answer set programming (ASP), have been shown to outperform heuristic-search-based approaches such as conflict-based search (CBS). In this paper, we propose a new Boolean encoding for MAPF and show how to implement it in ASP and MaxSAT. A feature that distinguishes our encoding from existing ones is that swap and follow conflicts are encoded using binary clauses, which can be exploited by current conflict-driven clause learning (CDCL) solvers. In addition, the number of clauses used to encode swap and follow conflicts does not depend on the number of agents, allowing us to scale better. For MaxSAT, we study different ways of combining the MSU3 and LSU algorithms for maximum performance. In our experimental evaluation, we used square grids, ranging from 20 x 20 to 50 x 50 cells, and warehouse maps, with varying numbers of agents and obstacles. We compared against representative state-of-the-art solvers, including the search-based algorithm CBS, the ASP-based solver ASP-MAPF, and the branch-and-cut-and-price hybrid solver BCP. We observe that the ASP implementation of our encoding, ASP-MAPF2, outperforms the other solvers in most of our experiments. The MaxSAT implementation of our encoding, MtMS, shows the best performance on relatively small warehouse maps when the number of agents is large, which are the instances that most closely resemble hard puzzle-like problems.
{"title":"Multi-Agent Path Finding: A New Boolean Encoding","authors":"Roberto Asín Achá, Rodrigo López, Sebastián Hagedorn, Jorge A. Baier","doi":"10.1613/jair.1.13818","DOIUrl":"https://doi.org/10.1613/jair.1.13818","url":null,"abstract":"Multi-agent pathfinding (MAPF) is an NP-hard problem. As such, dense maps may be very hard to solve optimally. In such scenarios, compilation-based approaches, via Boolean satisfiability (SAT) and answer set programming (ASP), have been shown to outperform heuristic-search-based approaches, such as conflict-based search (CBS). In this paper, we propose a new Boolean encoding for MAPF, and show how to implement it in ASP and MaxSAT. A feature that distinguishes our encoding from existing ones is that swap and follow conflicts are encoded using binary clauses, which can be exploited by current conflict-driven clause learning (CDCL) solvers. In addition, the number of clauses used to encode swap and follow conflicts do not depend on the number of agents, allowing us to scale better. For MaxSAT, we study different ways in which we may combine the MSU3 and LSU algorithms for maximum performance. In our experimental evaluation, we used square grids, ranging from 20 x 20 to 50 x 50 cells, and warehouse maps, with a varying number of agents and obstacles. We compared against representative solvers of the state-of-the-art, including the search-based algorithm CBS, the ASP-based solver ASP-MAPF, and the branch-and-cut-and-price hybrid solver, BCP. We observe that the ASP implementation of our encoding, ASP-MAPF2 outperforms other solvers in most of our experiments. The MaxSAT implementation of our encoding, MtMS shows best performance in relatively small warehouse maps when the number of agents is large, which are the instances with closer resemblance to hard puzzle-like problems.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":"98 1","pages":"323-350"},"PeriodicalIF":5.0,"publicationDate":"2022-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85899961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}