Abductive explanations (AXp's) are widely used for understanding the decisions of classifiers. Existing definitions are suitable when features are independent. However, we show that ignoring constraints between features, when such constraints exist, may lead to an explosion in the number of redundant or superfluous AXp's. We propose three new types of explanations that take constraints into account and that can be generated from the whole feature space or from a sample (such as a dataset). They are based on a key notion: the coverage of an explanation, i.e. the set of instances it explains. We show that coverage is powerful enough to discard redundant and superfluous AXp's. For each type, we analyse the complexity of finding an explanation and investigate its formal properties. The final result is a catalogue of different forms of AXp's with different complexities and different formal guarantees.
"Abductive explanations of classifiers under constraints: Complexity and properties", Martin Cooper, Leila Amgoud. arXiv:2409.12154 (2024-09-18).
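The core definition can be illustrated with a brute-force sketch (a hypothetical toy example, not the paper's algorithms): over boolean features, a subset S is an AXp for an instance x if fixing the features in S to x's values forces the classifier's prediction on every completion of the remaining features. The constrained variants proposed in the paper would additionally restrict completions to those satisfying the feature constraints.

```python
from itertools import combinations, product

def classifier(x):
    # toy classifier, invented for illustration: predicts 1 iff x0 and (x1 or x2)
    return int(x[0] and (x[1] or x[2]))

def is_axp(instance, subset):
    """True if fixing `subset` to the instance's values forces the prediction
    on every completion of the free features."""
    target = classifier(instance)
    free = [i for i in range(len(instance)) if i not in subset]
    for values in product([0, 1], repeat=len(free)):
        x = list(instance)
        for i, v in zip(free, values):
            x[i] = v
        if classifier(x) != target:
            return False
    return True

def find_axps(instance):
    """Enumerate all subset-minimal AXp's by increasing size."""
    axps = []
    for size in range(len(instance) + 1):
        for subset in combinations(range(len(instance)), size):
            if not any(set(a) <= set(subset) for a in axps) \
                    and is_axp(instance, set(subset)):
                axps.append(subset)
    return axps

print(find_axps((1, 1, 0)))  # → [(0, 1)]
```

For the instance (1, 1, 0) the only minimal AXp is {x0, x1}: with those two features fixed, the prediction is 1 whatever value x2 takes.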
Abeer Alshehri, Amal Abdulrahman, Hajar Alamri, Tim Miller, Mor Vered
Goal recognition (GR) involves inferring an agent's unobserved goal from a sequence of observations. This is a critical problem in AI with diverse applications. Traditionally, GR has been addressed using 'inference to the best explanation', or abduction, where hypotheses about the agent's goals are generated as the most plausible explanations for observed behavior. Alternatively, some approaches enhance interpretability by ensuring that an agent's behavior aligns with an observer's expectations or by making the reasoning behind decisions more transparent. In this work, we tackle a different challenge: explaining the GR process itself in a way that is comprehensible to humans. We introduce and evaluate an explainable model for GR agents, grounded in the theoretical framework and cognitive processes underlying human behavior explanation. Drawing on insights from two human-agent studies, we propose a conceptual framework for human-centered explanations of GR. Using this framework, we develop the eXplainable Goal Recognition (XGR) model, which generates explanations for both 'why' and 'why not' questions. We evaluate the model computationally across eight GR benchmarks and through three user studies. The first study assesses the efficiency of generating human-like explanations within the Sokoban game domain, the second examines perceived explainability in the same domain, and the third evaluates the model's effectiveness in aiding decision-making in illegal fishing detection. Results demonstrate that the XGR model significantly enhances user understanding, trust, and decision-making compared to baseline models, underscoring its potential to improve human-agent collaboration.
"Towards Explainable Goal Recognition Using Weight of Evidence (WoE): A Human-Centered Approach". arXiv:2409.11675 (2024-09-18).
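The classic 'inference to the best explanation' formulation that the abstract contrasts itself with can be sketched in its simplest Bayesian form: P(goal | obs) ∝ P(goal) · Π P(action | goal). This is not the XGR model; the goals, actions, and probabilities below are invented for illustration.

```python
def recognize(observations, likelihoods, priors):
    """Posterior over goals given a sequence of observed actions."""
    posterior = {}
    for goal, prior in priors.items():
        p = prior
        for action in observations:
            p *= likelihoods[goal].get(action, 1e-6)  # smooth unseen actions
        posterior[goal] = p
    total = sum(posterior.values())
    return {goal: p / total for goal, p in posterior.items()}

likelihoods = {
    "fetch_key": {"go_north": 0.7, "go_east": 0.2},
    "exit_maze": {"go_north": 0.1, "go_east": 0.8},
}
priors = {"fetch_key": 0.5, "exit_maze": 0.5}
post = recognize(["go_north", "go_north"], likelihoods, priors)
print(max(post, key=post.get))  # → fetch_key
```

An explainable GR model in the paper's sense would then have to justify this inference to a human, e.g. by contrasting the evidence for 'fetch_key' against 'exit_maze' (a 'why not' question).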
A pandemic is the spread of a disease across large regions, and can impose devastating health, economic, and social costs on society. As such, the study of effective pandemic mitigation strategies can yield significant positive impact. A pandemic can be described mathematically using a compartmental model, such as the Susceptible Infected Removed (SIR) model. In this paper, we extend the solution equations of the SIR model to a state transition model with lockdowns. We formalize a metric hybrid planning problem based on this state transition model, and solve it using a metric hybrid planner. We improve the runtime effectiveness of the metric hybrid planner with the addition of valid inequalities, and demonstrate the success of our approach both theoretically and experimentally under various challenging settings.
"A Metric Hybrid Planning Approach to Solving Pandemic Planning Problems with Simple SIR Models", Ari Gestetner, Buser Say. arXiv:2409.11631 (2024-09-18).
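The idea of an SIR state transition model with lockdowns can be sketched as a discrete-time simulation in which a lockdown window scales the transmission rate. This is a simplified stand-in for the paper's solution-equation formulation, with illustrative parameter values:

```python
def simulate_sir(s, i, r, beta, gamma, steps, lockdown=None):
    """Discrete-time SIR. lockdown = (start, end, factor): the transmission
    rate beta is multiplied by `factor` for time steps in [start, end)."""
    history = [(s, i, r)]
    for t in range(steps):
        b = beta
        if lockdown and lockdown[0] <= t < lockdown[1]:
            b *= lockdown[2]
        new_infections = b * s * i
        new_recoveries = gamma * i
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history

free = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, steps=100)
locked = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, steps=100,
                      lockdown=(10, 40, 0.2))
peak_free = max(i for _, i, _ in free)
peak_locked = max(i for _, i, _ in locked)
```

A planner over this transition model would choose the lockdown window (and its strength) to optimize a metric objective, e.g. minimizing peak infections subject to a bound on total lockdown duration.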
In our previous research, we presented a reasoning system (called LeSAC), based on argumentation theory, that provides legal support to designers during the design process. Building on this, this paper explores how to give designers effective explanations for their legally relevant design decisions. We extend the previous system to provide explanations by specifying norms and the key legal or ethical principles for justifying actions in normative contexts. Given the strong expressive power of first-order logic, in the current paper we adopt a first-order deontic logic system with deontic operators and preferences. We illustrate the advantages and necessity of introducing deontic logic and designing explanations under LeSAC by modelling two cases in the context of autonomous driving. In particular, this paper also discusses the requirements the updated LeSAC must meet to guarantee rationality, and proves that a well-defined LeSAC satisfies the rationality postulate for rule-based argumentation frameworks. This ensures the system's ability to provide coherent, legally valid explanations for complex design decisions.
"Explaining Non-monotonic Normative Reasoning using Argumentation Theory with Deontic Logic", Zhe Yu, Yiwei Lu. arXiv:2409.11780 (2024-09-18).
Alberto Termine, Emanuele Ratti, Alessandro Facchini
In recent years, the dissemination of machine learning (ML) methodologies in scientific research has prompted discussions on theory ladenness. More specifically, the issue of theory ladenness has re-emerged in the form of questions about whether and how ML models (MLMs) and ML modelling strategies are affected by the domain theory of the scientific field in which ML is used and implemented (e.g., physics, chemistry, biology). On the one hand, some have argued that there is no difference between traditional (pre-ML) and ML-assisted science: in both cases, theory plays an essential and unavoidable role in the analysis of phenomena and in the construction and use of models. Others have argued instead that ML methodologies and models are theory independent and, in some cases, even theory free. In this article, we argue that both positions are overly simplistic and do not advance our understanding of the interplay between ML methods and domain theories. Specifically, we provide an analysis of theory ladenness in ML-assisted science. Our analysis reveals that, while the construction of MLMs can be relatively independent of domain theory, the practical implementation and interpretation of these models within a given domain still relies on fundamental theoretical assumptions and background knowledge.
"Machine Learning and Theory Ladenness -- A Phenomenological Account". arXiv:2409.11277 (2024-09-17).
Fatemeh Haji, Mazal Bethany, Maryam Tabar, Jason Chiang, Anthony Rios, Peyman Najafirad
Multi-agent strategies have emerged as a promising approach to enhance the reasoning abilities of Large Language Models (LLMs) by assigning specialized roles in the problem-solving process. Concurrently, Tree of Thoughts (ToT) methods have shown potential in improving reasoning for complex question-answering tasks by exploring diverse reasoning paths. A critical limitation in multi-agent reasoning is the 'Reasoner' agent's shallow exploration of reasoning paths. While ToT strategies could help mitigate this problem, they may generate flawed reasoning branches, which could harm the trustworthiness of the final answer. To leverage the strengths of both multi-agent reasoning and ToT strategies, we introduce a novel approach combining ToT-based Reasoner agents with a Thought Validator agent. Multiple Reasoner agents operate in parallel, employing ToT to explore diverse reasoning paths. The Thought Validator then scrutinizes these paths, considering a Reasoner's conclusion only if its reasoning is valid. This method enables a more robust voting strategy by discarding faulty reasoning paths, enhancing the system's ability to tackle tasks requiring systematic and trustworthy reasoning. Our method demonstrates superior performance compared to existing techniques when evaluated on the GSM8K dataset, outperforming the standard ToT strategy by an average of 5.6% across four LLMs.
"Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent". arXiv:2409.11527 (2024-09-17).
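The validated-voting idea can be sketched with stand-in functions: each Reasoner agent contributes a (reasoning, answer) pair, a Thought Validator predicate rejects faulty reasoning, and only validated answers are voted on. A real system would implement both roles with LLM calls; everything below is invented for illustration.

```python
from collections import Counter

def validated_vote(candidates, is_valid):
    """Majority vote over answers whose reasoning passes the validator."""
    valid_answers = [answer for reasoning, answer in candidates
                     if is_valid(reasoning)]
    if not valid_answers:
        return None  # abstain rather than trust unvalidated reasoning
    return Counter(valid_answers).most_common(1)[0][0]

candidates = [
    ("18 - 5 = 13, 13 * 2 = 26", 26),
    ("18 - 5 = 12, 12 * 2 = 24", 24),  # flawed branch: should be rejected
    ("half of 52 is 26", 26),
]
is_valid = lambda r: "18 - 5 = 12" not in r  # toy validator for this example
print(validated_vote(candidates, is_valid))  # → 26
```

The design point is that validation happens before aggregation, so a single flawed branch cannot swing the vote the way it could under plain majority voting over all branches.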
Process-mining techniques have emerged as powerful tools for analyzing event data to gain insights into business processes. In this paper, we present a comprehensive analysis of road traffic fine management processes using the pm4py library in Python. We start by importing an event log dataset and exploring its characteristics, including the distribution of activities and process variants. Through filtering and statistical analysis, we uncover key patterns and variations in the process executions. Subsequently, we apply various process-mining algorithms, including the Alpha Miner, Inductive Miner, and Heuristic Miner, to discover process models from the event log data. We visualize the discovered models to understand the workflow structures and dependencies within the process. Additionally, we discuss the strengths and limitations of each mining approach in capturing the underlying process dynamics. Our findings shed light on the efficiency and effectiveness of road traffic fine management processes, providing valuable insights for process optimization and decision-making. This study demonstrates the utility of pm4py in facilitating process-mining tasks and its potential for analyzing real-world business processes.
"Navigating Process Mining: A Case study using pm4py", Ali Jlidi, László Kovács. arXiv:2409.11294 (2024-09-17).
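The miners named above (Alpha, Inductive, Heuristic) all start from relations over the event log such as "directly follows". This pure-Python sketch shows that underlying idea on a made-up fine-management log with invented activity names; the paper itself uses pm4py, whose discovery functions operate on real event logs.

```python
from collections import defaultdict

def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    df = defaultdict(int)
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return dict(df)

# each inner list is one trace (one fine's history), in timestamp order
log = [
    ["Create Fine", "Send Fine", "Insert Fine Notification", "Payment"],
    ["Create Fine", "Payment"],
    ["Create Fine", "Send Fine", "Payment"],
]
df = directly_follows(log)
print(df[("Create Fine", "Send Fine")])  # → 2
```

From counts like these, the Alpha Miner derives its footprint relations (causality, parallelism, choice) and the Heuristic Miner its dependency measures, which is why the choice of miner affects how noise in the log is handled.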
The Vehicle Routing Problem is about optimizing the routes of vehicles to meet the needs of customers at specific locations. The route graph consists of depots on several levels and customer positions. Several optimization methods have been developed over the years, most of them based on some type of classic heuristic: genetic algorithms, simulated annealing, tabu search, ant colony optimization, or the firefly algorithm. Recent developments in machine learning provide a new toolset, the rich family of neural networks, for tackling complex problems. The main areas of application of neural networks are classification and regression; route optimization can be viewed as a new challenge for them. The article first presents an analysis of the applicability of neural network tools, and then presents a novel graphical neural network model in detail. An efficiency analysis based on test experiments shows the applicability of the proposed NN architecture.
"Neural Networks for Vehicle Routing Problem", László Kovács, Ali Jlidi. arXiv:2409.11290 (2024-09-17).
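For contrast with the neural approach, here is one of the simplest classic construction heuristics: nearest-neighbour routing from a single depot. This is a toy single-vehicle case sketched for illustration, not the article's graphical neural network model.

```python
from math import dist

def nearest_neighbour_route(depot, customers):
    """Greedily visit the closest unvisited customer, then return to depot."""
    route, remaining, current = [depot], list(customers), depot
    while remaining:
        nxt = min(remaining, key=lambda c: dist(current, c))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)  # vehicle returns to the depot
    return route

def route_length(route):
    return sum(dist(a, b) for a, b in zip(route, route[1:]))

route = nearest_neighbour_route((0, 0), [(2, 0), (1, 0), (3, 0)])
print(route_length(route))  # → 6.0
```

Heuristics like this give fast but potentially suboptimal tours, which is exactly the quality/efficiency trade-off a learned routing model is evaluated against.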
Empathetic response generation necessitates the integration of emotional and intentional dynamics to foster meaningful interactions. Existing research either neglects the intricate interplay between emotion and intent, leading to suboptimal controllability of empathy, or resorts to large language models (LLMs), which incur significant computational overhead. In this paper, we introduce ReflectDiffu, a lightweight and comprehensive framework for empathetic response generation. This framework incorporates emotion contagion to augment emotional expressiveness and employs an emotion-reasoning mask to pinpoint critical emotional elements. Additionally, it integrates intent mimicry within reinforcement learning for refinement during diffusion. By harnessing an intent twice-reflect mechanism of Exploring-Sampling-Correcting, ReflectDiffu adeptly translates emotional decision-making into precise intent actions, thereby addressing empathetic response misalignments stemming from emotional misrecognition. Through reflection, the framework maps emotional states to intents, markedly enhancing both response empathy and flexibility. Comprehensive experiments reveal that ReflectDiffu outperforms existing models regarding relevance, controllability, and informativeness, achieving state-of-the-art results in both automatic and human evaluations.
"ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework", Jiahao Yuan, Zixiang Di, Zhiqing Cui, Guisong Yang, Usman Naseem. arXiv:2409.10289 (2024-09-16).
Contemporary hardware design benefits from the abstraction provided by high-level logic gates, streamlining the implementation of logic circuits. Logic Synthesis Optimization (LSO) operates at one level of abstraction within the Electronic Design Automation (EDA) workflow, targeting improvements in logic circuits with respect to performance metrics such as size and speed in the final layout. Recent trends in the field show a growing interest in leveraging Machine Learning (ML) for EDA, notably through ML-guided logic synthesis utilizing policy-based Reinforcement Learning (RL) methods. Despite these advancements, existing models face challenges such as overfitting and limited generalization, attributed to the scarcity of public circuits and the expressiveness limitations of graph encoders. To address these hurdles and tackle data-scarcity issues, we introduce LSOformer, a novel approach harnessing autoregressive transformer models and predictive self-supervised learning (SSL) to predict the trajectory of Quality of Results (QoR). LSOformer integrates cross-attention modules to merge insights from circuit graphs and optimization sequences, thereby enhancing prediction accuracy for QoR metrics. Experimental studies validate the effectiveness of LSOformer, showcasing its superior performance over baseline architectures in QoR prediction tasks, where it achieves improvements of 5.74%, 4.35%, and 17.06% on the EPFL, OABCD, and proprietary circuit datasets, respectively, in the inductive setup.
"Logic Synthesis Optimization with Predictive Self-Supervision via Causal Transformers", Raika Karimi, Faezeh Faez, Yingxue Zhang, Xing Li, Lei Chen, Mingxuan Yuan, Mahdi Biparva. arXiv:2409.10653 (2024-09-16).