Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent
Fatemeh Haji, Mazal Bethany, Maryam Tabar, Jason Chiang, Anthony Rios, Peyman Najafirad
arXiv:2409.11527 [cs.AI], published 2024-09-17
Citations: 0
Abstract
Multi-agent strategies have emerged as a promising approach to enhance the reasoning abilities of Large Language Models (LLMs) by assigning specialized roles in the problem-solving process. Concurrently, Tree of Thoughts (ToT) methods have shown potential in improving reasoning for complex question-answering tasks by exploring diverse reasoning paths. A critical limitation in multi-agent reasoning is the 'Reasoner' agent's shallow exploration of reasoning paths. While ToT strategies could help mitigate this problem, they may generate flawed reasoning branches, which could harm the trustworthiness of the final answer. To leverage the strengths of both multi-agent reasoning and ToT strategies, we introduce a novel approach combining ToT-based Reasoner agents with a Thought Validator agent. Multiple Reasoner agents operate in parallel, employing ToT to explore diverse reasoning paths. The Thought Validator then scrutinizes these paths, considering a Reasoner's conclusion only if its reasoning is valid. This method enables a more robust voting strategy by discarding faulty reasoning paths, enhancing the system's ability to tackle tasks requiring systematic and trustworthy reasoning. Our method demonstrates superior performance compared to existing techniques when evaluated on the GSM8K dataset, outperforming the standard ToT strategy by an average of 5.6% across four LLMs.
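The control flow the abstract describes — several ToT Reasoners explore paths, a Thought Validator filters out invalid ones, and a vote is taken over only the validated answers — can be sketched in a few lines. The Python below is a minimal illustration, not the authors' implementation; `ReasonerOutput`, `run_tot_reasoner`, and `validate_reasoning` are hypothetical stand-ins for a real ToT search and a Validator LLM call.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReasonerOutput:
    reasoning_path: str  # the chain of thoughts selected by the ToT search
    answer: str          # the final answer extracted from that path

def run_tot_reasoner(question: str, seed: int) -> ReasonerOutput:
    # Placeholder: a real Reasoner would expand a tree of candidate
    # "thoughts", score them, and search (e.g. breadth-first) for the
    # most promising path before committing to an answer.
    return ReasonerOutput(reasoning_path=f"stub path {seed}", answer="42")

def validate_reasoning(question: str, reasoning_path: str) -> bool:
    # Placeholder: a real Thought Validator would prompt an LLM to judge
    # whether the path is logically sound and grounded in the question.
    return True

def multi_agent_tot(question: str, n_reasoners: int = 3) -> Optional[str]:
    # 1. Several ToT Reasoner agents explore diverse reasoning paths
    #    (sequential here for clarity; the paper runs them in parallel).
    outputs = [run_tot_reasoner(question, seed=i) for i in range(n_reasoners)]

    # 2. The Thought Validator discards any conclusion whose reasoning
    #    path it judges invalid.
    valid = [o for o in outputs
             if validate_reasoning(question, o.reasoning_path)]

    # 3. Majority vote over the surviving answers only, so faulty
    #    branches cannot sway the result.
    if not valid:
        return None  # no trustworthy path survived; caller may retry
    return Counter(o.answer for o in valid).most_common(1)[0][0]

if __name__ == "__main__":
    print(multi_agent_tot("Sample GSM8K-style word problem..."))
```

The sketch makes the key design choice concrete: the vote is taken only over conclusions whose reasoning survived validation, which is what the abstract credits for the more robust voting strategy relative to plain ToT.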