Beyond Boundaries: A Human-like Approach for Question Answering over Structured and Unstructured Information Sources

Transactions of the Association for Computational Linguistics. Pub Date: 2024-06-01. DOI: 10.1162/tacl_a_00671

Jens Lehmann, Dhananjay Bhandiwad, Preetam Gattogi, S. Vahdati

Citations: 1

Abstract

Answering factual questions from heterogeneous sources, such as graphs and text, is a key capacity of intelligent systems. Current approaches either (i) perform question answering over text and structured sources as separate pipelines followed by a merge step or (ii) provide an early integration, giving up the strengths of particular information sources. To solve this problem, we present “HumanIQ”, a method that teaches language models to dynamically combine retrieved information by imitating how humans use retrieval tools. Our approach couples a generic method for gathering human demonstrations of tool use with adaptive few-shot learning for tool-augmented models. We show that HumanIQ confers significant benefits, including (i) reducing the error rate of our strongest baseline (GPT-4) by over 50% across three benchmarks, (ii) improving human preference over responses from vanilla GPT-4 (45.3% wins, 46.7% ties, 8.0% losses), and (iii) outperforming numerous task-specific baselines.
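As a rough illustration of what pairing human tool-use demonstrations with adaptive few-shot prompting could look like, the sketch below picks the stored demonstrations most similar to a new question and assembles them into a prompt. It is not the authors' implementation: the abstract does not say how demonstrations are selected, so the token-overlap (Jaccard) similarity, the function names, the tool names `graph_lookup` and `text_search`, and the sample demonstrations are all assumptions made for this example.

```python
def tokenize(text):
    """Lowercase whitespace tokenization; a stand-in for a real similarity model."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_demonstrations(question, demos, k=2):
    """Return the k demonstrations whose questions overlap most with `question`."""
    q_tokens = tokenize(question)
    ranked = sorted(
        demos,
        key=lambda d: jaccard(q_tokens, tokenize(d["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, demos, k=2):
    """Assemble a few-shot prompt from the selected demonstrations."""
    parts = [
        f"Question: {d['question']}\nTool steps: {d['steps']}\nAnswer: {d['answer']}"
        for d in select_demonstrations(question, demos, k)
    ]
    # The model is expected to continue from the final, unanswered question.
    parts.append(f"Question: {question}\nTool steps:")
    return "\n\n".join(parts)


# Hypothetical demonstrations; the tool names are illustrative only.
DEMOS = [
    {"question": "Who directed the film Inception?",
     "steps": "graph_lookup(Inception, director)",
     "answer": "Christopher Nolan"},
    {"question": "What is the capital of Australia?",
     "steps": "text_search(capital of Australia)",
     "answer": "Canberra"},
    {"question": "Who directed the film Alien?",
     "steps": "graph_lookup(Alien, director)",
     "answer": "Ridley Scott"},
]

if __name__ == "__main__":
    print(build_prompt("Who directed the film Jaws?", DEMOS, k=2))
```

Here the prompt adapts to each question: a question about a film director pulls in the two director demonstrations and leaves out the unrelated one. A real system would likely replace the token-overlap score with a learned embedding similarity.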



