A confidence-based knowledge integration framework for cross-domain table question answering

Knowledge-Based Systems (IF 7.2; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-11-15 DOI: 10.1016/j.knosys.2024.112718

Yuankai Fan, Tonghui Ren, Can Huang, Beini Zheng, Yinan Jing, Zhenying He, Jinbao Li, Jianxin Li

Knowledge-Based Systems, Volume 306, Article 112718. https://www.sciencedirect.com/science/article/pii/S0950705124013522

Citations: 0

Abstract

Recent advances in TableQA leverage sequence-to-sequence (Seq2seq) deep learning models to respond accurately to natural language (NL) questions. These models do so by converting each question into a SQL query, using information drawn from one or more tables. However, Seq2seq models often produce uncertain (low-confidence) predictions when they distribute probability mass across multiple outputs during a decoding step, which frequently yields translation errors. To tackle this problem, we present Ckif, a confidence-based knowledge integration framework that uses a two-stage deep-learning-based ranking technique to mitigate the low-confidence problem commonly associated with Seq2seq models for TableQA. The core idea of Ckif is to introduce a flexible framework that integrates seamlessly with any existing Seq2seq translation model to enhance its performance. Specifically, by inspecting the probability values in each decoding step, Ckif first masks out each low-confidence prediction from the predicted outcome of an underlying Seq2seq model. Subsequently, Ckif integrates prior knowledge of the query language to generalize the masked-out queries, enabling the generation of all possible queries and their corresponding NL expressions. Finally, a two-stage deep-learning ranking approach is developed to evaluate the semantic similarity of the NL expressions to a given NL question, thereby determining the best-matching result. Extensive experiments are conducted to investigate Ckif by applying it to five state-of-the-art Seq2seq models on a widely used public benchmark. The experimental results indicate that Ckif consistently enhances the performance of all the Seq2seq models, demonstrating its effectiveness in supporting TableQA.
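The mask, generalize, and rank pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the confidence threshold of 0.6, the per-token probabilities, the candidate column names, the template-based `describe` function, and the word-overlap score are all hypothetical stand-ins chosen for illustration. In particular, the single lexical score below replaces the paper's two-stage deep-learning ranker.

```python
from itertools import product

MASK = "[MASK]"

def mask_low_confidence(tokens, probs, threshold=0.6):
    """Replace each token whose decoding probability is below the threshold."""
    return [tok if p >= threshold else MASK for tok, p in zip(tokens, probs)]

def expand_masks(masked, candidates):
    """Enumerate every query obtained by filling each MASK with one candidate."""
    slots = [i for i, tok in enumerate(masked) if tok == MASK]
    queries = []
    for combo in product(candidates, repeat=len(slots)):
        filled = list(masked)
        for i, tok in zip(slots, combo):
            filled[i] = tok
        queries.append(" ".join(filled))
    return queries

def describe(query):
    """Hypothetical template that renders one fixed query shape as NL text."""
    parts = query.split()
    return f"{parts[1]} of {parts[3]} whose {parts[5]} is greater than {parts[7]}"

def word_overlap(a, b):
    """Jaccard overlap of word sets; a stand-in for a learned similarity model."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rank(question, queries):
    """Sort candidate queries by how closely their NL rendering matches the question."""
    return sorted(queries, key=lambda q: word_overlap(question, describe(q)), reverse=True)

# Assumed decoder output: the WHERE column was predicted with low confidence.
tokens = ["SELECT", "name", "FROM", "singer", "WHERE", "age", ">", "30"]
probs = [0.99, 0.95, 0.99, 0.97, 0.98, 0.45, 0.90, 0.93]
columns = ["name", "age", "country"]  # assumed schema of the hypothetical table
question = "list the name of each singer whose age is greater than 30"

masked = mask_low_confidence(tokens, probs)
queries = expand_masks(masked, columns)
ranked = rank(question, queries)
print(ranked[0])
```

In this sketch the candidate set grows exponentially with the number of masked positions, which is one reason a framework of this kind would need a cheap first ranking stage before a more expensive second one.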

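Source journal

Knowledge-Based Systems (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months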

About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.