A confidence-based knowledge integration framework for cross-domain table question answering

Yuankai Fan, Tonghui Ren, Can Huang, Beini Zheng, Yinan Jing, Zhenying He, Jinbao Li, Jianxin Li
Knowledge-Based Systems, Volume 306, Article 112718 (IF 7.2; Region 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence)
Pub Date: 2024-11-15
DOI: 10.1016/j.knosys.2024.112718
URL: https://www.sciencedirect.com/science/article/pii/S0950705124013522
Citation count: 0

Abstract

Recent advancements in TableQA leverage sequence-to-sequence (Seq2seq) deep learning models to accurately respond to natural language queries. These models achieve this by converting the queries into SQL queries, using information drawn from one or more tables. However, Seq2seq models often produce uncertain (low-confidence) predictions when distributing probability mass across multiple outputs during a decoding step, frequently yielding translation errors. To tackle this problem, we present Ckif, a confidence-based knowledge integration framework that uses a two-stage deep-learning-based ranking technique to mitigate the low-confidence problem commonly associated with Seq2seq models for TableQA. The core idea of Ckif is to introduce a flexible framework that seamlessly integrates with any existing Seq2seq translation models to enhance their performance. Specifically, by inspecting the probability values in each decoding step, Ckif first masks out each low-confidence prediction from the predicted outcome of an underlying Seq2seq model. Subsequently, Ckif integrates prior knowledge of query language to generalize masked-out queries, enabling the generation of all possible queries and their corresponding NL expressions. Finally, a two-stage deep-learning ranking approach is developed to evaluate the semantic similarity of NL expressions to a given NL question, hence determining the best-matching result. Extensive experiments are conducted to investigate Ckif by applying it to five state-of-the-art Seq2seq models using a widely used public benchmark. The experimental results indicate that Ckif consistently enhances the performance of all the Seq2seq models, demonstrating its effectiveness for better supporting TableQA.
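
The first two steps described in the abstract (masking low-confidence predictions, then generalizing the masked query into candidate queries) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `[MASK]` symbol, the 0.6 threshold, the function names, the example query, its per-token probabilities, and the candidate column list are all assumptions made up for illustration. The final two-stage ranking step is not shown.

```python
from itertools import product
from typing import List, Sequence

MASK = "[MASK]"  # hypothetical placeholder symbol, not taken from the paper


def mask_low_confidence(tokens: Sequence[str],
                        probs: Sequence[float],
                        threshold: float = 0.6) -> List[str]:
    """Replace every token whose decoding probability is below the threshold.

    The threshold value is an assumption; the abstract does not state one.
    """
    if len(tokens) != len(probs):
        raise ValueError("tokens and probs must have the same length")
    return [tok if p >= threshold else MASK for tok, p in zip(tokens, probs)]


def expand_masks(masked: Sequence[str], candidates: Sequence[str]) -> List[str]:
    """Fill each masked slot with every candidate, yielding all combinations."""
    slots = [i for i, tok in enumerate(masked) if tok == MASK]
    queries = []
    for combo in product(candidates, repeat=len(slots)):
        filled = list(masked)
        for i, value in zip(slots, combo):
            filled[i] = value
        queries.append(" ".join(filled))
    return queries


# Made-up Seq2seq output with one uncertain column prediction ("age").
tokens = ["SELECT", "name", "FROM", "singer", "WHERE", "age", ">", "30"]
probs = [0.99, 0.95, 0.99, 0.97, 0.98, 0.41, 0.93, 0.90]

masked = mask_low_confidence(tokens, probs)
candidate_queries = expand_masks(masked, ["age", "net_worth"])

print(masked)
for query in candidate_queries:
    print(query)
```

In the framework as described, the candidate set would come from prior knowledge of the query language rather than a hand-written list, and each candidate query would be paired with an NL expression and ranked against the original question.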
Source journal
Knowledge-Based Systems (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.