Multi-head attention based candidate segment selection in QA over hybrid data

IF 0.8 4区 计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Intelligent Data Analysis Pub Date : 2023-11-20 DOI:10.3233/ida-227032
Qian Chen, Xiaoying Gao, Xin Guo, Suge Wang
{"title":"Multi-head attention based candidate segment selection in QA over hybrid data","authors":"Qian Chen, Xiaoying Gao, Xin Guo, Suge Wang","doi":"10.3233/ida-227032","DOIUrl":null,"url":null,"abstract":"Question Answering based on Tabular and Textual data is a novel task proposed in recent years in the field of QA. At present, most QA systems return answers from a single data form, such as knowledge graphs, tables, texts. However, hybrid data including structured and unstructured data is quite pervasive in real life instead of a single form. Recent research on TAT-QA mainly suffers from the higher error of extracting supporting evidences from both tabular and textual content. This paper aimed to address the problem of failure evidence extraction from more complex and realistic hybrid data. We first proposed two types of metrics to evaluate the performance of evidence extraction on hybrid data, i.e. wrong evidence ratio (WER) and missing evidence ratio (MER). Then we utilize a candidate extractor to obtain supporting evidence related to the question. Third, an origin selector is designed to determine from where the question’s answer comes. Finally, the loss of origin selector is fused to the final loss function, which can improve the evidence extraction performance. Experimental results on the TAT-QA dataset showed that our proposed model outperforms the best baseline in terms of F1, WER and MER, which proves the effectiveness of our model.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"42 1","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-227032","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Question Answering based on Tabular and Textual data is a novel task proposed in recent years in the field of QA. At present, most QA systems return answers from a single data form, such as knowledge graphs, tables, texts. However, hybrid data including structured and unstructured data is quite pervasive in real life instead of a single form. Recent research on TAT-QA mainly suffers from the higher error of extracting supporting evidences from both tabular and textual content. This paper aimed to address the problem of failure evidence extraction from more complex and realistic hybrid data. We first proposed two types of metrics to evaluate the performance of evidence extraction on hybrid data, i.e. wrong evidence ratio (WER) and missing evidence ratio (MER). Then we utilize a candidate extractor to obtain supporting evidence related to the question. Third, an origin selector is designed to determine from where the question’s answer comes. Finally, the loss of origin selector is fused to the final loss function, which can improve the evidence extraction performance. Experimental results on the TAT-QA dataset showed that our proposed model outperforms the best baseline in terms of F1, WER and MER, which proves the effectiveness of our model.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
混合数据质量保证中基于多头注意力的候选段选择
基于表格和文本数据的问题解答是近年来在质量保证领域提出的一项新任务。目前,大多数质量保证系统都是从知识图谱、表格、文本等单一数据形式返回答案的。然而,包括结构化数据和非结构化数据在内的混合数据在现实生活中非常普遍,而非单一形式。最近关于 TAT-QA 的研究主要存在从表格和文本内容中提取支持证据的误差较大的问题。本文旨在解决从更复杂、更现实的混合数据中提取失败证据的问题。我们首先提出了两类指标来评估混合数据中证据提取的性能,即错误证据率(WER)和缺失证据率(MER)。然后,我们利用候选提取器来获取与问题相关的支持证据。第三,设计一个来源选择器来确定问题答案的来源。最后,将起源选择器的损失融合到最终损失函数中,从而提高证据提取性能。在 TAT-QA 数据集上的实验结果表明,我们提出的模型在 F1、WER 和 MER 方面都优于最佳基线,这证明了我们模型的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Intelligent Data Analysis
Intelligent Data Analysis 工程技术-计算机:人工智能
CiteScore
2.20
自引率
5.90%
发文量
85
审稿时长
3.3 months
期刊介绍: Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.
期刊最新文献
ELCA: Enhanced boundary location for Chinese named entity recognition via contextual association Identifying relevant features of CSE-CIC-IDS2018 dataset for the development of an intrusion detection system Knowledge graph embedding in a uniform space MeFiNet: Modeling multi-semantic convolution-based feature interactions for CTR prediction Enhancing Adaboost performance in the presence of class-label noise: A comparative study on EEG-based classification of schizophrenic patients and benchmark datasets
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1