Open Domain Machine Reading Comprehension using InferSent

J. Kim, Chun-Bo Sim, Junyeong Kim, Jun Park, S. Park, S. Jung
Journal: Korean Institute of Smart Media. Publication date: 2022-11-30. Type: Journal Article. DOI: 10.30693/smj.2022.11.10.89 (https://doi.org/10.30693/smj.2022.11.10.89)

Abstract

Open-domain machine reading comprehension extends a reading comprehension model with a paragraph search step, because no paragraph related to the given question is supplied in advance. Document retrieval has been studied extensively with TF-IDF, which is based on word frequency, but its performance drops as the number of documents grows. Paragraph selection has been studied extensively with word-based embeddings, but these do not accurately capture paragraph context, including sentence-level characteristics. Document reading comprehension has been studied extensively with BERT, but training is slow because of the growing number of parameters. To address these three issues, this study uses BM25, which also takes sentence length into account, for document retrieval; InferSent, which captures sentence context, for paragraph selection; and ALBERT, which reduces the number of parameters, for reading comprehension. Together these form the proposed open-domain machine reading comprehension system. Experiments were conducted on the SQuAD1.1 dataset. In document retrieval, BM25 outperformed TF-IDF by 3.2%. In paragraph selection, InferSent outperformed Transformer by 0.9%. Finally, as the number of paragraphs increased in document comprehension, ALBERT scored 0.4% higher in EM and 0.2% higher in F1.
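The length-aware retrieval step described in the abstract can be illustrated with a minimal sketch of the standard Okapi BM25 scoring function. This is not the authors' implementation: the function names `tokenize` and `bm25_scores`, the whitespace tokenizer, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for illustration, and the length being normalized here is simply the token count of each scored text.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    k1 controls term-frequency saturation and b controls how strongly
    scores are normalized by text length (both are assumed defaults).
    """
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Guard against a zero average length when every document is empty.
    avg_len = sum(len(doc) for doc in tokenized) / n_docs or 1.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    query_terms = set(tokenize(query))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        # Longer-than-average texts get a larger penalty term.
        length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "albert reduces parameters",
        "albert reduces parameters and this passage goes on with many unrelated extra words",
        "bm25 ranks documents",
    ]
    for doc, score in zip(docs, bm25_scores("albert parameters", docs)):
        print(f"{score:.3f}  {doc}")
```

In this sketch the first two texts contain the same query terms the same number of times, yet the shorter one ranks higher because of the length normalization. Plain TF-IDF weighting has no comparable built-in length penalty.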