Study of Low Resource Language Document Extractive Summarization using Lexical chain and Bidirectional Encoder Representations from Transformers (BERT)

Pranjali Deshpande, Sunita Jahirabadkar
2021 International Conference on Computational Performance Evaluation (ComPE). Published 2021-12-01. DOI: 10.1109/ComPE53109.2021.9751919 (https://doi.org/10.1109/ComPE53109.2021.9751919)
Citations: 1

Abstract

Language is the basic and distinctive tool of human communication. More than 7000 languages exist on our planet. Among these, languages that lack the linguistic resources needed to build statistical NLP applications are known as low resource languages (LRL). Written communication is the most popular medium for humans to express and preserve their thoughts. Advances in technology are bringing the world closer by facilitating remote communication. With the growing use of the internet, new textual information is generated every second, and not all of it is useful. In this context, the task of summarization is gaining importance. A summary can be generated in two ways: extractive and abstractive. In extractive summarization, the key phrases and key sentences of the source document are retained, whereas an abstractive summary is generated by rewriting the key sentences. Summarization becomes more challenging for LRL documents. This paper focuses on experiments carried out on extractive summarization of LRL documents using two approaches: lexical chains and BERT.
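
To make the extractive idea concrete, here is a minimal sketch in Python. It is not the lexical chain or BERT pipeline studied in the paper; it swaps in a plain word-frequency score purely to illustrate that an extractive summary selects source sentences verbatim rather than rewriting them. The function names, the whitespace tokenizer, and the sentence-splitting rule are all assumptions made for illustration.

```python
import re
from collections import Counter

PUNCT = ".,!?;:\"'()।"  # characters stripped from token edges (illustrative choice)


def split_sentences(text):
    # Naive splitter: break after ., !, ? or the danda (।) followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?।])\s+", text) if s.strip()]


def tokenize(sentence):
    # Whitespace tokenization; a real LRL system would need language-specific handling.
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences, unchanged and in source order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    freq = Counter(w for toks in tokenized for w in toks)

    def score(toks):
        # Average document frequency of the sentence's words.
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    doc = "Cats sleep a lot. Cats eat fish. Dogs bark."
    print(extractive_summary(doc, k=2))
```

A lexical chain or BERT-based scorer would replace only the `score` function in this sketch; the selection step, which copies the chosen sentences verbatim, stays the same.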