BERT Tokenization and Hybrid-Optimized Deep Recurrent Neural Network for Hindi Document Summarization
Sumalatha Bandari, Vishnu Vardhan Bulusu
International Journal of Fuzzy System Applications, 2022
DOI: 10.4018/ijfsa.313601 (https://doi.org/10.4018/ijfsa.313601)
Abstract
Text summarization produces a concise summary of a document by identifying its most relevant and important sentences. This paper develops an effective approach to summarizing Hindi documents. The deep learning-based summarization system comprises several phases: input data acquisition, tokenization, feature extraction, score generation, and sentence extraction. A deep recurrent neural network (Deep RNN) generates the sentence scores from the significant features, and its weights and learning parameters are updated using the devised coot remora optimization (CRO) algorithm. The CRO-Deep RNN is evaluated on recall-oriented understudy for gisting evaluation (ROUGE), recall, precision, and f-measure, attaining values of 80.896%, 95.700%, 95.051%, and 95.374%, respectively.
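The score-then-extract stage and the f-measure named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: `split_sentences`, `placeholder_score`, `summarize`, and `f_measure` are hypothetical names, whitespace splitting stands in for BERT tokenization, and the frequency-based scorer stands in for the CRO-trained Deep RNN, whose architecture and features the abstract does not specify. The f-measure is assumed to be the usual harmonic mean of precision and recall.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Split on the Devanagari danda (।) and common Latin terminators."""
    parts = re.split(r"[।?!.]+", text)
    return [p.strip() for p in parts if p.strip()]


def placeholder_score(sentence: str, doc_tokens: List[str]) -> float:
    """Hypothetical stand-in scorer: mean document frequency of the
    sentence's tokens. This is NOT the paper's CRO-Deep RNN."""
    tokens = sentence.split()  # whitespace split, not BERT tokenization
    if not tokens:
        return 0.0
    return sum(doc_tokens.count(t) for t in tokens) / len(tokens)


def summarize(
    text: str,
    k: int = 2,
    scorer: Callable[[str, List[str]], float] = placeholder_score,
) -> List[str]:
    """Score every sentence, keep the k best, return them in document order."""
    sentences = split_sentences(text)
    doc_tokens = [t for s in sentences for t in s.split()]
    scores = [scorer(s, doc_tokens) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under that assumed definition, `f_measure(95.051, 95.700)` rounds to 95.374, which agrees with the reported f-measure. A learned scorer could be passed through the `scorer` parameter without changing the extraction step.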