Stacked Language Models for an Optimized Next Word Generation

E. O. Aliyu, E. Kotzé
{"title":"Stacked Language Models for an Optimized Next Word Generation","authors":"E. O. Aliyu, E. Kotzé","doi":"10.23919/IST-Africa56635.2022.9845545","DOIUrl":null,"url":null,"abstract":"Next word prediction task is the application of a language model in natural language generation that deals with generating words by repeatedly sampling the next word conditioned on the previous choices. This paper proposes a stacked language model for optimized next word generation using three models. In stage I, the meaning of a word is captured through learn embedding and the structure of the text sequence is encoded using a stacked Long Short Term Memory (LSTM). In stage II, a Bidirectional Long Short Term Memory (Bi-LSTM) stacking on top of the unidirectional LSTM encodes the structure of the text sequences, while in stage III, a two-layer Gated Recurrent Unit (GRU) is used to capture text sequences of data. The proposed system was implemented using Python 3.7, Tensorflow 2.6.0 with Keras and a Nvidia Graphical Processing Unit (GPU). The proposed deep learning models were trained using the Pride and Prejudice corpus from the Project Gutenberg library of ebooks. The evaluation was performed by predicting the next 3 words after considering 10 sets of text sequences. From the experiment carried out, the accuracy of the two-layer LSTM model measured 83%, the accuracy of the Bi-LSTM stacking on unidirectional LSTM model measured 79%, and the accuracy of the two-layer GRU model measured 81%. Regarding predictions, the two-layer LSTM predicted the 10 sequences correctly, the Bi-LSTM stacking on unidirectional LSTM predicted 8 sequences correctly and the two-layer GRU predicted 7 sequences correctly.","PeriodicalId":142887,"journal":{"name":"2022 IST-Africa Conference (IST-Africa)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IST-Africa Conference (IST-Africa)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/IST-Africa56635.2022.9845545","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

The next word prediction task is an application of a language model in natural language generation that generates text by repeatedly sampling the next word conditioned on the previous choices. This paper proposes a stacked language model for optimized next word generation using three models. In stage I, the meaning of a word is captured through a learned embedding and the structure of the text sequence is encoded using a stacked Long Short-Term Memory (LSTM). In stage II, a Bidirectional Long Short-Term Memory (Bi-LSTM) stacked on top of the unidirectional LSTM encodes the structure of the text sequences, while in stage III, a two-layer Gated Recurrent Unit (GRU) is used to capture the text sequences. The proposed system was implemented using Python 3.7, TensorFlow 2.6.0 with Keras, and an Nvidia Graphics Processing Unit (GPU). The proposed deep learning models were trained on the Pride and Prejudice corpus from the Project Gutenberg library of ebooks. Evaluation was performed by predicting the next 3 words for 10 sets of text sequences. In the experiments, the two-layer LSTM model achieved an accuracy of 83%, the Bi-LSTM stacked on the unidirectional LSTM achieved 79%, and the two-layer GRU achieved 81%. Regarding predictions, the two-layer LSTM predicted all 10 sequences correctly, the Bi-LSTM stacked on the unidirectional LSTM predicted 8 sequences correctly, and the two-layer GRU predicted 7 sequences correctly.
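The abstract only names the three architectures, so the following is a minimal Keras sketch of what such stacked models and the repeated next-word sampling loop might look like, not the authors' code. It assumes TensorFlow 2.6 with Keras as stated in the abstract; the vocabulary size, sequence length, embedding size, hidden units, and the helper names (`two_layer_lstm`, `predict_next_words`, etc.) are illustrative assumptions, not values reported in the paper.

```python
# Hypothetical sketch of the three stacked architectures described in the abstract.
# All hyperparameters below are illustrative, not taken from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, seq_len, embed_dim, units = 5000, 10, 100, 128  # assumed values

def two_layer_lstm():
    # Stage I: learned word embedding followed by a stacked (two-layer) LSTM.
    return models.Sequential([
        layers.Embedding(vocab_size, embed_dim, input_length=seq_len),
        layers.LSTM(units, return_sequences=True),
        layers.LSTM(units),
        layers.Dense(vocab_size, activation="softmax"),
    ])

def bilstm_on_lstm():
    # Stage II: a Bi-LSTM stacked on top of a unidirectional LSTM.
    return models.Sequential([
        layers.Embedding(vocab_size, embed_dim, input_length=seq_len),
        layers.LSTM(units, return_sequences=True),
        layers.Bidirectional(layers.LSTM(units)),
        layers.Dense(vocab_size, activation="softmax"),
    ])

def two_layer_gru():
    # Stage III: a two-layer GRU over the same embedded text sequences.
    return models.Sequential([
        layers.Embedding(vocab_size, embed_dim, input_length=seq_len),
        layers.GRU(units, return_sequences=True),
        layers.GRU(units),
        layers.Dense(vocab_size, activation="softmax"),
    ])

def predict_next_words(model, tokenizer, seed_text, n_words=3):
    # Next-word generation as described: repeatedly pick the most likely next
    # word conditioned on the words chosen so far and append it to the text.
    for _ in range(n_words):
        token_ids = tokenizer.texts_to_sequences([seed_text])[0]
        token_ids = tf.keras.preprocessing.sequence.pad_sequences(
            [token_ids], maxlen=seq_len)
        next_id = int(tf.argmax(model.predict(token_ids, verbose=0)[0]))
        next_word = tokenizer.index_word.get(next_id, "")
        seed_text = f"{seed_text} {next_word}".strip()
    return seed_text
```

In an evaluation like the one reported (predicting the next 3 words for 10 seed sequences), each of the three models would be trained on tokenized sequences from the Pride and Prejudice corpus and then passed through `predict_next_words` with `n_words=3`.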