Language Modeling Using Part-of-speech and Long Short-Term Memory Networks

Sanaz Saki Norouzi, A. Akbari, B. Nasersharif
{"title":"使用词性和长短期记忆网络的语言建模","authors":"Sanaz Saki Norouzi, A. Akbari, B. Nasersharif","doi":"10.1109/ICCKE48569.2019.8964806","DOIUrl":null,"url":null,"abstract":"In recent years, neural networks have been widely used for language modeling in different tasks of natural language processing. Results show that long short-term memory (LSTM) neural networks are appropriate for language modeling due to their ability to process long sequences. Furthermore, many studies are shown that extra information improve language models (LMs) performance. In this research, we propose parallel structures for incorporating part-of-speech tags into language modeling task using both the unidirectional and bidirectional type of LSTMs. Words and part-of-speech tags are given to the network as parallel inputs. In this way, to concatenate these two paths, two different structures are proposed according to the type of network used in the parallel part. We analyze the efficiency on Penn Treebank (PTB) dataset using perplexity measure. These two proposed structures show improvements in comparison to the baseline models. Not only does the bidirectional LSTM method gain the lowest perplexity, but it also has the lowest training parameters among our proposed methods. The perplexity of proposed structures has reduced 1.5% and %13 for unidirectional and bidirectional LSTMs, respectively.","PeriodicalId":6685,"journal":{"name":"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)","volume":"12 1","pages":"182-187"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Language Modeling Using Part-of-speech and Long Short-Term Memory Networks\",\"authors\":\"Sanaz Saki Norouzi, A. Akbari, B. Nasersharif\",\"doi\":\"10.1109/ICCKE48569.2019.8964806\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, neural networks have been widely used for language modeling in different tasks of natural language processing. Results show that long short-term memory (LSTM) neural networks are appropriate for language modeling due to their ability to process long sequences. Furthermore, many studies are shown that extra information improve language models (LMs) performance. In this research, we propose parallel structures for incorporating part-of-speech tags into language modeling task using both the unidirectional and bidirectional type of LSTMs. Words and part-of-speech tags are given to the network as parallel inputs. In this way, to concatenate these two paths, two different structures are proposed according to the type of network used in the parallel part. We analyze the efficiency on Penn Treebank (PTB) dataset using perplexity measure. These two proposed structures show improvements in comparison to the baseline models. Not only does the bidirectional LSTM method gain the lowest perplexity, but it also has the lowest training parameters among our proposed methods. 
The perplexity of proposed structures has reduced 1.5% and %13 for unidirectional and bidirectional LSTMs, respectively.\",\"PeriodicalId\":6685,\"journal\":{\"name\":\"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)\",\"volume\":\"12 1\",\"pages\":\"182-187\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCKE48569.2019.8964806\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCKE48569.2019.8964806","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

In recent years, neural networks have been widely used for language modeling in various natural language processing tasks. Results show that long short-term memory (LSTM) neural networks are well suited to language modeling due to their ability to process long sequences. Furthermore, many studies have shown that extra information improves language model (LM) performance. In this research, we propose parallel structures for incorporating part-of-speech tags into the language modeling task, using both unidirectional and bidirectional LSTMs. Words and part-of-speech tags are given to the network as parallel inputs. To combine the two paths, two different structures are proposed, according to the type of network used in the parallel part. We analyze efficiency on the Penn Treebank (PTB) dataset using the perplexity measure. Both proposed structures improve on the baseline models. Not only does the bidirectional LSTM method achieve the lowest perplexity, it also has the fewest training parameters among our proposed methods. The perplexity of the proposed structures is reduced by 1.5% and 13% for the unidirectional and bidirectional LSTMs, respectively.
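The paper itself publishes no code here, but the architecture the abstract describes (two parallel embedding-plus-LSTM paths, one over words and one over POS tags, concatenated before the output projection) can be sketched as follows. This is a minimal illustration assuming PyTorch; the class name ParallelPosLSTM, all layer sizes, and the choice to concatenate after the LSTM layers are assumptions for illustration, not the paper's published configuration.

    # Hypothetical sketch of the parallel word/POS LSTM language model.
    # Layer sizes and the fusion point are assumptions, not the paper's setup.
    import torch
    import torch.nn as nn

    class ParallelPosLSTM(nn.Module):
        def __init__(self, vocab_size, pos_size, emb_dim=200,
                     hidden_dim=200, bidirectional=False):
            super().__init__()
            # Two parallel input paths: one for words, one for POS tags.
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            self.pos_emb = nn.Embedding(pos_size, emb_dim // 2)
            self.word_lstm = nn.LSTM(emb_dim, hidden_dim,
                                     batch_first=True,
                                     bidirectional=bidirectional)
            self.pos_lstm = nn.LSTM(emb_dim // 2, hidden_dim // 2,
                                    batch_first=True,
                                    bidirectional=bidirectional)
            num_dirs = 2 if bidirectional else 1
            # The two paths are concatenated before the softmax projection.
            self.out = nn.Linear(num_dirs * (hidden_dim + hidden_dim // 2),
                                 vocab_size)

        def forward(self, words, pos_tags):
            # words, pos_tags: (batch, seq_len) index tensors
            w, _ = self.word_lstm(self.word_emb(words))
            p, _ = self.pos_lstm(self.pos_emb(pos_tags))
            h = torch.cat([w, p], dim=-1)  # concatenate the parallel paths
            return self.out(h)             # (batch, seq_len, vocab_size) logits

    # Usage with typical PTB-like sizes (~10k word vocabulary, 45 POS tags):
    model = ParallelPosLSTM(vocab_size=10000, pos_size=45, bidirectional=True)
    logits = model(torch.randint(0, 10000, (8, 35)),
                   torch.randint(0, 45, (8, 35)))

Perplexity, the evaluation measure used on PTB, is the exponential of the mean per-token cross-entropy between these logits and the next-word targets.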