{"title":"Text classification based on LSTM and attention","authors":"Xuemei Bai","doi":"10.1109/ICDIM.2018.8847061","DOIUrl":null,"url":null,"abstract":"An improved text classification method combining long short-term memory (LSTM) units and attention mechanism is proposed in this paper. First, the preliminary features are extracted from the convolution layer. Then, LSTM stores context history information with three gate structures - input gates, forget gates, and output gates. Attention mechanism generates semantic code containing the attention probability distribution and highlights the effect of input on the output. This mixed system model optimizes traditional models to represent features more accurately. The simulation shows that the proposed algorithm in this paper outperformed the RNN algorithm and the CNN algorithm which have long-distance dependency problem. Besides, the results also prove that the proposed algorithm works better than the LSTM algorithm by highlighting the impact of critical input in LSTM on the model.","PeriodicalId":120884,"journal":{"name":"2018 Thirteenth International Conference on Digital Information Management (ICDIM)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Thirteenth International Conference on Digital Information Management (ICDIM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDIM.2018.8847061","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 26
Abstract
An improved text classification method combining long short-term memory (LSTM) units and an attention mechanism is proposed in this paper. First, preliminary features are extracted by a convolution layer. Then, the LSTM stores context history information using its three gate structures: input gates, forget gates, and output gates. The attention mechanism generates a semantic code containing the attention probability distribution, highlighting the effect of each input on the output. This hybrid model improves on traditional models by representing features more accurately. Simulations show that the proposed algorithm outperforms the RNN and CNN algorithms, which suffer from the long-distance dependency problem. Moreover, the results show that the proposed algorithm works better than a plain LSTM by highlighting the impact of critical inputs on the model.
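The abstract only names the pipeline's components (convolution for preliminary features, a gated LSTM over the resulting sequence, and attention producing a weighted semantic code). A minimal PyTorch sketch of such a pipeline might look like the following; all layer sizes, the kernel width, the pooling-free design, and the additive attention scoring are assumptions, since the paper's exact configuration is not given in the abstract.

```python
# Hypothetical sketch of a Conv -> LSTM -> attention text classifier in the
# spirit of the abstract. Hyperparameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvLSTMAttention(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, conv_channels=128,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Convolution layer extracts preliminary local n-gram features.
        self.conv = nn.Conv1d(embed_dim, conv_channels,
                              kernel_size=3, padding=1)
        # LSTM stores context history via its input/forget/output gates.
        self.lstm = nn.LSTM(conv_channels, hidden_dim, batch_first=True)
        # Additive attention scores each time step; softmax turns the
        # scores into the attention probability distribution.
        self.attn = nn.Linear(hidden_dim, 1)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embedding(token_ids)               # (batch, seq_len, embed)
        x = F.relu(self.conv(x.transpose(1, 2)))    # (batch, channels, seq_len)
        outputs, _ = self.lstm(x.transpose(1, 2))   # (batch, seq_len, hidden)
        scores = self.attn(torch.tanh(outputs))     # (batch, seq_len, 1)
        weights = F.softmax(scores, dim=1)
        # The attention-weighted sum is the "semantic code" that
        # highlights the impact of critical inputs on the output.
        context = (weights * outputs).sum(dim=1)    # (batch, hidden)
        return self.fc(context)                     # class logits

# Usage sketch:
# model = ConvLSTMAttention(vocab_size=10000)
# logits = model(torch.randint(0, 10000, (4, 50)))  # 4 sequences of 50 tokens
```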