Theerapat Lapjaturapit, Kobkrit Viriyayudhakom, T. Theeramunkong
Multi-Candidate Word Segmentation using Bi-directional LSTM Neural Networks
DOI: 10.1109/ICESIT-ICICTES.2018.8442053
Published: 2018-05-07
Citations: 11
Abstract
Most existing word segmentation methods output a single segmentation. This paper analyzes word segmentation performance when more than one solution is taken into account. To this end, a deep neural network with multiple thresholds is applied to generate multiple segmentation candidates. As a test bed, bidirectional long short-term memory (BiLSTM) units are used with eleven contexts in a deep neural network. Three performance measures (recall, precision, and F-measure) are plotted against various thresholds for both boundary-level and word-level evaluation. Experiments show that multi-candidate word segmentation can increase recall while maintaining precision.
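The idea of thresholding to obtain several candidates, and of scoring them at boundary level and word level, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the network emits one boundary probability per gap between characters, and the function names (`segment`, `multi_candidate`, `boundaries`, `spans`, `prf`), the example probabilities, and the threshold values are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def segment(text: str, boundary_probs: Sequence[float], threshold: float) -> List[str]:
    """Cut after character i whenever boundary_probs[i] >= threshold.

    Assumption: boundary_probs has one score per gap between characters,
    i.e. len(text) - 1 entries, as a BiLSTM tagger might produce.
    """
    if len(boundary_probs) != len(text) - 1:
        raise ValueError("expected one probability per gap between characters")
    words, start = [], 0
    for i, p in enumerate(boundary_probs):
        if p >= threshold:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


def multi_candidate(text: str, boundary_probs: Sequence[float],
                    thresholds: Sequence[float]) -> List[List[str]]:
    """One candidate per threshold, with duplicate segmentations dropped."""
    seen, candidates = set(), []
    for t in thresholds:
        cand = tuple(segment(text, boundary_probs, t))
        if cand not in seen:
            seen.add(cand)
            candidates.append(list(cand))
    return candidates


def boundaries(words: Sequence[str]) -> Set[int]:
    """Internal boundary positions as character offsets (boundary level)."""
    positions, offset = set(), 0
    for w in words[:-1]:
        offset += len(w)
        positions.add(offset)
    return positions


def spans(words: Sequence[str]) -> Set[Tuple[int, int]]:
    """(start, end) character spans of each word (word level)."""
    result, offset = set(), 0
    for w in words:
        result.add((offset, offset + len(w)))
        offset += len(w)
    return result


def prf(predicted: Set, gold: Set) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure between two sets of items."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical example: scores for the 8 gaps in "thecatsat".
    text = "thecatsat"
    gold = ["the", "cat", "sat"]
    probs = [0.1, 0.2, 0.9, 0.1, 0.4, 0.6, 0.3, 0.1]
    for t in (0.35, 0.5, 0.7):
        cand = segment(text, probs, t)
        print(t, cand,
              "boundary:", prf(boundaries(cand), boundaries(gold)),
              "word:", prf(spans(cand), spans(gold)))
```

In this sketch a lower threshold yields more boundaries, so lowering it can only keep or raise boundary-level recall, usually at some cost in precision; a higher threshold does the reverse. Plotting the three measures against the threshold, as the abstract describes, makes that trade-off visible.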