Lexical-index-based Japanese syntax matching
Wuying Liu, Lin Wang, Xing Zhang
2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), 2016-08-01
DOI: 10.1109/FSKD.2016.7603281 (https://doi.org/10.1109/FSKD.2016.7603281)
Abstract
Syntax matching is a challenging, fundamental problem, and its algorithms are widely applicable in natural language processing. This paper addresses how to efficiently retrieve, from a large set of Japanese sentences, the sentence whose syntactic structure is most similar to that of a given Japanese sentence. Based on our hypothesis that hiragana tokens identify Japanese syntax, we design a novel lexical index data structure, the hiratoken-sentence index (HSI), and propose an HSI-based Japanese syntax matching (HSIJSM) algorithm. Supported by the HSI, the HSIJSM algorithm approximates the syntactic similarity between two Japanese sentences by quickly computing their formal similarity. Experimental results show that the HSIJSM algorithm achieves favorable performance at greatly reduced time cost.
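The abstract does not specify how the HSI is built or how formal similarity is computed, so the following is only a rough sketch of the general idea, not the authors' HSIJSM algorithm. It assumes that a "hiratoken" is a maximal run of hiragana characters, that the index maps each hiratoken to the ids of the sentences containing it, and that formal similarity is the Jaccard overlap of hiratoken sets. The names `hiratokens`, `build_index`, and `best_match` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a hiratoken is a maximal run of hiragana (U+3041 to U+309F).
HIRAGANA_RUN = re.compile(r"[\u3041-\u309f]+")


def hiratokens(sentence):
    """Return the hiragana runs of a sentence, in order of appearance."""
    return HIRAGANA_RUN.findall(sentence)


def build_index(sentences):
    """Build an inverted index: hiratoken -> set of sentence ids."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for token in hiratokens(sentence):
            index[token].add(sid)
    return index


def best_match(query, sentences, index):
    """Return (sentence id, score) of the most similar indexed sentence.

    Only sentences sharing at least one hiratoken with the query are
    scored, which is where the index saves time over a full scan.
    The score is an assumed Jaccard overlap of hiratoken sets.
    """
    query_tokens = set(hiratokens(query))
    shared = defaultdict(int)
    for token in query_tokens:
        for sid in index.get(token, ()):
            shared[sid] += 1

    best_id, best_score = None, 0.0
    for sid in sorted(shared):  # sorted so ties go to the lowest id
        candidate_tokens = set(hiratokens(sentences[sid]))
        score = shared[sid] / len(query_tokens | candidate_tokens)
        if score > best_score:
            best_id, best_score = sid, score
    return best_id, best_score


corpus = ["私は学生です", "猫が魚を食べる", "彼は先生です"]
hsi = build_index(corpus)
match = best_match("犬が肉を食べる", corpus, hsi)
```

Here the query shares the hiragana pattern が / を / べる with the second corpus sentence even though every content word differs, which is the intuition behind using hiragana tokens as a proxy for syntactic structure. A set-based overlap ignores token order; an order-sensitive measure would be a natural alternative under the same assumptions.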