Ye Kyaw Thu, Hlaing Myat New, Hninn Aye Thant, Hay Man Htun, H. Mon, May Myat Myat Khaing, Hsu Pan Oo, Pale Phyu, Nang Aeindray Kyaw, T. Oo, T. Oo, Thet Thet Zin, T. Oo
2021 16th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP). Published 2021-12-21. DOI: 10.1109/iSAI-NLP54397.2021.9678188 (https://doi.org/10.1109/iSAI-NLP54397.2021.9678188)
sylbreak4all: Regular Expressions for Syllable Breaking of Nine Major Ethnic Languages of Myanmar
Unlike many Western languages, the Myanmar language uses a syllabic writing system and does not put spaces between words. Syllable segmentation is therefore a necessary preprocessing step for natural language processing (NLP) tasks such as grapheme-to-phoneme (g2p) conversion, machine translation, and romanization. In this study, sylbreak4all, a syllable segmentation tool based on regular expression (RE) patterns, was developed for nine major ethnic languages of Myanmar: Burmese, Shan, Pa’o, Pwo Kayin, S’gaw Kayin, Rakhine, Myeik, Dawei, and Mon.
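To illustrate the general idea of RE-based syllable breaking, here is a minimal Python sketch for Burmese only. It is not the set of patterns published with sylbreak4all: the rule, the names `BREAK_BEFORE` and `syllable_break`, and the restriction to the basic consonant range U+1000 to U+1021 are assumptions made for illustration. The sketch starts a new syllable at every consonant unless that consonant is stacked under the previous one (preceded by the virama U+1039) or closes the previous syllable (followed by the asat U+103A or by the virama). The other eight languages would need additional character classes and rules.

```python
import re

# Illustrative assumptions, not the patterns from the paper:
CONSONANTS = "\u1000-\u1021"  # Burmese consonants KA..A
ASAT = "\u103a"               # vowel killer; marks a syllable-final consonant
VIRAMA = "\u1039"             # stacking sign; joins two consonants

# A consonant starts a new syllable unless it is stacked (preceded by the
# virama) or it is a syllable-final consonant (followed by asat or virama).
BREAK_BEFORE = re.compile(
    "(?<!" + VIRAMA + ")[" + CONSONANTS + "](?![" + ASAT + VIRAMA + "])"
)


def syllable_break(text):
    """Split Burmese text into syllable-like chunks (hypothetical helper)."""
    # Each match start is a position where a new syllable begins.
    cuts = [m.start() for m in BREAK_BEFORE.finditer(text)]
    bounds = [0] + [c for c in cuts if c > 0] + [len(text)]
    # Slice between consecutive boundaries, skipping empty slices.
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


if __name__ == "__main__":
    # "Myanmar": MA + medial RA + NA + asat, then MA + vowel AA
    word = "\u1019\u103c\u1014\u103a\u1019\u102c"
    print(syllable_break(word))
```

For the word above, the sketch yields two chunks, splitting before the second MA. A stacked sequence such as KA, MA, virama, BHA, AA stays in one chunk, because neither the consonant before the virama nor the one after it is treated as a syllable start.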