A nondeterministic approach to infer context free grammar from sequence
Yuan Li, Jim X. Chen
2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), 2014-12-01
DOI: 10.1109/ICCWAMTIP.2014.7073350
Grammar induction has received considerable attention over the past decades because of its practical and theoretical impact on data compression, pattern discovery, and computation theory. A number of grammar induction algorithms for a given sequence have been introduced. Most existing work on learning a grammar from a given sequence takes a deterministic approach, and these deterministic approaches can be categorized as greedy heuristics. Moreover, many different grammars can be learned from the same sequence. The smallest grammar problem was defined to evaluate the grammars that different algorithms learn from a given sequence; this problem has been proved NP-hard. In this work, we introduce a nondeterministic approach to grammar induction for a given sequence, based on a genetic algorithm. We demonstrate that our grammar induction algorithm can identify a smaller grammar than a well-known grammar induction algorithm. The experimental results illustrate that our approach and algorithm are feasible for difficult problems such as identifying patterns in DNA sequences.
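To make the contrast between a greedy heuristic and a nondeterministic search concrete, the following is a minimal Python sketch. It is an illustration only and not the authors' algorithm: it assumes a simple digram-replacement heuristic, measures grammar size as the total number of symbols on all right-hand sides, and uses a plain random-restart search as a stand-in for the genetic algorithm. All function names (`induce`, `greedy_choice`, `random_search`, and so on) are hypothetical.

```python
import random

def replace_digram(seq, pair, symbol):
    """Replace non-overlapping occurrences of pair, scanning left to right."""
    out, i, count = [], 0, 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
            count += 1
        else:
            out.append(seq[i])
            i += 1
    return out, count

def repeated_digrams(seq):
    """Map each digram with at least two non-overlapping occurrences to its count."""
    result = {}
    for pair in set(zip(seq, seq[1:])):
        _, count = replace_digram(seq, pair, None)
        if count >= 2:
            result[pair] = count
    return result

def induce(sequence, choose):
    """Build a grammar by repeatedly replacing the digram selected by choose.

    Terminals are characters; nonterminals are integers. Rule 0 is the start rule.
    """
    seq, rules = list(sequence), {}
    while True:
        candidates = repeated_digrams(seq)
        if not candidates:
            break
        pair = choose(candidates)
        name = len(rules) + 1
        seq, _ = replace_digram(seq, pair, name)
        rules[name] = list(pair)
    rules[0] = seq
    return rules

def grammar_size(rules):
    """Total number of symbols on the right-hand sides of all rules."""
    return sum(len(rhs) for rhs in rules.values())

def expand(rules, symbol=0):
    """Expand a nonterminal back into the list of terminals it derives."""
    out = []
    for s in rules[symbol]:
        out.extend(expand(rules, s) if isinstance(s, int) else [s])
    return out

def greedy_choice(candidates):
    """Deterministic greedy rule: pick the most frequent digram."""
    return max(sorted(candidates, key=repr), key=candidates.get)

def random_search(sequence, trials=50, seed=0):
    """Assumed stand-in for a nondeterministic search: random digram choices,
    keeping the smallest grammar found. Starts from the greedy result."""
    rng = random.Random(seed)
    best = induce(sequence, greedy_choice)
    for _ in range(trials):
        g = induce(sequence, lambda c: rng.choice(sorted(c, key=repr)))
        if grammar_size(g) < grammar_size(best):
            best = g
    return best

if __name__ == "__main__":
    dna = "ACGTACGTTTACGACGTACG"  # made-up example sequence
    greedy = induce(dna, greedy_choice)
    best = random_search(dna)
    print(len(dna), grammar_size(greedy), grammar_size(best))
```

Because `random_search` is seeded with the greedy grammar, it never returns a larger grammar than the greedy heuristic; whether it finds a strictly smaller one depends on the sequence and the number of trials.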