{"title":"学术文献图像处理的规则学习方法","authors":"A. Takasu, S. Satoh, E. Katsura","doi":"10.1109/ICDAR.1995.598985","DOIUrl":null,"url":null,"abstract":"A syntactic rule learning method is presented for analyzing document images and constructing a database from them. This method is used in a digital library system named CyberMagazine, where document images are sequentially converted into database tuples by block segmentation, rough classification, and syntactic analysis. The syntactic rule has an ability to analyze symbols located in two dimensional plane, and has a syntax similar to an ordinal context free grammar except for the concatenation of symbols. In the presented learning method, the syntactic rules are generated from a set of parse trees by decomposing the trees according to non terminal symbols, generalizing the decomposed trees to a syntactic rule, and merging them.","PeriodicalId":273519,"journal":{"name":"Proceedings of 3rd International Conference on Document Analysis and Recognition","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"A rule learning method for academic document image processing\",\"authors\":\"A. Takasu, S. Satoh, E. Katsura\",\"doi\":\"10.1109/ICDAR.1995.598985\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A syntactic rule learning method is presented for analyzing document images and constructing a database from them. This method is used in a digital library system named CyberMagazine, where document images are sequentially converted into database tuples by block segmentation, rough classification, and syntactic analysis. The syntactic rule has an ability to analyze symbols located in two dimensional plane, and has a syntax similar to an ordinal context free grammar except for the concatenation of symbols. In the presented learning method, the syntactic rules are generated from a set of parse trees by decomposing the trees according to non terminal symbols, generalizing the decomposed trees to a syntactic rule, and merging them.\",\"PeriodicalId\":273519,\"journal\":{\"name\":\"Proceedings of 3rd International Conference on Document Analysis and Recognition\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of 3rd International Conference on Document Analysis and Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.1995.598985\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 3rd International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.1995.598985","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A rule learning method for academic document image processing
A syntactic rule learning method is presented for analyzing document images and constructing a database from them. This method is used in a digital library system named CyberMagazine, where document images are sequentially converted into database tuples by block segmentation, rough classification, and syntactic analysis. The syntactic rule has an ability to analyze symbols located in two dimensional plane, and has a syntax similar to an ordinal context free grammar except for the concatenation of symbols. In the presented learning method, the syntactic rules are generated from a set of parse trees by decomposing the trees according to non terminal symbols, generalizing the decomposed trees to a syntactic rule, and merging them.