{"title":"Clustering and classification of document structure-a machine learning approach","authors":"A. Dengel, F. Dubiel","doi":"10.1109/ICDAR.1995.601965","DOIUrl":null,"url":null,"abstract":"We describe a system which is capable of learning the presentation of document logical structures, exemplarily shown for business letters. Presenting a set of instances to the system, it clusters them into structural concepts and induces a concept hierarchy. This concept hierarchy is taken as a source for classifying future input. The paper introduces the different learning steps, describes how the resulting concept hierarchy is applied for logical labeling and reports on the results.","PeriodicalId":273519,"journal":{"name":"Proceedings of 3rd International Conference on Document Analysis and Recognition","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"54","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 3rd International Conference on Document Analysis and Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.1995.601965","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 54
Abstract
We describe a system which is capable of learning the presentation of document logical structures, exemplarily shown for business letters. Presenting a set of instances to the system, it clusters them into structural concepts and induces a concept hierarchy. This concept hierarchy is taken as a source for classifying future input. The paper introduces the different learning steps, describes how the resulting concept hierarchy is applied for logical labeling and reports on the results.