Automated detection and segmentation of table of contents page and index pages from document images
Sekhar Mandal, S. Chowdhury, A. Das, B. Chanda
12th International Conference on Image Analysis and Processing, 2003. Proceedings.
Published: 2003-09-17
DOI: 10.1109/ICIAP.2003.1234052 (https://doi.org/10.1109/ICIAP.2003.1234052)
Citations: 11
Abstract
Identifying and segmenting the table of contents (TOC) and index pages is a necessary step in building a digital library. A digital document library provides a cheap, flexible, and non-labour-intensive way to store, represent, and manage paper documents in electronic form, which facilitates indexing, viewing, printing, and extracting the intended portions. Information extracted from the TOC and index pages is used in a document database for effective retrieval of the required information. We present a fully automatic method for identifying and segmenting TOC and index pages in a scanned document.