Character string extraction by multi-stage relaxation
H. Hase, Toshiyuki Shinokawa, M. Yoneda, M. Sakai, H. Maruyama
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997-08-18
DOI: 10.1109/ICDAR.1997.619860
Citations: 12
Abstract
We propose an algorithm for extracting character strings from document images. We first obtain the set of eight-connected components of a document image and then apply a relaxation method to these components. The method strengthens or weakens the mutual connections between components depending on the state of their neighboring components. As the relaxation is applied repeatedly, the process proceeds from local connections to global connections, and character strings are finally extracted. We call this process multi-stage relaxation. The algorithm has three advantages: it does not need to nominate candidate character components in the image beforehand, it adapts to character size and font, and it can handle documents containing strings in various orientations. In our experiments, we use a color image of a magazine cover and a monochromatic image of a graph. For the color image, multi-stage relaxation was executed on each binary image obtained by color segmentation. Lastly, we show the experimental results and discuss the effectiveness of the method.
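
The pipeline outlined in the abstract (eight-connected components, then repeated relaxation of the links between them) can be illustrated with a minimal Python sketch. Only the component labeling follows directly from the text. The abstract does not state the update rule, so `relax_links`, `group_by_links`, the `radius`, `rounds`, `rate` and `threshold` parameters, and the two-hop "support" term are all hypothetical stand-ins chosen for illustration, not the authors' method.

```python
import math
from collections import deque


def label_components(img):
    """Return the 8-connected foreground components of a binary image.

    img is a list of rows of 0/1 values; each component is a list of (row, col).
    """
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                # Scan the 3x3 window, i.e. all eight neighbours of (y, x).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def centroid(pixels):
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def relax_links(centers, radius, rounds=5, rate=0.5):
    """Toy relaxation over pairwise link strengths in [0, 1] (assumed rule)."""
    n = len(centers)
    # Assumed initialisation: strength falls linearly to 0 at distance `radius`.
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(centers[i], centers[j])
                p[i][j] = max(0.0, 1.0 - d / radius)
    for _ in range(rounds):
        new = [row[:] for row in p]
        for i in range(n):
            for j in range(i + 1, n):
                # Assumed neighbour support: best two-hop path i -> k -> j.
                support = max((min(p[i][k], p[k][j])
                               for k in range(n) if k != i and k != j),
                              default=0.0)
                evidence = max(p[i][j], support)
                # Evidence above 0.5 strengthens the link, below weakens it.
                value = p[i][j] + rate * (evidence - 0.5)
                new[i][j] = new[j][i] = min(1.0, max(0.0, value))
        p = new
    return p


def group_by_links(p, threshold=0.5):
    """Group component indices joined by links of at least `threshold`."""
    n = len(p)
    assigned = [False] * n
    groups = []
    for start in range(n):
        if assigned[start]:
            continue
        assigned[start] = True
        members, stack = [], [start]
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if not assigned[j] and p[i][j] >= threshold:
                    assigned[j] = True
                    stack.append(j)
        groups.append(sorted(members))
    return groups


# Three blobs along the top row (the first one is diagonal) and one far away.
image = [[0] * 12 for _ in range(12)]
for y, x in [(0, 0), (1, 1), (0, 3), (0, 6), (11, 11)]:
    image[y][x] = 1
components = label_components(image)
links = relax_links([centroid(c) for c in components], radius=8)
groups = group_by_links(links)
print(groups)  # [[0, 1, 2], [3]]
```

In this toy rule, a weak link between two distant components can grow when both are strongly linked to a shared neighbor, which loosely mirrors the local-to-global progression described above. Orientation, character size, and the staging used in the actual algorithm are not modeled here.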