{"title":"用卷积神经网络读取自然场景图像中的数字","authors":"Qiang Guo, Jun Lei, D. Tu, Guohui Li","doi":"10.1109/SPAC.2014.6982655","DOIUrl":null,"url":null,"abstract":"Reading text from natural images is a hard computer vision task. We present a method for applying deep convolutional neural networks to recognize numbers in natural scene images. In this paper, we proposed a noval method to eliminating the need of explicit segmentation when deal with multi-digit number recognition in natural scene images. Convolution Neural Network(CNN) requires fixed dimensional input while number images contain unknown amount of digits. Our method integrats CNN with probabilistic graphical model to deal with the problem. We use hidden Markov model(HMM) to model the image and use CNN to model digits appearance. This method combines the advantages of both the two models and make them fit to the problem. By using this method we can perform the training and recognition procedure both at word level. There is no explicit segmentation operation at all which save lots of labour for sophisticated segmentation algorithm design or finegrained character labeling. Experiments show that deep CNN can dramaticly improve the performance compared with using Gaussian Mixture model as the digit model. We obtaied competitive results on the street view house number(SVHN) dataset.","PeriodicalId":326246,"journal":{"name":"Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Reading numbers in natural scene images with convolutional neural networks\",\"authors\":\"Qiang Guo, Jun Lei, D. Tu, Guohui Li\",\"doi\":\"10.1109/SPAC.2014.6982655\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reading text from natural images is a hard computer vision task. We present a method for applying deep convolutional neural networks to recognize numbers in natural scene images. In this paper, we proposed a noval method to eliminating the need of explicit segmentation when deal with multi-digit number recognition in natural scene images. Convolution Neural Network(CNN) requires fixed dimensional input while number images contain unknown amount of digits. Our method integrats CNN with probabilistic graphical model to deal with the problem. We use hidden Markov model(HMM) to model the image and use CNN to model digits appearance. This method combines the advantages of both the two models and make them fit to the problem. By using this method we can perform the training and recognition procedure both at word level. There is no explicit segmentation operation at all which save lots of labour for sophisticated segmentation algorithm design or finegrained character labeling. Experiments show that deep CNN can dramaticly improve the performance compared with using Gaussian Mixture model as the digit model. We obtaied competitive results on the street view house number(SVHN) dataset.\",\"PeriodicalId\":326246,\"journal\":{\"name\":\"Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SPAC.2014.6982655\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPAC.2014.6982655","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Reading numbers in natural scene images with convolutional neural networks
Reading text from natural images is a hard computer vision task. We present a method for applying deep convolutional neural networks to recognize numbers in natural scene images. In this paper, we proposed a noval method to eliminating the need of explicit segmentation when deal with multi-digit number recognition in natural scene images. Convolution Neural Network(CNN) requires fixed dimensional input while number images contain unknown amount of digits. Our method integrats CNN with probabilistic graphical model to deal with the problem. We use hidden Markov model(HMM) to model the image and use CNN to model digits appearance. This method combines the advantages of both the two models and make them fit to the problem. By using this method we can perform the training and recognition procedure both at word level. There is no explicit segmentation operation at all which save lots of labour for sophisticated segmentation algorithm design or finegrained character labeling. Experiments show that deep CNN can dramaticly improve the performance compared with using Gaussian Mixture model as the digit model. We obtaied competitive results on the street view house number(SVHN) dataset.