M-ary reversible contrast mapping in reversible watermarking with optimal distortion control
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776269
S. Maity, H. Maity
A generalized form of reversible contrast mapping (RCM), analogous to M-ary modulation in communication, is developed here for reversible watermarking in digital images. An optimized distortion-control framework for the M-ary scheme is then used to improve data-hiding capacity while meeting the embedding distortion constraint. Simulation results show that combining different M-ary approaches, using different points on the different RCM transformation functions, achieves a better embedding-rate/visual-quality/security trade-off for the hidden information than the existing RCM, difference expansion (DE), and prediction error expansion (PEE) methods under over-embedding. Numerically, the proposed method achieves an average improvement of 20% in visual quality and 35% in the security of the hidden data at a 1 bpp embedding rate compared with existing PEE work. This effectiveness is demonstrated with a number of simulation results.
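The abstract assumes familiarity with the underlying RCM pair transform. For reference, here is a minimal sketch of the classic binary RCM of Coltuc and Chassery, which this paper generalizes to M-ary; the full scheme additionally flags transformed pairs via LSBs and embeds payload bits, which the sketch omits.

```python
def rcm_forward(x, y, L=255):
    """Classic binary RCM pixel-pair transform (x, y) -> (2x - y, 2y - x).
    Returns the transformed pair and a flag saying whether the result stayed
    inside [0, L] (only such pairs are transformable)."""
    xp, yp = 2 * x - y, 2 * y - x
    if 0 <= xp <= L and 0 <= yp <= L:
        return (xp, yp), True
    return (x, y), False

def rcm_inverse(xp, yp):
    """Exact integer inverse: x = (2x' + y') / 3, y = (x' + 2y') / 3,
    exact because 2x' + y' = 3x and x' + 2y' = 3y."""
    return (2 * xp + yp) // 3, (xp + 2 * yp) // 3

# Round trip on a transformable pair:
pair, ok = rcm_forward(100, 120)
assert ok and rcm_inverse(*pair) == (100, 120)
```

The M-ary generalization studied in the paper replaces this single transform with a family of transformation functions, trading embedding rate against the distortion each transform introduces.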
{"title":"M-ary reversible contrast mapping in reversible watermarking with optimal distortion control","authors":"S. Maity, H. Maity","doi":"10.1109/NCVPRIPG.2013.6776269","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776269","url":null,"abstract":"A generalized form of reversible contrast mapping (RCM), analogous to M-ary modulation in communication, is developed here for reversible watermarking in digital images. Then an optimized distortion control framework in M-ary scheme is considered to improve data hiding capacity while meeting the embedding distortion constraint. Simulation results show that the combination of different M-ary approaches, using the different points representing the different RCM transformation functions, outperforms the embedding rate-visual quality-security of the hidden information compared to the existing RCM, difference expansion (DE) and prediction error expansion (PEE) methods during over embedding. Numerical results show that an average of 20% improvement in visual quality, 35% improvement in security of the hidden data at 1 bpp embedding rate is achieved for the proposed method compared to the existing PEE works. All these effectiveness are demonstrated with a number of simulation results.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126785954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DASM: An open source active shape model for automatic registration of objects
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776244
David Macurak, Amrutha Sethuram, K. Ricanek, B. Barbour
The main contribution of this paper is to introduce DASM (Dynamic Active Shape Models), open-source software for the automatic detection of fiducial points on objects for subsequent registration, to the research community. DASM leverages the substantial work behind STASM, a well-known software library for automatic detection of points on faces. In this work we compare DASM to other well-known techniques for automatic face registration: Active Appearance Models (AAM) and Constrained Local Models (CLM). Further, we show that DASM outperforms these techniques on per-registration-point error, average object error, and cumulative error distribution. As a follow-on, we show that DASM outperforms STASM v3.1 on model training and registration by leveraging open-source libraries for computer vision (OpenCV v2.4) and threading/parallelism (OpenMP). The improvements in speed and performance of DASM allow for extremely dense registration, 252 points on the face, in video applications.
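For readers unfamiliar with the ASM machinery that DASM and STASM share, the shape-constraint step at the heart of every ASM iteration looks roughly like the sketch below. This is generic ASM with illustrative names, not DASM's actual implementation.

```python
import numpy as np

def constrain_shape(shape, mean_shape, eigvecs, eigvals, k=3.0):
    """Project a candidate shape vector (2N landmark coordinates) onto the PCA
    shape space and clamp each coefficient b_i to +/- k * sqrt(lambda_i),
    the usual ASM plausibility bound."""
    b = eigvecs.T @ (shape - mean_shape)   # shape-space coefficients
    limit = k * np.sqrt(eigvals)
    b = np.clip(b, -limit, limit)          # keep the shape statistically plausible
    return mean_shape + eigvecs @ b        # back to image coordinates
```

In a full fit loop this step alternates with a per-landmark search along the contour normal for the best local profile match, which is the part DASM parallelizes with OpenMP.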
{"title":"DASM: An open source active shape model for automatic registration of objects","authors":"David Macurak, Amrutha Sethuram, K. Ricanek, B. Barbour","doi":"10.1109/NCVPRIPG.2013.6776244","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776244","url":null,"abstract":"The main contribution of this paper is to introduce DASM - Dynamic Active Shape Models, an open source software for the automatic detection of fiducial points on objects for subsequent registration, to the research community. DASM leverages the tremendous work of STASM, a well known software library for automatic detection of points on faces. In this work we compare DASM to other well-known techniques for automatic face registration: Active Appearance Models (AAM) and Constrained Local Models (CLM). Further we show that DASM outperforms these techniques on a per registration-point error, average object error, and on cumulative error distribution. As a follow on, we show that DASM outperforms STASM v3.1 on model training and registration by leveraging open source libraries for computer vision (OpenCV v2.4) and threading/parallelism (OpenMP). The improvements in speed and performance of DASM allows for extremely dense registration, 252 points on the face, in video applications.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130675652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysing gait sequences using Latent Dirichlet Allocation for certain human actions
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776173
A. DeepakN., R. Hariharan, U. Sinha
Conventional human action recognition methods generate coarse clusters of the input videos (approximately 2-4 clusters) with little information about how the clusters are formed. We address this problem with a Latent Dirichlet Allocation (LDA) algorithm that transforms the extracted gait sequences from the gait domain into documents and words in the text domain. These words are then used to group the input documents into finer clusters (approximately 8-9). In this approach, we attempt to use gait analysis to recognize human actions, where the gait analysis requires some motion in the lower parts of the human body, such as the legs. As the videos of the Weizmann dataset contain actions that exhibit such movements, we are able to use these motion parameters to recognize certain human actions. Experiments on the Weizmann dataset suggest that the proposed LDA algorithm is an efficient method for recognizing human actions in video streams.
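The document/word analogy maps directly onto a standard LDA implementation. A minimal sketch, assuming each video has already been encoded as a bag-of-words count vector over a codebook of quantized gait features; the codebook size, cluster count, and random data below are illustrative.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical input: 30 videos ("documents"), each a count vector over a
# 100-entry codebook of quantized gait features ("words").
rng = np.random.default_rng(0)
doc_word_counts = rng.integers(0, 5, size=(30, 100))

lda = LatentDirichletAllocation(n_components=9, random_state=0)  # ~8-9 clusters
theta = lda.fit_transform(doc_word_counts)  # per-video topic distributions
clusters = theta.argmax(axis=1)             # assign each video to its dominant topic
```

Each topic then plays the role of one fine-grained action cluster.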
{"title":"Analysing gait sequences using Latent Dirichlet Allocation for certain human actions","authors":"A. DeepakN., R. Hariharan, U. Sinha","doi":"10.1109/NCVPRIPG.2013.6776173","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776173","url":null,"abstract":"Conventional human action recognition algorithm and method generate coarse clusters of input videos approximately 2-4 clusters with less information regarding the cluster generation. This problem is solved by proposing Latent Dirichlet Allocation algorithm that transforms the extracted gait sequences in gait domain into documents-words in text domain. These words are then used to group the input documents into finer clusters approximately 8-9 clusters. In this approach, we have made an attempt to use gait analysis in recognizing human actions, where the gait analysis requires to have some motion in lower parts of the human body like leg. As the videos of Weizmann dataset have some actions that exhibits these movements, we are able use these motion parameters to recognize certain human actions. Experiments on Weizmann dataset suggest that the proposed Latent Dirichlet Allocation algorithm is an efficient method for recognizing human actions from the video streams.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131159878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hybrid method for object identification and event detection in video
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776223
P. KrishnaKumar, L. Parameswaran
Video event detection (VED) is a challenging task, especially with a large variety of objects in the environment. Although numerous algorithms exist for event detection, most are unsuitable for typical consumer purposes. A hybrid method for detecting and identifying moving objects by their color and spatial information is presented in this paper. In tracking multiple moving objects, the system makes use of the motion of changed regions. In this approach, the object detector first looks for the existence of objects that have already been registered. Control then passes to an event detector, which waits for an event to happen: object placement or object removal. The object detector becomes active only if an event is detected. A simple training procedure using a single color camera in the HSV color space makes it suitable as a consumer application. The proposed model has proved robust in various indoor environments and with different types of background scenes. The experimental results demonstrate the feasibility of the proposed method.
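The placement/removal decision can be illustrated with a simple background-differencing sketch in OpenCV. The texture-variance test below is a stand-in heuristic of our own for deciding between the two event types; it is not the paper's actual color/spatial test, and all thresholds are assumptions.

```python
import cv2

def detect_events(background_bgr, frame_bgr, min_area=500):
    """Toy placement/removal detector: find regions that changed against a
    stored background, then call a region 'placement' if it became more
    textured than the background there, 'removal' otherwise."""
    bg_hsv = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2HSV)
    fr_hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    diff = cv2.absdiff(bg_hsv, fr_hsv)[:, :, 2]       # value-channel change
    _, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    events = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue                                   # ignore small noise blobs
        x, y, w, h = cv2.boundingRect(c)
        bg_tex = cv2.Laplacian(background_bgr[y:y+h, x:x+w], cv2.CV_64F).var()
        fr_tex = cv2.Laplacian(frame_bgr[y:y+h, x:x+w], cv2.CV_64F).var()
        events.append(("placement" if fr_tex > bg_tex else "removal",
                       (x, y, w, h)))
    return events
```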
{"title":"A hybrid method for object identification and event detection in video","authors":"P. KrishnaKumar, L. Parameswaran","doi":"10.1109/NCVPRIPG.2013.6776223","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776223","url":null,"abstract":"Video event detection (VED) is a challenging task especially with a large variety of objects in the environment. Even though there exist numerous algorithms for event detection, most of them are unsuitable for a typical consumer purpose. A hybrid method for detecting and identifying the moving objects by their color and spatial information is presented in this paper. In tracking multiple moving objects, the system makes use of motion of changed regions. In this approach, first, the object detector will look for the existence of objects that have already been registered. Then the control is passed on to an event detector which will wait for an event to happen which can be object placement or object removal. The object detector becomes active only if any event is detected. Simple training procedure using a single color camera in HSV color space makes it a consumer application. The proposed model has proved to be robust in various indoor environments and different types of background scenes. The experimental results prove the feasibility of the proposed method.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128443381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Face and facial expression recognition using Extended Locality Preserving Projection
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776156
Deshna Jain, G. Shikkenawis, S. Mitra, S. K. Parulkar
Face images of a person taken with varying expressions, orientations, and lighting conditions are expected to remain close to each other, even under a mathematical transformation. Unlike humans, machines find it difficult to recognize such high-dimensional face images as faces of the same person. Many existing face recognition systems therefore explicitly reduce the dimensionality before performing the recognition task. However, it is not guaranteed that varying faces of a single person remain close in the lower-dimensional space. A dimensionality reduction technique such as Extended Locality Preserving Projection (ELPP) not only reduces the dimension of the input data remarkably but also preserves locality, using neighbourhood information, in the projected space. This paper presents a face recognition system in which ELPP is used to reduce the dimension of face images, and the ELPP coefficients serve as features for the recognition classifier. Specifically, two classifiers are used: a Naive Bayes classifier and a Support Vector Machine. Face recognition results on different data sets are highly impressive, and the facial expression results are encouraging. Experiments have also been carried out with a supervised version of ELPP (ESLPP).
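ELPP extends Locality Preserving Projection (LPP), and a minimal LPP sketch conveys the mechanism the paper builds on. Note that for raw face images the matrix X D Xᵀ is usually singular, so a PCA pre-projection is applied in practice; the neighbour count, kernel width, and output dimension below are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_neighbors=5, t=1.0, dim=2):
    """Minimal Locality Preserving Projection. X is (features, samples).
    Builds a heat-kernel neighbourhood graph, then solves the generalized
    eigenproblem X L X^T a = lambda X D X^T a for the smallest eigenvalues."""
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]          # skip self at index 0
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)                # heat-kernel weights
    W = np.maximum(W, W.T)                                   # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                                # graph Laplacian
    _, vecs = eigh(X @ L @ X.T, X @ D @ X.T)                 # generalized eigenproblem
    return vecs[:, :dim]                                     # projection matrix A

# Usage: Y = lpp(X).T @ X gives the low-dimensional coefficients used as features.
```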
{"title":"Face and facial expression recognition using Extended Locality Preserving Projection","authors":"Deshna Jain, G. Shikkenawis, S. Mitra, S. K. Parulkar","doi":"10.1109/NCVPRIPG.2013.6776156","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776156","url":null,"abstract":"Face images of a person taken in varying expressions, orientations, lighting conditions are expected to be close to each other even under any mathematical transformation. These high dimensional face images are difficult to be recognized as faces of same person by machines in contrast to the humans. Many of the existing face recognition systems thus explicitly reduce the dimensions before performing recognition task. However, it is not guaranteed that varying faces of a single person could still be close in the lower dimensional space. Dimensionality reduction technique such as Extended Locality Preserving Projection (ELPP) not only reduces the dimension of the input data remarkably but also preserves the locality using neighbourhood information in the projected space. This paper deals with a face recognition system where ELPP is used to reduce the dimension of face images and hence uses ELPP coefficients as features to the classifier for recognition. In specific, two classifiers namely Naive Bayes classifier and Support Vector Machine are used. Results of face recognition of different data sets are highly impressive and at the same time results of facial expressions are encouraging. Experiments have also been carried out by taking a supervised version of ELPP (ESLPP).","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133183055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical image analysis: an attempt for mammogram classification using texture based association rule mining
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776208
D. Deshpande, A. Rajurkar, R. Manthalkar
Breast cancer, the most common type of cancer in women, is one of the leading causes of cancer deaths. Early detection is therefore the major concern in cancer treatment. The most common screening test, mammography, is useful for early detection. It has been shown that consecutive reading of mammograms can raise the number of cancers detected, but this approach is not monetarily viable. There is therefore a significant need for computer-aided detection systems that produce the intended results and assist medical staff in accurate diagnosis. In this research we attempt to build a classification system for mammograms using association rule mining based on texture features. The proposed system uses the most relevant GLCM-based texture features of mammograms. A new method is proposed to form associations among different texture features by judging the importance of each feature; the resulting associations can be used to classify mammograms. Experiments are carried out using the MIAS image database. The performance of the proposed method is compared with the standard Apriori algorithm, and the proposed method is found to perform better because it reduces repeated scanning of the database, resulting in less computation time. We also investigate the use of association rules in medical image analysis for the problem of mammogram classification.
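The GLCM feature-extraction step maps naturally onto scikit-image's co-occurrence utilities. A minimal sketch; the distances, angles, and chosen property set are assumptions, since the abstract does not list the exact features used.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_img):
    """GLCM-based texture features of the kind mined for association rules.
    gray_img: 2-D uint8 array (e.g. a mammogram region of interest)."""
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    names = ["contrast", "homogeneity", "energy", "correlation"]
    # Average each property over the two offsets/angles.
    return {n: graycoprops(glcm, n).mean() for n in names}
```

Each feature value would then be discretized into bins (e.g. "contrast=high") so that frequent, confident co-occurrences of feature bins with class labels can serve as classification rules.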
{"title":"Medical image analysis an attempt for mammogram classification using texture based association rule mining","authors":"D. Deshpande, A. Rajurkar, R. Manthalkar","doi":"10.1109/NCVPRIPG.2013.6776208","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776208","url":null,"abstract":"Breast cancer, the most common type of cancer in women is one of the leading causes of cancer deaths. Due to this, early detection of cancer is the major concern for cancer treatment. The most common screening test called mammography is useful for early detection of cancer. It has been proven that there is potential raise in the cancers detected due to consecutive reading of mammograms. But this approach is not monetarily viable. Therefore there is a significant need of computer aided detection systems which can produce intended results and assist medical staff for accurate diagnosis. In this research we made an attempt to build classification system for mammograms using association rule mining based on texture features. The proposed system uses most relevant GLCM based texture features of mammograms. New method is proposed to form associations among different texture features by judging the importance of different features. Resultant associations can be used for classification of mammograms. Experiments are carried out using MIAS Image Database. The performance of the proposed method is compared with standard Apriori algorithm. It is found that performance of proposed method is better due to reduction in multiple times scanning of database which results in less computation time. We also investigated the use of association rules in the field of medical image analysis for the problem of mammogram classification.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131806844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mean-shift based object detection and clustering from high resolution remote sensing imagery
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776271
T. SushmaLeela, R. Chandrakanth, J. Saibaba, G. Varadan, S. Mohan
Object detection from remote sensing images has inherent difficulties due to cluttered backgrounds and noisy regions in high-resolution images of urban areas. Detecting objects with regular geometry, such as circles, from an image typically uses strict feature-based detection. Region-based segmentation techniques such as K-Means have the inherent disadvantage of requiring the number of classes a priori. Contour-based techniques such as active contour models, sometimes used in remote sensing, require the approximate location of the region, and noise hinders their performance. A template-based approach is not scale and rotation invariant across different resolutions, and using multiple templates is not a feasible solution. This paper proposes a methodology for object detection based on mean-shift segmentation and non-parametric clustering. Mean shift is a non-parametric segmentation technique that, by its inherent nature, can segment regions according to desirable properties such as the spatial and spectral radiance of the object. Prior knowledge about the shape of the object is used to extract the desired object. A hierarchical clustering method is adopted to cluster objects with similar shape and spatial features. The proposed methodology is applied to high-resolution EO images to extract circular objects and is found to be robust even against cluttered and noisy backgrounds. The results are also evaluated using different evaluation measures.
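A rough sketch of the pipeline's first stages using OpenCV's mean-shift filtering and a circularity shape prior. The Otsu thresholding step and all parameter values are assumptions, and the paper's hierarchical clustering stage is omitted.

```python
import cv2
import numpy as np

def circular_objects(bgr, sp=15, sr=30, min_area=50, min_circularity=0.8):
    """Mean-shift filter the image, then keep connected regions whose shape is
    close to a circle (circularity = 4*pi*A / P^2 near 1) as a simple stand-in
    for a circular shape prior."""
    smooth = cv2.pyrMeanShiftFiltering(bgr, sp, sr)   # spatial/range bandwidths
    gray = cv2.cvtColor(smooth, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    hits = []
    for c in contours:
        area, perim = cv2.contourArea(c), cv2.arcLength(c, True)
        if area >= min_area and perim > 0 and \
                4 * np.pi * area / perim**2 >= min_circularity:
            hits.append(c)
    return hits
```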
{"title":"Mean-shift based object detection and clustering from high resolution remote sensing imagery","authors":"T. SushmaLeela, R. Chandrakanth, J. Saibaba, G. Varadan, S. Mohan","doi":"10.1109/NCVPRIPG.2013.6776271","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776271","url":null,"abstract":"Object detection from remote sensing images has inherent difficulties due to cluttered backgrounds and noisy regions from the urban area in high resolution images. Detection of objects with regular geometry, such as circles from an image uses strict feature based detection. Using region based segmentation techniques such as K-Means has the inherent disadvantage of knowing the number of classes apriori. Contour based techniques such as Active contour models, sometimes used in remote sensing also has the problem of knowing the approximate location of the region and also the noise will hinder its performance. A template based approach is not scale and rotation invariant with different resolutions and using multiple templates is not a feasible solution. This paper proposes a methodology for object detection based on mean shift segmentation and non-parametric clustering. Mean shift is a non-parametric segmentation technique, which in its inherent nature is able to segment regions according to the desirable properties like spatial and spectral radiance of the object. A prior knowledge about the shape of the object is used to extract the desire object. A hierarchical clustering method is adopted to cluster the objects having similar shape and spatial features. The proposed methodology is applied on high resolution EO images to extract circular objects. The methodology found to be better and robust even in the cluttered and noisy background. The results are also evaluated using different evaluation measures.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127769309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISIgraphy: A tool for online handwriting sample database generation
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776181
Arindam Das, U. Bhattacharya
Online handwriting recognition research has recently received a significant thrust. For Indian scripts specifically, handwriting recognition received little attention until the recent past. However, owing to generous Government funding through the group on Technology Development for Indian Languages (TDIL) of the Ministry of Communication & Information Technology (MC&IT), Govt. of India, research in this area has received due attention, and several groups are now engaged in research and development on online handwriting recognition for different Indian scripts. A major bottleneck to the desired progress in this area is the difficulty of collecting large sample databases of online handwriting in various scripts. To that end, a user-friendly tool on the Android platform has recently been developed to collect data on handheld devices. This tool, called ISIgraphy, has been uploaded to Google Play for free download. The application can store handwriting samples at large scale under user-given file names for distinct users. Its use is script independent, meaning it can collect and store handwriting samples written in any language, not necessarily an Indian script. It has an additional module for retrieval and display of stored data, and it can send the collected data to others directly via electronic mail.
{"title":"ISIgraphy: A tool for online handwriting sample database generation","authors":"Arindam Das, U. Bhattacharya","doi":"10.1109/NCVPRIPG.2013.6776181","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776181","url":null,"abstract":"Online handwriting recognition research has recently received significant thrust. Specifically for Indian scripts, handwriting recognition has not been focused much till in the near past. However, due to generous Government funding through the group on Technology Development for Indian Languages (TDIL) of the Ministry of Communication & Information Technology (MC&IT), Govt. of India, research in this area has received due attention and several groups are now engaged in research and development works for online handwriting recognition in different Indian scripts. An extensive bottleneck of the desired progress in this area is the difficulty of collection of large sample databases of online handwriting in various scripts. Towards the same, recently a user-friendly tool on Android platform has been developed to collect data on handheld devices. This tool is called ISIgraphy and has been uploaded in the Google Play for free download. This application is designed well enough to store handwritten data samples in large scales in user-given file names for distinct users. Its use is script independent, meaning that it can collect and store handwriting samples written in any language, not necessarily an Indian script. It has an additional module for retrieval and display of stored data. Moreover, it can directly send the collected data to others via electronic mail.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128692314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extraction of line-word-character segments directly from run-length compressed printed text-documents
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776195
M. Javed, P. Nagabhushan, B. B. Chaudhuri
Segmentation of a text document into lines, words, and characters, considered the crucial preprocessing stage in Optical Character Recognition (OCR), is traditionally carried out on uncompressed documents, although most real-life documents are available in compressed form for reasons such as transmission and storage efficiency. This implies that the compressed image must first be decompressed, which demands additional computing resources. This limitation has motivated us to take up research on document image analysis directly on compressed documents. In this paper, we present a new way to carry out segmentation at the line, word, and character level in run-length compressed printed text documents. We extract the horizontal projection profile curve from the compressed file and perform line segmentation using its local minima. However, tracing vertical information, which leads to tracking words and characters in a run-length compressed file, is not straightforward. We therefore propose a novel technique for simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated on 1101 text lines, 1409 words, and 7582 characters from a data set of 35 noise- and skew-free compressed documents in Bengali, Kannada, and English scripts.
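The line-segmentation idea translates directly into code: the horizontal projection profile of a run-length compressed binary document is just the per-row sum of foreground run lengths, so no decompression is needed. A minimal sketch, assuming rows are encoded as (value, run_length) pairs with value 1 meaning ink; the zero-valley threshold is an illustrative simplification of the paper's local-minima analysis.

```python
def projection_profile(rle_rows):
    """Horizontal projection profile straight from run-length data.
    Each row is [(value, run_length), ...]; the profile entry for a row is
    simply the sum of its foreground (value == 1) run lengths."""
    return [sum(run for v, run in row if v == 1) for row in rle_rows]

def line_boundaries(profile, thresh=0):
    """Text lines are the maximal runs of rows whose profile exceeds thresh;
    the valleys between them separate consecutive lines."""
    lines, start = [], None
    for i, p in enumerate(profile):
        if p > thresh and start is None:
            start = i                       # line begins
        elif p <= thresh and start is not None:
            lines.append((start, i - 1))    # line ends at previous row
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines
```

Word and character segmentation require the vertical analogue, which is where the paper's column-run popping technique comes in.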
{"title":"Extraction of line-word-character segments directly from run-length compressed printed text-documents","authors":"M. Javed, P. Nagabhushan, B. B. Chaudhuri","doi":"10.1109/NCVPRIPG.2013.6776195","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776195","url":null,"abstract":"Segmentation of a text-document into lines, words and characters, which is considered to be the crucial preprocessing stage in Optical Character Recognition (OCR) is traditionally carried out on uncompressed documents, although most of the documents in real life are available in compressed form, for the reasons such as transmission and storage efficiency. However, this implies that the compressed image should be decompressed, which indents additional computing resources. This limitation has motivated us to take up research in document image analysis using compressed documents. In this paper, we think in a new way to carry out segmentation at line, word and character level in run-length compressed printed-text-documents. We extract the horizontal projection profile curve from the compressed file and using the local minima points perform line segmentation. However, tracing vertical information which leads to tracking words-characters in a run-length compressed file is not very straight forward. Therefore, we propose a novel technique for carrying out simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated with 1101 text-lines, 1409 words and 7582 characters from a data-set of 35 noise and skew free compressed documents of Bengali, Kannada and English Scripts.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117340286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An OCR system for the Meetei Mayek script
Pub Date: 2013-12-01 | DOI: 10.1109/NCVPRIPG.2013.6776228
Subhankar Ghosh, U. Barman, P. Bora, Tourangbam Harishore Singh, B. Chaudhuri
This paper presents an implementation of an OCR system for the Meetei Mayek script. The script has recently been reintroduced, and a growing set of documents is now available in it. Our system accepts an image of the textual portion of a page and outputs the text in Unicode format. It incorporates preprocessing, segmentation, and classification stages; however, no post-processing is applied to the output. The system achieves an accuracy of about 96% on a moderate-sized database.
{"title":"An OCR system for the Meetei Mayek script","authors":"Subhankar Ghosh, U. Barman, P. Bora, Tourangbam Harishore Singh, B. Chaudhuri","doi":"10.1109/NCVPRIPG.2013.6776228","DOIUrl":"https://doi.org/10.1109/NCVPRIPG.2013.6776228","url":null,"abstract":"This paper presents an implementation of an OCR system for the Meetei Mayek script. The script has been newly reintroduced and there is a growing set of documents currently available in this script. Our system accepts an image of the textual portion of a page and outputs the text in the Unicode format. It incorporates preprocessing, segmentation and classification stages. However, no post-processing is done to the output. The system achieves an accuracy of about 96% on a moderate database.","PeriodicalId":436402,"journal":{"name":"2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114751563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}