
Latest publications from the 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)

DASM: An open source active shape model for automatic registration of objects
David Macurak, Amrutha Sethuram, K. Ricanek, B. Barbour
The main contribution of this paper is to introduce DASM - Dynamic Active Shape Models, open-source software for the automatic detection of fiducial points on objects for subsequent registration, to the research community. DASM leverages the tremendous work of STASM, a well-known software library for the automatic detection of points on faces. In this work we compare DASM to other well-known techniques for automatic face registration: Active Appearance Models (AAM) and Constrained Local Models (CLM). Further, we show that DASM outperforms these techniques on per-registration-point error, average object error, and cumulative error distribution. As a follow-on, we show that DASM outperforms STASM v3.1 on model training and registration by leveraging open-source libraries for computer vision (OpenCV v2.4) and threading/parallelism (OpenMP). The improvements in speed and performance of DASM allow for extremely dense registration (252 points on the face) in video applications.
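The evaluation metrics named in the abstract (per-point error and cumulative error distribution) are standard in landmark-registration work and can be sketched in a few lines. This is a generic illustration, not code from DASM; the toy landmark coordinates are made up.

```python
import numpy as np

def registration_errors(pred, gt):
    """Per-point Euclidean error between predicted and ground-truth landmarks.

    pred, gt: arrays of shape (n_points, 2).
    """
    return np.linalg.norm(pred - gt, axis=1)

def cumulative_error_distribution(errors, thresholds):
    """Fraction of points whose error falls at or below each threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([(errors <= t).mean() for t in thresholds])

# Toy example: a 3-point ground-truth shape and a perturbed prediction.
gt = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
pred = gt + np.array([[0.1, 0.0], [0.0, 0.2], [0.0, 0.0]])
errs = registration_errors(pred, gt)                            # [0.1, 0.2, 0.0]
ced = cumulative_error_distribution(errs, [0.05, 0.15, 0.25])   # [1/3, 2/3, 1.0]
```

Plotting `ced` against the thresholds gives the cumulative error distribution curve used to compare registration methods.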
DOI: 10.1109/NCVPRIPG.2013.6776244
Citations: 0
Analysing gait sequences using Latent Dirichlet Allocation for certain human actions
A. DeepakN., R. Hariharan, U. Sinha
Conventional human action recognition algorithms generate coarse clusters of input videos (approximately 2-4 clusters) with little information about how the clusters are generated. This problem is solved by proposing a Latent Dirichlet Allocation algorithm that transforms the extracted gait sequences in the gait domain into document-words in the text domain. These words are then used to group the input documents into finer clusters (approximately 8-9 clusters). In this approach, we have made an attempt to use gait analysis in recognizing human actions, where the gait analysis requires some motion in the lower parts of the human body, such as the legs. As the videos of the Weizmann dataset contain actions that exhibit these movements, we are able to use these motion parameters to recognize certain human actions. Experiments on the Weizmann dataset suggest that the proposed Latent Dirichlet Allocation algorithm is an efficient method for recognizing human actions from video streams.
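The gait-to-text mapping described above (gait sequences become documents over a vocabulary of quantized motion "words", then LDA infers topic proportions used for clustering) can be sketched with an off-the-shelf LDA implementation. The word-count matrix below is a hypothetical stand-in for the paper's quantized gait features, and `LatentDirichletAllocation` from scikit-learn is an assumed substitute for the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical document-word counts: rows = gait "documents" (one per video),
# columns = codebook indices of quantized gait "words".
X = np.array([
    [5, 4, 0, 0],
    [6, 3, 0, 1],
    [0, 1, 5, 6],
    [1, 0, 4, 5],
])

lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)       # per-document topic proportions (rows sum to 1)
clusters = theta.argmax(axis=1)    # assign each document to its dominant topic
```

With more topics (e.g. `n_components=8`), the same argmax assignment yields the finer 8-9-way clustering the abstract refers to.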
DOI: 10.1109/NCVPRIPG.2013.6776173
Citations: 1
Recognition and identification of target images using feature based retrieval in UAV missions
Shweta Singh, D. V. Rao
With the introduction of unmanned air vehicles as force multipliers in defense services worldwide, automatic recognition and identification of ground-based targets has become an important area of research in the defense community. Due to inherent instabilities in smaller unmanned platforms, image blur and distortion need to be addressed for successful recognition of the target. In this paper, an image enhancement technique that can improve the quality of images acquired by an unmanned system is proposed. An image de-blurring technique based on a blind de-convolution algorithm, which adaptively enhances the edges of characters and effectively removes blur, is proposed. A content-based image retrieval technique, based on feature extraction to generate an image description and a compact feature vector representing the visual information (color, texture and shape), is used with a minimum-distance algorithm to effectively retrieve plausible target images from a library of images stored in a target folder. This methodology was implemented for planning and gaming UAV/UCAV missions in the Air Warfare Simulation System.
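The minimum-distance retrieval step mentioned in the abstract is simple to sketch: rank library images by the Euclidean distance between their feature vectors and the query's. The file names and three-element feature vectors below are invented for illustration; the paper's actual descriptors combine color, texture and shape.

```python
import numpy as np

def retrieve(query_vec, library):
    """Return library image names ranked by Euclidean distance to the query.

    library: dict mapping image name -> feature vector (hypothetical
    concatenated colour/texture/shape descriptors).
    """
    names = list(library)
    feats = np.stack([library[n] for n in names])
    dists = np.linalg.norm(feats - query_vec, axis=1)
    return [names[i] for i in np.argsort(dists)]

library = {
    "tank.png":   np.array([0.90, 0.10, 0.30]),
    "truck.png":  np.array([0.20, 0.80, 0.50]),
    "bunker.png": np.array([0.85, 0.15, 0.35]),
}
ranked = retrieve(np.array([0.90, 0.10, 0.30]), library)  # "tank.png" ranks first
```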
DOI: 10.1109/NCVPRIPG.2013.6776165
Citations: 5
A hybrid method for object identification and event detection in video
P. KrishnaKumar, L. Parameswaran
Video event detection (VED) is a challenging task, especially with a large variety of objects in the environment. Even though numerous algorithms exist for event detection, most of them are unsuitable for typical consumer purposes. A hybrid method for detecting and identifying moving objects by their color and spatial information is presented in this paper. In tracking multiple moving objects, the system makes use of the motion of changed regions. In this approach, the object detector first looks for the existence of objects that have already been registered. Control is then passed to an event detector, which waits for an event to happen; an event can be object placement or object removal. The object detector becomes active only if an event is detected. A simple training procedure using a single color camera in the HSV color space makes it suitable as a consumer application. The proposed model has proved to be robust in various indoor environments and with different types of background scenes. The experimental results prove the feasibility of the proposed method.
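The placement/removal event distinction above can be illustrated with a deliberately naive background-difference check. This is not the paper's method, only a minimal sketch of the idea: a changed region that adds foreground mass is labeled "placement", one that removes it is labeled "removal". Grayscale arrays stand in for real frames.

```python
import numpy as np

def detect_event(background, frame, thresh=30):
    """Naive placement/removal classifier (illustrative, not the paper's method).

    background, frame: 2-D grayscale arrays. Pixels differing from the stored
    background by more than `thresh` are treated as changed.
    """
    changed = np.abs(frame.astype(int) - background.astype(int)) > thresh
    if not changed.any():
        return "no_event"
    # A frame with more total brightness than the background gained foreground
    # mass (object placed); less total brightness means an object was removed.
    delta = int(frame.astype(int).sum()) - int(background.astype(int).sum())
    return "placement" if delta > 0 else "removal"

bg = np.zeros((4, 4), dtype=np.uint8)
placed = bg.copy()
placed[1:3, 1:3] = 200   # a bright object appears against the dark background
```

`detect_event(bg, placed)` reports a placement; swapping the arguments models the object being taken away again.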
DOI: 10.1109/NCVPRIPG.2013.6776223
Citations: 13
M-ary reversible contrast mapping in reversible watermarking with optimal distortion control
S. Maity, H. Maity
A generalized form of reversible contrast mapping (RCM), analogous to M-ary modulation in communication, is developed here for reversible watermarking in digital images. An optimized distortion-control framework for the M-ary scheme is then considered to improve data-hiding capacity while meeting the embedding distortion constraint. Simulation results show that combining different M-ary approaches, using different points representing different RCM transformation functions, achieves a better embedding-rate/visual-quality/security trade-off for the hidden information than the existing RCM, difference expansion (DE) and prediction-error expansion (PEE) methods during over-embedding. Numerical results show that the proposed method achieves, on average, a 20% improvement in visual quality and a 35% improvement in security of the hidden data at a 1 bpp embedding rate compared to existing PEE works. This effectiveness is demonstrated with a number of simulation results.
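The base RCM transform that the M-ary scheme generalizes maps a pixel pair (x, y) to (2x − y, 2y − x), which is exactly invertible because 2x′ + y′ = 3x and x′ + 2y′ = 3y. A minimal sketch of the pair transform follows; the full scheme's LSB bookkeeping, domain restriction keeping transformed values in [0, 255], and the paper's M-ary generalization are all omitted.

```python
def rcm_forward(x, y):
    """Basic reversible contrast mapping on an integer pixel pair.

    Note: the transformed values can leave [0, 255]; the real RCM scheme
    restricts which pairs are transformed, a check omitted here.
    """
    return 2 * x - y, 2 * y - x

def rcm_inverse(xp, yp):
    # Exact inverse: 2x' + y' = 3x and x' + 2y' = 3y, so both divisions by 3
    # are exact on pairs produced by rcm_forward.
    return (2 * xp + yp) // 3, (xp + 2 * yp) // 3

x, y = 120, 90
xp, yp = rcm_forward(x, y)   # (150, 60): the contrast of the pair is stretched
```

Applying `rcm_inverse(xp, yp)` recovers the original pair bit-exactly, which is what makes the watermarking reversible.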
DOI: 10.1109/NCVPRIPG.2013.6776269
Citations: 2
Real-time approximate and exact CSG of implicit surfaces on the GPU
Jag Mohan Singh
We present a simple and powerful scheme for CSG of implicit surfaces on the GPU. We decompose the boolean expression of surfaces into sum-of-products form. The algorithm presented in this paper then renders each product term; the sum of products is obtained automatically by enabling the depth test. Our approximate CSG uses an adaptive marching-points algorithm for finding the ray-surface intersection. Once we find an interval in which a root exists after root isolation, this interval indicates the presence of an intersection. We perform root refinement only for the uncomplemented terms in the product. Exact CSG is done by using the discriminant of the ray-surface intersection to test for the presence of a root. We can then evaluate the product expression simply by checking that all uncomplemented terms are true and all complemented terms are false. If this condition is met, the maximum of all the roots among the uncomplemented terms is the solution. Our algorithm is linear in the number of terms, O(n). We achieve real-time rates for 4-5 terms in the product for approximate CSG, and better than real-time rates for exact CSG. Our primitives are implicit surfaces, so we can achieve fairly complex results with fewer terms.
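The product-term test described above (every uncomplemented surface satisfied, every complemented one violated) can be sketched on the CPU for point membership rather than ray intersection. The sphere primitives and test points are illustrative only; with f(p) ≤ 0 meaning "inside", the term A·B̄ is the part of A carved away from B.

```python
def sphere(cx, cy, cz, r):
    """Implicit sphere: f(p) <= 0 for points inside."""
    return lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2 - r * r

def in_product_term(p, plain, complemented):
    """True when p lies inside all uncomplemented surfaces and outside all
    complemented ones, i.e. p satisfies one product term of the CSG expression."""
    return all(f(p) <= 0 for f in plain) and all(f(p) > 0 for f in complemented)

A = sphere(0.0, 0.0, 0.0, 1.0)
B = sphere(0.8, 0.0, 0.0, 1.0)

# A AND NOT B: inside A but outside B.
kept = in_product_term((-0.9, 0.0, 0.0), plain=[A], complemented=[B])
carved = in_product_term((0.5, 0.0, 0.0), plain=[A], complemented=[B])
```

A point belongs to the full sum-of-products solid when any one product term accepts it, which is what the depth test realizes per-pixel on the GPU.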
DOI: 10.1109/NCVPRIPG.2013.6776199
Citations: 0
Mean-shift based object detection and clustering from high resolution remote sensing imagery
T. SushmaLeela, R. Chandrakanth, J. Saibaba, G. Varadan, S. Mohan
Object detection from remote sensing images has inherent difficulties due to cluttered backgrounds and noisy regions from urban areas in high-resolution images. Detection of objects with regular geometry, such as circles, from an image uses strict feature-based detection. Region-based segmentation techniques such as K-Means have the inherent disadvantage of requiring the number of classes a priori. Contour-based techniques such as active contour models, sometimes used in remote sensing, have the problem of requiring the approximate location of the region, and noise hinders their performance. A template-based approach is not scale and rotation invariant across different resolutions, and using multiple templates is not a feasible solution. This paper proposes a methodology for object detection based on mean-shift segmentation and non-parametric clustering. Mean shift is a non-parametric segmentation technique which, by its inherent nature, is able to segment regions according to desirable properties such as the spatial and spectral radiance of the object. Prior knowledge about the shape of the object is used to extract the desired object. A hierarchical clustering method is adopted to cluster objects having similar shape and spatial features. The proposed methodology is applied to high-resolution EO images to extract circular objects, and is found to be robust even against cluttered and noisy backgrounds. The results are also evaluated using different evaluation measures.
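The non-parametric character of mean shift, which is key to the abstract's argument against K-Means, comes from each point iteratively moving to the mean of its neighbours within a bandwidth, so the number of modes (clusters) emerges from the data. A minimal flat-kernel sketch on toy 2-D feature points follows; real segmentation would run this on joint spatial-spectral features.

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, iters=50):
    """Flat-kernel mean shift: every point repeatedly moves to the mean of the
    points within `bandwidth`, so each one converges onto a density mode."""
    shifted = points.astype(float).copy()
    for _ in range(iters):
        for i in range(len(shifted)):
            neigh = shifted[np.linalg.norm(shifted - shifted[i], axis=1) < bandwidth]
            shifted[i] = neigh.mean(axis=0)
    return shifted

# Two well-separated groups; no cluster count is supplied anywhere.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
modes = mean_shift(pts, bandwidth=1.0)
```

After convergence, points sharing a mode belong to the same segment; here the four points collapse onto two modes.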
DOI: 10.1109/NCVPRIPG.2013.6776271
Citations: 2
ISIgraphy: A tool for online handwriting sample database generation
Arindam Das, U. Bhattacharya
Online handwriting recognition research has recently received significant thrust. Handwriting recognition for Indian scripts in particular received little attention until recently. However, due to generous Government funding through the group on Technology Development for Indian Languages (TDIL) of the Ministry of Communication & Information Technology (MC&IT), Govt. of India, research in this area has received due attention, and several groups are now engaged in research and development work on online handwriting recognition for different Indian scripts. A major bottleneck to the desired progress in this area is the difficulty of collecting large sample databases of online handwriting in various scripts. Towards this end, a user-friendly tool has recently been developed on the Android platform to collect data on handheld devices. This tool is called ISIgraphy and has been uploaded to Google Play for free download. The application is designed to store handwritten data samples at large scale under user-given file names for distinct users. Its use is script independent, meaning that it can collect and store handwriting samples written in any language, not necessarily an Indian script. It has an additional module for retrieval and display of stored data. Moreover, it can send the collected data directly to others via electronic mail.
DOI: 10.1109/NCVPRIPG.2013.6776181
Citations: 3
Extraction of line-word-character segments directly from run-length compressed printed text-documents
M. Javed, P. Nagabhushan, B. B. Chaudhuri
Segmentation of a text document into lines, words and characters, which is considered the crucial preprocessing stage in Optical Character Recognition (OCR), is traditionally carried out on uncompressed documents, although most documents in real life are available in compressed form for reasons such as transmission and storage efficiency. This implies that the compressed image must first be decompressed, which demands additional computing resources. This limitation has motivated us to take up research in document image analysis on compressed documents. In this paper, we propose a new way to carry out segmentation at the line, word and character level directly in run-length compressed printed text documents. We extract the horizontal projection profile curve from the compressed file and perform line segmentation using its local minima points. However, tracing vertical information, which leads to tracking words and characters in a run-length compressed file, is not straightforward. Therefore, we propose a novel technique for carrying out simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated with 1101 text lines, 1409 words and 7582 characters from a data set of 35 noise- and skew-free compressed documents in Bengali, Kannada and English scripts.
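The first step described above, building the horizontal projection profile without decompressing, works because a row's ink total is just the sum of its black run lengths. The RLE layout and the simplified zero-ink minima rule below are assumptions for illustration, not the authors' exact file format or minima criterion.

```python
def row_profile_from_rle(rle_rows):
    """Horizontal projection profile computed directly from run-length data.

    rle_rows: one list per scanline of (value, run_length) pairs, where
    value 1 means black ink. The profile entry for a row is its total ink.
    """
    return [sum(n for v, n in row if v == 1) for row in rle_rows]

def line_breaks(profile):
    """Rows with zero ink separate text lines (simplified local-minima rule)."""
    return [i for i, p in enumerate(profile) if p == 0]

doc = [
    [(1, 3), (0, 5)],             # text line 1
    [(0, 2), (1, 4), (0, 2)],
    [(0, 8)],                     # blank separator row
    [(1, 2), (0, 4), (1, 2)],     # text line 2
]
profile = row_profile_from_rle(doc)   # [3, 4, 0, 4]
```

The blank row at index 2 is the local minimum that splits the two text lines, with no pixel array ever materialized.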
DOI: 10.1109/NCVPRIPG.2013.6776195
Citations: 30
An OCR system for the Meetei Mayek script
Subhankar Ghosh, U. Barman, P. Bora, Tourangbam Harishore Singh, B. Chaudhuri
This paper presents an implementation of an OCR system for the Meetei Mayek script. The script has recently been reintroduced, and a growing body of documents is now available in it. Our system accepts an image of the textual portion of a page and outputs the text in Unicode format. It incorporates preprocessing, segmentation and classification stages; no post-processing is applied to the output. The system achieves an accuracy of about 96% on a moderately sized database.
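As a rough sketch of the three stages the abstract names — preprocessing, segmentation, classification — the skeleton below shows one conventional way such a pipeline is organized. All internals here (global-threshold binarization, blank-column character splitting, the classifier stub) are illustrative assumptions, not the paper's method.

```python
def preprocess(gray, threshold=128):
    """Binarize a grayscale image: 1 = ink, 0 = background.

    A placeholder global threshold; a real system would also
    deskew and denoise here.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment(binary):
    """Toy character segmentation: split on all-blank columns.

    Returns (start, end) column-index pairs, end exclusive.
    """
    width = len(binary[0])
    cols = [sum(row[c] for row in binary) for c in range(width)]
    chars, start = [], None
    for c, value in enumerate(cols):
        if value > 0 and start is None:
            start = c
        elif value == 0 and start is not None:
            chars.append((start, c))
            start = None
    if start is not None:
        chars.append((start, width))
    return chars


def classify(glyph):
    """Stub for the classification stage: a real system maps each
    segmented glyph image to a Meetei Mayek Unicode code point
    (block U+ABC0-U+ABFF)."""
    raise NotImplementedError


# Two one-pixel-wide "characters" separated by blank columns.
gray = [[0, 255, 0, 255],
        [0, 255, 0, 255]]
print(segment(preprocess(gray)))   # [(0, 1), (2, 3)]
```

The point of the sketch is the stage boundaries (image in, glyph regions out, Unicode text out), which is the structure the abstract describes; the per-stage algorithms would differ in the actual system.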
S. Ghosh, U. Barman, P. Bora, T. H. Singh and B. Chaudhuri, "An OCR system for the Meetei Mayek script," 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013. doi: 10.1109/NCVPRIPG.2013.6776228
Citations: 10