This paper presents a robust text detection approach based on generalized color-enhanced contrasting extremal region (CER) and neural networks. Given a color natural scene image, six component-trees are built from its gray scale image, hue and saturation channel images in a perception-based illumination invariant color space, and their inverted images, respectively. From each component-tree, generalized color-enhanced CERs are extracted as character candidates. By using a "divide-and-conquer" strategy, each candidate image patch is labeled reliably by rules as one of five types, namely, Long, Thin, Fill, Square-large and Square-small, and classified as text or non-text by a corresponding neural network, which is trained by an ambiguity-free learning strategy. After pruning non-text components, repeating components in each component-tree are pruned by using color and area information to obtain a component graph, from which candidate text-lines are formed and verified by another set of neural networks. Finally, results from six component-trees are combined, and a post-processing step is used to recover lost characters and split text lines into words as appropriate. Our proposed method achieves 85.72% recall, 87.03% precision, and 86.37% F-score on ICDAR-2013 "Reading Text in Scene Images" test set.
Lei Sun, Qiang Huo, Wei Jia, Kai Chen, "Robust Text Detection in Natural Scene Images by Generalized Color-Enhanced Contrasting Extremal Region and Neural Networks," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.469
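As an illustration of the component-tree idea behind extremal-region extraction, the sketch below sweeps binarization thresholds over a tiny grayscale raster and records the connected components that appear at each level. This is a simplified stand-in, not the authors' generalized color-enhanced CER algorithm: the plain threshold sweep, 4-connectivity, and dark-on-light polarity are all assumptions made for illustration.

```python
def connected_components(mask):
    """Sizes of 4-connected components of 1-pixels in a binary raster
    (list of lists), found with an explicit-stack flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                stack, size = [(y, x)], 0
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    size += 1
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes

def component_tree_sweep(gray, thresholds):
    """For each threshold t, binarise with 'pixel <= t' (dark-on-light
    polarity) and record the component sizes that appear -- a crude
    stand-in for component-tree construction."""
    return {t: connected_components([[1 if p <= t else 0 for p in row] for row in gray])
            for t in thresholds}
```

In a real component-tree, components at a lower threshold nest inside components at a higher one; this sketch only exposes the per-level components, which is the raw material the tree organizes.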
A new geometric framework, called generalized coupled line camera (GCLC), is proposed to derive an analytic solution for reconstructing an unknown scene quadrilateral and the relevant projective structure from one or more image quadrilaterals. We extend the previous approach, developed for rectangles, to handle arbitrary scene quadrilaterals. First, we generalize a single line camera by removing the centering constraint that the principal axis should bisect a scene line. Then, we couple a pair of generalized line cameras to model a frustum with a quadrilateral base. Finally, we show that the scene quadrilateral and the center of projection can be analytically reconstructed from a single view when prior knowledge of the quadrilateral is available. A completely unknown quadrilateral can be reconstructed from four views through non-linear optimization. We also describe an improved method to handle an off-centered case by geometrically inferring a centered proxy quadrilateral, which accelerates the reconstruction process without relying on homography. The proposed method is easy to implement since each step is expressed as a simple analytic equation. We present experimental results on real and synthetic examples.
Joo-Haeng Lee, "New Geometric Interpretation and Analytic Solution for Quadrilateral Reconstruction," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.688
Nearest Neighbour search (NNS) is a widely used technique in Pattern Recognition, and many indexing techniques have been proposed to speed up the search. The need to work with large dynamic databases, in interactive or online systems, has resulted in increasing interest in adapting or creating fast methods to update these indexes. TLAESA is a fast search algorithm with sublinear overhead that, using a branch-and-bound technique, can find the nearest neighbour with a very low number of distance computations. In this paper, we propose a new fast updating method for the TLAESA index. The behaviour of this index has been analysed theoretically and experimentally. We have obtained a log-square upper bound on the expected rebuilding time. This bound has been verified experimentally on several synthetic and real data sets.
L. Micó, J. Oncina, "Dynamic Insertions in TLAESA Fast NN Search Algorithm," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.657
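The triangle-inequality pruning at the heart of (T)LAESA-style search can be sketched as follows. This is a flat, pivot-based simplification, closer to LAESA than to the tree-structured TLAESA, with Euclidean distance and 2-D points assumed: precomputed distances to a few pivots give the lower bound |d(q, p) − d(x, p)| ≤ d(q, x), so most candidates are rejected without computing their distance to the query.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nn_search(query, points, pivots):
    """Pivot-based NN search: for any pivot p, the triangle inequality
    gives |d(q, p) - d(x, p)| <= d(q, x), so a candidate x whose lower
    bound already exceeds the best distance found so far is rejected
    without computing d(q, x)."""
    piv_to_pts = {p: [dist(p, x) for x in points] for p in pivots}  # precomputed offline
    q_to_piv = {p: dist(query, p) for p in pivots}                  # one distance per pivot
    best, best_d, computed = None, float("inf"), 0
    for i, x in enumerate(points):
        lower = max(abs(q_to_piv[p] - piv_to_pts[p][i]) for p in pivots)
        if lower >= best_d:
            continue  # pruned by the lower bound
        d = dist(query, x)
        computed += 1
        if d < best_d:
            best, best_d = x, d
    return best, best_d, computed
```

In the real algorithms the pivot distances count toward the budget too, and the tree structure of TLAESA lets whole subtrees be pruned at once; the counter here only tracks candidate-to-query computations to make the pruning visible.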
In mathematical expression recognition, symbol classification is a crucial step. Numerous approaches for recognizing handwritten math symbols have been published, but most of them are either online or hybrid approaches; no study has focused specifically on offline features for handwritten math symbol recognition. Furthermore, many papers report results that are difficult to compare. In this paper we assess the performance of several well-known offline features for this task. We also test a novel set of features based on polar histograms, and the vertical repositioning method for feature extraction. Finally, we report and analyze the results of several experiments using recurrent neural networks on a large public database of online handwritten math expressions. The combination of online and offline features significantly improved the recognition rate.
Francisco Alvaro, Joan Andreu Sánchez, J. Benedí, "Offline Features for Classifying Handwritten Math Symbols with Recurrent Neural Networks," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.507
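A minimal sketch of a polar-histogram descriptor of the kind the abstract mentions: stroke points are binned by their angle around the symbol's centroid. The bin count, the angle-only binning (no radial bins), and the normalization are illustrative assumptions; the paper's exact feature may differ.

```python
import math

def polar_histogram(points, n_bins=8):
    """Normalised histogram of point directions around the centroid of a
    set of (x, y) stroke points -- a simple polar shape descriptor."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    hist = [0] * n_bins
    for x, y in points:
        # Angle in [0, 2*pi), mapped to one of n_bins equal sectors.
        ang = math.atan2(y - cy, x - cx) % (2 * math.pi)
        hist[min(int(ang / (2 * math.pi) * n_bins), n_bins - 1)] += 1
    return [h / len(points) for h in hist]
```

Because the descriptor is computed relative to the centroid, it is translation-invariant, which is one reason polar binning suits isolated-symbol classification.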
In this paper we present a method for layout analysis in structured documents. We propose an EM-based algorithm that fits a set of Gaussian mixtures to the different regions according to their logical distribution along the page. After convergence, we estimate the final shape of each region from the parameters computed for each component of the mixture. We evaluated our method on the task of record detection in a collection of historical structured documents and compared it with previous work on this task.
Francisco Cruz, O. R. Terrades, "EM-Based Layout Analysis Method for Structured Documents," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.63
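The EM fitting step can be illustrated with a basic two-component Gaussian mixture in 1-D; the paper fits mixtures to 2-D page regions, but the E-step/M-step updates have the same shape. Deterministic initialization at the data extremes and a fixed iteration count are simplifications made for this sketch.

```python
import math

def em_gmm_1d(xs, iters=50):
    """Basic EM for a two-component 1-D Gaussian mixture.  Means are
    initialised at the data extremes for determinism."""
    k = 2
    mu = [min(xs), max(xs)]
    var = [1.0] * k
    w = [0.5] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w[j] / math.sqrt(2 * math.pi * var[j])
                 * math.exp(-(x - mu[j]) ** 2 / (2 * var[j])) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances from soft counts.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            w[j] = nj / len(xs)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, xs)) / nj, 1e-6)
    return w, mu, var
```

After convergence, the fitted means and variances play the role the paper assigns to the mixture parameters: they determine the estimated extent of each region.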
In traditional vision systems, high-level information is usually inferred from images or videos captured by cameras, or from depth images captured by depth sensors. These images, whether gray-level, RGB, or depth, have a human-readable 2D structure that describes the spatial distribution of the scene. In this paper, we explore the possibility of using distributed color sensors to infer high-level information, such as room occupancy. Unlike a camera, the output of a color sensor has only a few variables. However, if the light in the room is color controllable, we can use the outputs of multiple color sensors under different lighting conditions to recover the light transport model (LTM) of the room. As the room occupancy changes, the LTM changes accordingly, and we can use machine learning to establish the mapping from LTM to room occupancy.
Quan Wang, Xinchi Zhang, M. Wang, K. Boyer, "Learning Room Occupancy Patterns from Sparsely Recovered Light Transport Models," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.347
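If sensor readings are linear in the controllable light inputs, the light transport model is a matrix T with S = T L, and it can be recovered by least squares from readings taken under several lighting conditions. The sketch below shows only this recovery step; the linearity assumption and the exact formulation are inferred from the abstract, not taken from the authors' code, and the occupancy classifier trained on top of T is omitted.

```python
import numpy as np

def recover_ltm(L, S):
    """Least-squares recovery of the light transport matrix T from
    S = T @ L, where column j of L is the light-control vector of
    condition j and column j of S holds the sensor readings observed
    under that condition."""
    # Solve L^T T^T = S^T in the least-squares sense, then transpose back.
    T_transposed, *_ = np.linalg.lstsq(L.T, S.T, rcond=None)
    return T_transposed.T
```

With more lighting conditions than light channels the system is overdetermined, which is what makes recovery from noisy, sparse sensor readings feasible.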
Yan Yan, Subramanian Ramanathan, E. Ricci, O. Lanz, N. Sebe
Social attention behavior offers vital cues towards inferring one's personality traits from interactive settings such as round-table meetings and cocktail parties. Head orientation is typically employed as a proxy for determining the social attention direction when faces are captured at low-resolution. Recently, multi-task learning has been proposed to robustly compute head pose under perspective and scale-based facial appearance variations when multiple, distant and large field-of-view cameras are employed for visual analysis in smart-room applications. In this paper, we evaluate the effectiveness of an SVM-based MTL (SVM+MTL) framework with various facial descriptors (KL, HOG, LBP, etc.). The KL+HOG feature combination is found to produce the best classification performance, with SVM+MTL outperforming classical SVM irrespective of the feature used.
Yan Yan, Subramanian Ramanathan, E. Ricci, O. Lanz, N. Sebe, "Evaluating Multi-task Learning for Multi-view Head-Pose Classification in Interactive Environments," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.717
J. D. Martínez-Vargas, Cristian Castro Hoyos, A. Álvarez-Meza, C. Acosta-Medina, G. Castellanos-Domínguez
We propose a filtration approach to discriminate between stationary and non-stationary signals, which consists in recursively updating an enhanced representation of the input time-series so that the decomposition can identify time-varying statistical parameters of the data. The approach is based on the hypothesis that such updating, by providing a time-varying subspace projection under stationarity constraints, allows a better separation to be obtained. The quality of the separation is validated on simulated and real data. In both cases, the obtained separation shows that the proposed approach is able to identify different dynamics in the analyzed data.
J. D. Martínez-Vargas, Cristian Castro Hoyos, A. Álvarez-Meza, C. Acosta-Medina, G. Castellanos-Domínguez, "Recursive Separation of Stationary Components by Subspace Projection and Stochastic Constraints," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.597
Roman coins play an important role in understanding the Roman empire because they convey rich information about key historical events of the time. Moreover, as large numbers of coins are traded daily over the Internet, it becomes necessary to develop automatic coin recognition systems to prevent illegal trades. In this paper, we propose an automatic recognition method for ancient Roman coins. The proposed method exploits the structure of the coin by using a spatially local coding method. Results show that the proposed method outperforms traditional rigid spatial-structure models such as the spatial pyramid.
Jongpil Kim, V. Pavlovic, "Ancient Coin Recognition Based on Spatial Coding," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.64
Activity recognition in smart homes plays an important role in healthcare by maintaining the well-being of the elderly and patients through remote monitoring and assistive technologies. In this paper, we propose a two-level classification approach for activity recognition that utilizes the information obtained from the sensors deployed in a smart home. To separate similar activities from dissimilar ones, we first group homogeneous activities using Lloyd's clustering algorithm. For the classification of the non-separated activities within each cluster, we then apply Evidence-Theoretic K-Nearest Neighbor, a computationally inexpensive learning algorithm that performs well under uncertainty and noisy data. The approach achieves improved recognition accuracy, particularly for overlapping activities. A comparison of the proposed approach with existing activity recognition approaches is presented on two publicly available smart-home datasets; the proposed approach demonstrates a better recognition rate than the existing methods.
L. Fahad, Syed Fahad Tahir, M. Rajarajan, "Activity Recognition in Smart Homes Using Clustering Based Classification," 2014 22nd International Conference on Pattern Recognition, Aug. 2014. doi:10.1109/ICPR.2014.241
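The two-level idea (cluster first, then classify only within the query's cluster) can be sketched as follows. Plain 1-NN stands in for the paper's Evidence-Theoretic K-NN, and the 2-D feature points, the labels, and the seeding of Lloyd's algorithm are illustrative assumptions.

```python
import math
from collections import defaultdict

def lloyd(points, k, iters=20):
    """Plain Lloyd's (k-means) iterations; centroids seeded with the
    first k points for determinism."""
    cents = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[min(range(k), key=lambda j: math.dist(p, cents[j]))].append(p)
        for j, g in groups.items():
            cents[j] = [sum(c) / len(g) for c in zip(*g)]
    return cents

def two_level_classify(train, labels, k, query):
    """Level 1: route the query to its nearest cluster.  Level 2: 1-NN
    among the training samples of that cluster only (a plain stand-in
    for the paper's Evidence-Theoretic K-NN)."""
    cents = lloyd(train, k)
    nearest = lambda p: min(range(k), key=lambda j: math.dist(p, cents[j]))
    members = [i for i, p in enumerate(train) if nearest(p) == nearest(query)]
    best = min(members, key=lambda i: math.dist(query, train[i]))
    return labels[best]
```

Restricting the second-level classifier to one cluster is what lets the method spend its discriminative effort on the overlapping activities within that cluster.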