
Latest publications from the 2014 22nd International Conference on Pattern Recognition

Efficient Evaluation of SVM Classifiers Using Error Space Encoding
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.755
Nisarg Raval, Rashmi Vilas Tonge, C. V. Jawahar
Many computer vision tasks require efficient evaluation of Support Vector Machine (SVM) classifiers on large image databases. Our goal is to efficiently evaluate SVM classifiers on a large number of images. We propose a novel Error Space Encoding (ESE) scheme for SVM evaluation which utilizes a large number of classifiers already evaluated on a similar data set. We model this problem as an encoding of a novel classifier (the query) in terms of the existing classifiers (the query logs). With sufficiently large query logs, we show that ESE performs far better than other existing encoding schemes. With this method we are able to retrieve nearly 100% correct top-k images from a dataset of 1 million images spanning 1000 categories. We also demonstrate the application of our method to relevance feedback and query expansion mechanisms, and show that it achieves the same accuracy 90 times faster than exhaustive SVM evaluation.
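The core idea lends itself to a short illustration. The sketch below is a hedged, simplified reading of ESE, assuming linear SVMs whose weight vectors and precomputed database scores are available as the query log: the new classifier is approximated by a sparse combination of the logged classifiers, and its database scores are approximated by the same combination of the logged scores. All array names and sizes are hypothetical, and the paper's exact error-space formulation may differ.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

# Hypothetical setup: W_log holds weight vectors of previously evaluated linear
# SVMs (the "query log"), S_log their precomputed scores on the image database,
# and w_new the novel query classifier.
rng = np.random.default_rng(0)
d, n_logs, n_images = 128, 500, 10_000
W_log = rng.standard_normal((d, n_logs))          # columns = logged classifiers
S_log = rng.standard_normal((n_logs, n_images))   # rows = their database scores
w_new = rng.standard_normal(d)                    # novel query classifier

# Encode the new classifier in the span of the logged classifiers
# (sparse coding here; the paper's error-space encoding may differ).
alpha = orthogonal_mp(W_log, w_new, n_nonzero_coefs=20)

# Approximate the new classifier's database scores without touching any image:
# reuse the precomputed evaluations of the logged classifiers.
approx_scores = alpha @ S_log
top_k = np.argsort(-approx_scores)[:10]           # approximate top-k retrieval
```

In this sketch the evaluation cost scales with the size of the query log rather than with the number of images, which is the source of the speed-up the abstract reports.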
Citations: 0
Inferring Plane Orientation from a Single Motion Blurred Image
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.364
M. P. Rao, A. Rajagopalan, G. Seetharaman
We present a scheme for recovering the orientation of a planar scene from a single translationally motion-blurred image. By leveraging the homography relationship among image coordinates of 3D points lying on a plane, and by exploiting natural correspondences among the extremities of the blur kernels derived from the motion-blurred observation, the proposed method can accurately infer the normal of the planar surface. We validate our approach on synthetic as well as real planar scenes.
Citations: 8
Super-resolution Reconstruction for Binocular 3D Data
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.721
Wei-Tsung Hsiao, Jin-Jang Leou, H. Hsiao
In this study, a super-resolution reconstruction approach for binocular 3D data is proposed. The aim is to obtain a high-resolution (HR) disparity map from a low-resolution (LR) binocular image pair by super-resolution reconstruction. The proposed approach contains five stages: initial disparity map estimation using local aggregation, disparity plane model computation, global energy cost minimization, HR disparity map composition by region-based fusion (selection), and fused HR disparity map refinement. Based on the experimental results obtained in this study, in terms of PSNR and bad pixel rate (BPR), the final HR disparity maps produced by the proposed approach are better than those of four comparison approaches.
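As a rough stand-in for the first and last stages only (initial LR disparity estimation and HR composition), the hedged sketch below uses OpenCV's semi-global block matcher and a naive upsampling; the plane-model fitting, global energy minimization, and region-based fusion stages of the paper are not reproduced, and all parameters and data are illustrative.

```python
import cv2
import numpy as np

def lr_disparity(left_lr, right_lr, max_disp=64):
    """Initial LR disparity map (simplified stand-in for the local-aggregation stage)."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=max_disp,
                                 blockSize=7, P1=8 * 7 * 7, P2=32 * 7 * 7)
    return sgbm.compute(left_lr, right_lr).astype(np.float32) / 16.0  # SGBM scales by 16

def upsample_disparity(disp_lr, hr_size, scale):
    """Naive HR composition: resize to the HR grid and rescale disparity magnitudes."""
    disp_hr = cv2.resize(disp_lr, (hr_size[1], hr_size[0]), interpolation=cv2.INTER_CUBIC)
    return disp_hr * scale  # disparities grow with the image scale

# Toy LR stereo pair (random texture shifted by 4 pixels) just to exercise the code.
rng = np.random.default_rng(0)
left_lr = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)
right_lr = np.roll(left_lr, -4, axis=1)
disp_lr = lr_disparity(left_lr, right_lr)
disp_hr = upsample_disparity(disp_lr, hr_size=(240, 320), scale=2.0)
```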
Citations: 2
Information Divergence Based Saliency Detection with a Global Center-Surround Mechanism
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.590
Ibrahim M. H. Rahman, C. Hollitt, Mengjie Zhang
In this paper a novel technique for saliency detection called Global Information Divergence is proposed. The technique is based on the divergence in information between two regions. Initially, patches are extracted at multiple scales from the input images, and the dimensionality of the extracted patches is reduced using Principal Component Analysis. The information divergence is then evaluated between the reduced-dimensionality patches of a center region and a surround region. Our technique uses a global method for defining the center patch and the surround patches collectively. The technique is tested on four competitive and complex datasets for both saliency detection and segmentation. The results show good performance in terms of saliency-map quality and speed compared with 16 state-of-the-art techniques.
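A minimal, single-scale sketch of the patch/PCA/divergence pipeline is given below, assuming a grayscale input and using the pooled statistics of all patches as the global "surround"; the paper's multi-scale extraction and exact center-surround divergence are simplified here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_extraction.image import extract_patches_2d

def patch_saliency(gray, patch_size=8, n_components=8, eps=1e-6):
    """One-scale sketch: score each patch by its divergence from the pooled
    ("surround") statistics of all patches after PCA reduction."""
    patches = extract_patches_2d(gray, (patch_size, patch_size))
    X = patches.reshape(len(patches), -1).astype(np.float64)
    Z = PCA(n_components=n_components).fit_transform(X)   # reduce patch dimensionality
    mu_s, var_s = Z.mean(axis=0), Z.var(axis=0) + eps      # global "surround" statistics
    # Divergence-style score: squared Mahalanobis distance of each reduced patch
    # from the surround distribution (the Gaussian KL data term, up to constants).
    return (((Z - mu_s) ** 2) / var_s).sum(axis=1)

gray = np.random.default_rng(0).random((64, 64))           # hypothetical grayscale image
saliency_per_patch = patch_saliency(gray)
```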
Citations: 2
Computer Assisted Analysis System of Electroencephalogram for Diagnosing Epilepsy
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.583
Malik Anas Ahmad, N. Khan, W. Majeed
Automation of electroencephalogram (EEG) analysis can significantly help the neurologist during the diagnosis of epilepsy. Over the last few years much work has been done in the field of computer-assisted analysis to detect epileptic activity in an EEG, yet there is still a significant need to make these computer-assisted EEG analysis systems more convenient and informative for a neurologist. After briefly discussing some of the existing work, we suggest an approach that can make these systems more helpful, detailed and precise for the neurologist. In our proposed approach, each epoch of each channel is handled separately for each type of epileptic pattern. Feature extraction starts by applying a multilevel Discrete Wavelet Transform (DWT) to each non-overlapping 1-second epoch. We then apply Principal Component Analysis (PCA) to reduce the effect of redundant and noisy data, and finally a Support Vector Machine (SVM) to classify each epoch as epileptic or not. In our system a user can mark any mistakes encountered, and these corrective markings are saved as examples. The idea behind this retraining is that, if there is more than one example with the same attributes but different labels, the classifier is trained toward the label with the largest population, so on retraining the classifier improves its classification and adapts to the user. Finally, we discuss the results acquired so far. Due to limitations in the available data, we can only report classification performance for generalised absence seizures. The reported accuracy is obtained on a versatile dataset of 21 patients from the Punjab Institute of Mental Health (PIMH) and 21 patients from Children Hospital Boston (CHB), which differ in channel count and sampling frequency; this demonstrates the robustness of our algorithm.
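The feature-extraction and classification chain described above (per-epoch DWT, then PCA, then SVM) can be sketched as follows; the data, sampling rate, and labels are synthetic placeholders, and the interactive retraining loop is omitted.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def epoch_features(epoch, wavelet="db4", level=4):
    """Multilevel DWT coefficients of one 1-second, single-channel epoch."""
    coeffs = pywt.wavedec(epoch, wavelet, level=level)
    return np.concatenate(coeffs)

# Hypothetical data: X_epochs is (n_epochs, n_samples_per_epoch) for one channel,
# y marks each epoch as epileptic (1) or not (0).
rng = np.random.default_rng(0)
X_epochs = rng.standard_normal((200, 256))   # e.g. 1-second epochs at 256 Hz
y = rng.integers(0, 2, size=200)

X_feat = np.array([epoch_features(e) for e in X_epochs])
clf = make_pipeline(PCA(n_components=20), SVC(kernel="rbf"))
clf.fit(X_feat, y)
pred = clf.predict(X_feat)                   # per-epoch epileptic / non-epileptic labels
```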
Citations: 17
Pseudo-Marginal Bayesian Multiple-Class Multiple-Kernel Learning for Neuroimaging Data
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.549
Andrew D. O'Harney, A. Marquand, K. Rubia, K. Chantiluke, Anna B. Smith, Ana Cubillo, C. Blain, M. Filippone
In clinical neuroimaging applications where subjects belong to one of multiple classes of disease states and multiple imaging sources are available, the aim is to achieve accurate classification while assessing the importance of the sources in the classification task. This work proposes fully Bayesian multiple-class multiple-kernel learning based on Gaussian Processes, as it offers flexible classification capabilities and a sound quantification of uncertainty in parameter estimates and predictions. Exact inference of parameters and accurate quantification of uncertainty in Gaussian Process models, however, pose a computationally challenging problem. This paper proposes the application of advanced inference techniques based on Markov chain Monte Carlo and unbiased estimates of the marginal likelihood, and demonstrates their ability to carry out inference accurately and efficiently on synthetic data and real clinical neuroimaging data. These results are important as they advance work toward computationally feasible fully Bayesian models for a wide range of real-world applications.
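The inference machinery can be illustrated with a generic pseudo-marginal Metropolis-Hastings loop on a toy model: the acceptance ratio uses a noisy but unbiased estimate of the marginal likelihood (here a simple importance-sampling estimate standing in for the GP-model estimator), and the stored estimate is carried along with the chain state. The model, observation and proposal scale below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = 1.5                                        # hypothetical scalar observation

def marginal_lik_estimate(theta, n_particles=32):
    """Unbiased importance-sampling estimate of p(y | theta) for a toy latent model:
    y ~ N(z, 1), z ~ N(theta, 1). Stands in for the GP-model estimator in the paper."""
    z = rng.normal(theta, 1.0, size=n_particles)               # prior draws of the latent z
    w = np.exp(-0.5 * (y_obs - z) ** 2) / np.sqrt(2 * np.pi)   # likelihood weights
    return w.mean()                                            # unbiased estimate of p(y|theta)

def log_prior(theta):
    return -0.5 * theta ** 2                                   # N(0, 1) prior, up to a constant

# Pseudo-marginal Metropolis-Hastings: plug the *estimated* likelihood into the usual
# acceptance ratio and keep the estimate with the state; the chain still targets the
# exact posterior over theta.
theta, est = 0.0, marginal_lik_estimate(0.0)
samples = []
for _ in range(5000):
    prop = theta + 0.5 * rng.standard_normal()
    prop_est = marginal_lik_estimate(prop)
    log_alpha = (np.log(prop_est) + log_prior(prop)) - (np.log(est) + log_prior(theta))
    if np.log(rng.uniform()) < log_alpha:
        theta, est = prop, prop_est
    samples.append(theta)
```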
Citations: 1
3D Face Reconstruction via Feature Point Depth Estimation and Shape Deformation
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.392
Quan Xiao, Lihua Han, Peizhong Liu
Since a human face can be represented by a few feature points (FPs) with little redundant information, and can be calculated as a linear combination of a small number of prototypical faces, we propose a two-step 3D face reconstruction approach comprising FP depth estimation and shape deformation. The proposed approach can reconstruct a realistic 3D face from a 2D frontal face image. In the first step, a coupled dictionary learning method based on sparse representation is employed to explore the underlying mappings between 2D and 3D training FPs, and the depth of the FPs is then estimated. In the second step, a novel shape deformation method reconstructs the 3D face by combining a small number of the most relevant faces deformed according to the estimated FPs. The approach can explore the distributions of 2D and 3D faces and the underlying mappings between them well, because human faces are represented by low-dimensional FPs and their distributions are described by sparse representations. Moreover, it is flexible, since changes can be made at any step. Extensive experiments are conducted on the BJUT_3D database, and the results validate the effectiveness of the proposed approach.
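A hedged sketch of the coupled-dictionary depth-estimation step is shown below: the query's 2D feature points are sparse-coded over a 2D training dictionary and the coefficients are reused on the coupled depth dictionary. The dictionaries here are random placeholders, and the shape-deformation step is omitted.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

# Hypothetical training data: columns are training faces, rows are stacked feature-point
# coordinates. D2 holds 2D (x, y) FPs, D3 the corresponding per-FP depths.
rng = np.random.default_rng(0)
n_fps, n_train = 68, 200
D2 = rng.standard_normal((2 * n_fps, n_train))   # coupled 2D dictionary
D3 = rng.standard_normal((n_fps, n_train))       # coupled depth dictionary

def estimate_depth(fp2d, sparsity=10):
    """Sparse-code the query 2D feature points over the 2D dictionary and reuse the
    coefficients on the coupled depth dictionary (simplified coupled-dictionary step)."""
    alpha = orthogonal_mp(D2, fp2d, n_nonzero_coefs=sparsity)
    return D3 @ alpha                             # estimated depth at each feature point

query_fps = rng.standard_normal(2 * n_fps)        # a new frontal face's 2D feature points
depth = estimate_depth(query_fps)
```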
Citations: 9
Representation Learning for Contextual Object and Region Detection in Remote Sensing
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.637
Orhan Firat, G. Can, F. Yarman-Vural
The performance of object recognition and classification on remote sensing imagery is highly dependent on the quality of the extracted features, the amount of labelled data and the priors defined for contextual models. In this study, we examine representation learning opportunities for remote sensing. First, we address the localization of contextual cues for complex object detection using disentangling factors learnt from a small amount of labelled data; the complex object, which consists of several sub-parts, is further represented under the Conditional Markov Random Fields framework. As a second task, end-to-end target detection with convolutional sparse auto-encoders (CSA) trained on a large amount of unlabelled data is analysed. The proposed methodologies are tested on a complex airfield detection problem using Conditional Random Fields, and on the recognition of dispersal areas, park areas, taxi routes and airplanes using CSA. The method is also tested on the detection of dry docks in harbours. The performance of the proposed method is compared with standard feature engineering methods and found competitive with currently used rule-based and supervised methods.
Citations: 24
Compact Signature-Based Compressed Video Matching Using Dominant Color Profiles (DCP)
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.674
Saddam Bekhet, Amr Ahmed
This paper presents a novel technique for efficient and generic matching of compressed video shots, through compact signatures extracted directly without decompression. The compact signature is based on the Dominant Color Profile (DCP), a sequence of dominant colors extracted and arranged as a sequence of spikes, in analogy to the human retinal representation of a scene. The proposed signature represents a given video shot with ~490 integer values, facilitating real-time processing to retrieve a maximum set of matching videos. The technique works directly on MPEG-compressed videos, without full decompression, as it uses the DC-image as the basis for extracting color features. The DC-image has a highly reduced size while retaining most visual aspects, and provides high performance compared with the full I-frame. Experiments on various standard datasets show the promising performance of the proposed technique, in both accuracy and computational complexity.
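A simplified sketch of a DCP-style signature and a naive matching distance is given below, assuming tiny DC-image-like frames are already available as arrays; the paper's spike arrangement and matching procedure are more elaborate, and all sizes here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def dominant_color_profile(frames, k=3):
    """DCP-style signature sketch: for each small, DC-image-like frame, keep its
    k dominant colors ordered by how many pixels they cover."""
    profile = []
    for frame in frames:                                  # frame: (h, w, 3) uint8
        pixels = frame.reshape(-1, 3).astype(np.float64)
        km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(pixels)
        order = np.argsort(-np.bincount(km.labels_, minlength=k))
        profile.append(km.cluster_centers_[order].astype(np.int32).ravel())
    return np.concatenate(profile)                        # compact integer signature

def signature_distance(sig_a, sig_b):
    """Naive L1 distance between two signatures (the paper's spike matching is richer)."""
    n = min(len(sig_a), len(sig_b))
    return int(np.abs(sig_a[:n] - sig_b[:n]).sum())

# Two toy "shots" of tiny random frames, standing in for MPEG DC-images.
rng = np.random.default_rng(0)
shot_a = [rng.integers(0, 256, size=(9, 12, 3), dtype=np.uint8) for _ in range(5)]
shot_b = [rng.integers(0, 256, size=(9, 12, 3), dtype=np.uint8) for _ in range(5)]
dist = signature_distance(dominant_color_profile(shot_a), dominant_color_profile(shot_b))
```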
Citations: 6
Learning Flexible Binary Code for Linear Projection Based Hashing with Random Forest
Pub Date : 2014-12-08 DOI: 10.1109/ICPR.2014.464
Shuze Du, Wei Zhang, Shifeng Chen, Y. Wen
Existing linear-projection-based hashing methods have made much progress in finding the approximate nearest neighbor(s) of a given query. They perform well when using short codes, but their code length depends on the original data dimension, so their performance cannot be further improved with a higher number of bits for low-dimensional data. In addition, for high-dimensional data, producing each bit with a sign function is not a good choice. In this paper, we propose a novel random-forest-based approach to cope with these shortcomings. The bits are obtained by recording the paths a point traverses through each tree in the forest. We then propose a new metric to calculate the similarity between any two codes. Experimental results on two large benchmark datasets show that our approach outperforms its counterparts and demonstrate its superiority over existing state-of-the-art hashing methods for descriptor retrieval.
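A minimal sketch of path-based forest hashing is shown below, assuming a supervised random forest and using sklearn's decision_path to record which nodes each point visits; the paper's code construction and similarity metric may differ in detail, and the data is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical descriptor data; in the paper the codes index image descriptors.
X, y = make_classification(n_samples=500, n_features=64, random_state=0)
forest = RandomForestClassifier(n_estimators=10, max_depth=6, random_state=0).fit(X, y)

def forest_code(samples):
    """Binary code per sample: which tree nodes the sample visits on its root-to-leaf
    paths, concatenated over all trees (a simplified path-recording scheme)."""
    indicator, _ = forest.decision_path(samples)
    return np.asarray(indicator.todense(), dtype=np.uint8)

def path_similarity(code_a, code_b):
    """Similarity between two codes as the number of shared visited nodes
    (a stand-in for the paper's metric)."""
    return int(np.sum(code_a & code_b))

codes = forest_code(X)
sim = path_similarity(codes[0], codes[1])
```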
Citations: 3