
Latest Publications: IPSJ Transactions on Computer Vision and Applications

Neural 1D Barcode Detection Using the Hough Transform
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.1
Alessandro Zamberletti, I. Gallo, S. Albertini, L. Noce
Mobile barcode-reading applications, which identify products from pictures acquired by mobile devices, are widely used by customers all over the world to perform online price comparisons or to access reviews written by other customers. Most currently available 1D barcode reading applications focus on effectively decoding barcodes and treat the underlying detection task as a side problem to be solved with general-purpose object detection methods. However, most mobile devices do not meet the minimum working requirements of those complex general-purpose object detection algorithms, and most of the efficient, specifically designed 1D barcode detection algorithms require user interaction to work properly. In this work, we present a novel method for 1D barcode detection in camera-captured images, based on a supervised machine learning algorithm that identifies the characteristic visual patterns of 1D barcodes' parallel bars in the two-dimensional Hough Transform space of the processed images. The method we propose is angle invariant, requires no user interaction, and can be effectively executed on a mobile device; it achieves excellent results on two standard 1D barcode datasets: the WWU Muenster Barcode Database and the ArTe-Lab 1D Medium Barcode Dataset. Moreover, we show that it is possible to enhance the performance of a state-of-the-art 1D barcode reading library by coupling it with our detection method.
{"title":"Neural 1D Barcode Detection Using the Hough Transform","authors":"Alessandro Zamberletti, I. Gallo, S. Albertini, L. Noce","doi":"10.2197/ipsjtcva.7.1","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.1","url":null,"abstract":"Barcode reading mobile applications to identify products from pictures acquired by mobile devices are widely used by customers from all over the world to perform online price comparisons or to access reviews written by other customers. Most of the currently available 1D barcode reading applications focus on effectively decoding barcodes and treat the underlying detection task as a side problem that needs to be solved using general purpose object detection methods. However, the majority of mobile devices do not meet the minimum working requirements of those complex general purpose object detection algorithms and most of the efficient specifically designed 1D barcode detection algorithms require user interaction to work properly. In this work, we present a novel method for 1D barcode detection in camera captured images, based on a supervised machine learning algorithm that identifies the characteristic visual patterns of 1D barcodes’ parallel bars in the two-dimensional Hough Transform space of the processed images. The method we propose is angle invariant, requires no user interaction and can be effectively executed on a mobile device; it achieves excellent results for two standard 1D barcode datasets: WWU Muenster Barcode Database and ArTe-Lab 1D Medium Barcode Dataset. Moreover, we prove that it is possible to enhance the performance of a state-of-the-art 1D barcode reading library by coupling it with our detection method.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"2 1","pages":"1-9"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85591940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 16
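The detection cue the abstract describes — the parallel bars of a 1D barcode collapsing into a column of aligned peaks at one angle in Hough space — can be sketched without the paper's trained network. A minimal NumPy illustration; the `hough_accumulator` and `barcode_angle_score` helpers, the peak threshold, and the toy image are all illustrative assumptions, not the authors' method:

```python
import numpy as np

def hough_accumulator(edges, n_theta=180):
    """Vote each edge pixel into a (rho, theta) accumulator."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
    ys, xs = np.nonzero(edges)
    for t, theta in enumerate(thetas):
        # rho = x cos(theta) + y sin(theta), shifted to a non-negative index
        rhos = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int) + diag
        np.add.at(acc[:, t], rhos, 1)
    return acc

def barcode_angle_score(edges, n_theta=180):
    """A barcode yields many strong, distinct rho-peaks at a single theta:
    score each angle by the number of prominent peaks in its column."""
    acc = hough_accumulator(edges, n_theta)
    threshold = 0.5 * acc.max()          # illustrative peak threshold
    peaks_per_angle = (acc > threshold).sum(axis=0)
    best = int(peaks_per_angle.argmax())
    return best, int(peaks_per_angle[best])

# Toy example: vertical bars every 4 pixels mimic a 1D barcode's edge map.
edges = np.zeros((64, 64), dtype=bool)
edges[:, ::4] = True
angle, n_bars = barcode_angle_score(edges)
print(f"dominant angle: {angle} deg, parallel-line peaks: {n_bars}")
```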
Automatic Martian Dust Storm Detection from Multiple Wavelength Data Based on Decision Level Fusion
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.79
Keisuke Maeda, Takahiro Ogawa, M. Haseyama
This paper presents automatic Martian dust storm detection from multiple wavelength data based on decision-level fusion. In our proposed method, visual features are first extracted from the multiple wavelength data, and optimal features are selected for Martian dust storm detection with the minimal-Redundancy-Maximal-Relevance algorithm. Second, the selected visual features are used to train Support Vector Machine classifiers, one constructed for each kind of data. Furthermore, as the main contribution of this paper, the proposed method integrates the multiple detection results obtained from the heterogeneous data by decision-level fusion, taking each classifier's detection performance into account to obtain accurate final detection results. Consequently, the proposed method realizes successful Martian dust storm detection.
{"title":"Automatic Martian Dust Storm Detection from Multiple Wavelength Data Based on Decision Level Fusion","authors":"Keisuke Maeda, Takahiro Ogawa, M. Haseyama","doi":"10.2197/ipsjtcva.7.79","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.79","url":null,"abstract":"This paper presents automatic Martian dust storm detection from multiple wavelength data based on decision level fusion. In our proposed method, visual features are first extracted from multiple wavelength data, and optimal features are selected for Martian dust storm detection based on the minimal-Redundancy-Maximal-Relevance algorithm. Second, the selected visual features are used to train the Support Vector Machine classifiers that are constructed on each data. Furthermore, as a main contribution of this paper, the proposed method integrates the multiple detection results obtained from heterogeneous data based on decision level fusion, while considering each classifier’s detection performance to obtain accurate final detection results. Consequently, the proposed method realizes successful Martian dust storm detection.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"51 7 1","pages":"79-83"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83401976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
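As a rough illustration of the fusion step, the sketch below trains one SVM per wavelength band and combines their probability outputs with accuracy-derived weights. The synthetic data, the scikit-learn usage, and the specific accuracy-weighted voting rule are assumptions standing in for the paper's actual fusion scheme:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for feature vectors from two wavelength bands.
n = 400
y = rng.integers(0, 2, n)                             # 1 = dust storm, 0 = clear
band_a = y[:, None] * 0.8 + rng.normal(size=(n, 5))   # informative band
band_b = y[:, None] * 0.3 + rng.normal(size=(n, 5))   # weaker band

def fit_band(X, y):
    """Train an SVM on one band; its validation accuracy becomes its weight."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
    clf = SVC(probability=True).fit(X_tr, y_tr)
    return clf, clf.score(X_val, y_val)

clf_a, w_a = fit_band(band_a, y)
clf_b, w_b = fit_band(band_b, y)

# Decision-level fusion: accuracy-weighted average of per-band probabilities.
p = (w_a * clf_a.predict_proba(band_a)[:, 1] +
     w_b * clf_b.predict_proba(band_b)[:, 1]) / (w_a + w_b)
fused = (p > 0.5).astype(int)
print("fused accuracy (on the toy data):", (fused == y).mean())
```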
Fast and Accurate Object Detection Based on Binary Co-occurrence Features
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.55
Mitsuru Ambai, Taketo Kimura, Chiori Sakai
In this paper, we propose a fast and accurate object detection algorithm based on binary co-occurrence features. In our method, co-occurrences of all possible pairs of binary elements in a block of binarized HOG are enumerated by logical operations, e.g., circular shift and XOR. This results in extremely fast co-occurrence extraction. Our experiments revealed that our method can process a VGA-size image at 64.6 fps, twice the camera frame rate (30 fps), on only a single CPU core (Intel Core i7-3820, 3.60 GHz), while at the same time achieving a higher classification accuracy than the original (real-valued) HOG on a pedestrian detection task.
{"title":"Fast and Accurate Object Detection Based on Binary Co-occurrence Features","authors":"Mitsuru Ambai, Taketo Kimura, Chiori Sakai","doi":"10.2197/ipsjtcva.7.55","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.55","url":null,"abstract":"In this paper, we propose a fast and accurate object detection algorithm based on binary co-occurrence features. In our method, co-occurrences of all the possible pairs of binary elements in a block of binarized HOG are enumerated by logical operations, i.g. circular shift and XOR. This resulted in extremely fast co-occurrence extraction. Our experiments revealed that our method can process a VGA-size image at 64.6 fps, that is two times faster than the camera frame rate (30 fps), on only a single core of CPU (Intel Core i7-3820 3.60 GHz), while at the same time achieving a higher classification accuracy than original (real-valued) HOG in the case of a pedestrian detection task.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"6 1","pages":"55-58"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76032269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
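The shift-and-logical-operation trick can be demonstrated in a few lines: circularly shifting a binarized block and applying one word-level operation compares every bit with its partner at a fixed offset, so all pairs at that offset are enumerated at once. A hedged NumPy sketch; the feature definition below (AND and XNOR counts per shift) illustrates the idea rather than the paper's exact encoding:

```python
import numpy as np

def binary_cooccurrence(block_bits):
    """Co-occurrence features of a binarized block.

    block_bits: 1-D uint8 array of 0/1 values (e.g., a binarized HOG block).
    For each circular shift s, a single logical operation over the whole
    vector compares every bit with the bit s positions away, enumerating
    all pairs at that offset at once.
    """
    n = len(block_bits)
    feats = []
    for s in range(1, n):
        shifted = np.roll(block_bits, s)
        both_set = block_bits & shifted       # AND: pairs that co-occur as 1
        agree = 1 - (block_bits ^ shifted)    # XNOR: pairs with equal bits
        feats.append((both_set.sum(), agree.sum()))
    return np.array(feats)

bits = (np.random.default_rng(1).random(32) > 0.5).astype(np.uint8)
print(binary_cooccurrence(bits)[:5])
```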
Computer Simulation of Color Confusion for Dichromats in Video Device Gamut under Proportionality Law
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.41
H. Fukuda, Shintaro Hara, K. Asakawa, H. Ishikawa, M. Noshiro, Mituaki Katuya
Dichromats are color-blind persons missing one of the three cone systems. We consider a computer simulation of color confusion for dichromats, for any color on any video device, which transforms the color in each pixel into a representative color among the set of its confusion colors. As a guiding principle of the simulation we adopt the proportionality law between pre-transformed and post-transformed colors, which ensures that identical colors are never transformed into two or more colors differing in more than intensity. We show that such a simulation algorithm obeying the proportionality law is unique for video displays whose gamut, projected onto the plane perpendicular to the color confusion axis in LMS space, is a hexagon. Almost all video displays, including sRGB, satisfy this condition, and we demonstrate this unique simulation on an sRGB video display. As a corollary, we show that it is impossible to build an appropriate algorithm if we instead demand the additivity law, which is mathematically stronger than the proportionality law and enables additive mixture among post-transformed colors as well as for dichromats.
{"title":"Computer Simulation of Color Confusion for Dichromats in Video Device Gamut under Proportionality Law","authors":"H. Fukuda, Shintaro Hara, K. Asakawa, H. Ishikawa, M. Noshiro, Mituaki Katuya","doi":"10.2197/ipsjtcva.7.41","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.41","url":null,"abstract":"Dichromats are color-blind persons missing one of the three cone systems. We consider a computer simulation of color confusion for dichromats for any colors on any video device, which transforms color in each pixel into a representative color among the set of its confusion colors. As a guiding principle of the simulation we adopt the proportionality law between the pre-transformed and post-transformed colors, which ensures that the same colors are not transformed to two or more different colors apart from intensity. We show that such a simulation algorithm with the proportionality law is unique for the video displays whose projected gamut onto the plane perpendicular to the color confusion axis in the LMS space is hexagon. Almost all video display including sRGB satisfy this condition and we demonstrate this unique simulation in sRGB video display. As a corollary we show that it is impossible to build an appropriate algorithm if we demand the additivity law, which is mathematically stronger than the proportionality law and enable the additive mixture among post-transformed colors as well as for dichromats.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"75 1","pages":"41-49"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86286249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 0
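The proportionality law is automatically satisfied by any simulation that is linear in LMS space, since scaling the input scales the output identically. The sketch below illustrates one such linear construction, using the protanope matrices of Viénot et al. (1999) as a stand-in; the paper's own gamut-aware choice of representative colors differs, so every constant here should be read as an assumption for illustration:

```python
import numpy as np

# Linearized-sRGB -> LMS matrix and the protanope L-replacement from
# Vienot et al. (1999); used only to illustrate a linear (hence
# proportionality-law-respecting) confusion-color mapping, not the
# paper's gamut-aware construction.
RGB2LMS = np.array([[17.8824,   43.5161,   4.11935],
                    [ 3.45565,  27.1554,   3.86714],
                    [ 0.0299566, 0.184309, 1.46709]])
LMS2RGB = np.linalg.inv(RGB2LMS)

# Protanope: the missing L response is rebuilt from M and S, collapsing
# each confusion line (parallel to the L axis) onto one representative.
PROTAN = np.array([[0.0, 2.02344, -2.52581],
                   [0.0, 1.0,      0.0],
                   [0.0, 0.0,      1.0]])

def simulate_protanopia(rgb_linear):
    """rgb_linear: (..., 3) array of linear RGB values in [0, 1]."""
    lms = rgb_linear @ RGB2LMS.T
    out = (lms @ PROTAN.T) @ LMS2RGB.T
    return np.clip(out, 0.0, 1.0)

# Proportionality check: halving the input halves the output.
c = np.array([0.8, 0.2, 0.1])
print(simulate_protanopia(0.5 * c))
print(0.5 * simulate_protanopia(c))
```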
Dithering-based Sampling and Weighted α-shapes for Local Feature Detection
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.189
Christos Varytimidis, Konstantinos Rapantzikos, Yannis Avrithis, S. Kollias
Local feature detection has been an essential part of many methods for computer vision applications like large-scale image retrieval, object detection, and tracking. Recently, structure-guided feature detectors have been proposed that exploit image edges to accurately capture local shape. Among them, the WαSH detector [Varytimidis et al., 2012] starts from sampled binary edges and exploits α-shapes, a computational geometry representation that describes local shape at different scales. In this work, we propose a novel image sampling method based on dithering smooth image functions other than intensity. Samples are extracted on image contours representing the underlying shapes, with the sampling density determined by image functions like the gradient or Hessian response rather than being fixed. We thoroughly evaluate the parameters of the method, and achieve state-of-the-art performance on a series of matching and retrieval experiments.
{"title":"Dithering-based Sampling and Weighted α-shapes for Local Feature Detection","authors":"Christos Varytimidis, Konstantinos Rapantzikos, Yannis Avrithis, S. Kollias","doi":"10.2197/ipsjtcva.7.189","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.189","url":null,"abstract":"Local feature detection has been an essential part of many methods for computer vision applications like large scale image retrieval, object detection, or tracking. Recently, structure-guided feature detectors have been proposed, exploiting image edges to accurately capture local shape. Among them, the WαSH detector [Varytimidis et al., 2012] starts from sampling binary edges and exploits α-shapes, a computational geometry representation that describes local shape in different scales. In this work, we propose a novel image sampling method, based on dithering smooth image functions other than intensity. Samples are extracted on image contours representing the underlying shapes, with sampling density determined by image functions like the gradient or Hessian response, rather than being fixed. We thoroughly evaluate the parameters of the method, and achieve state-of-the-art performance on a series of matching and retrieval experiments.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"4 1","pages":"189-200"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84697153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 2
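Density-proportional sampling by dithering is easy to demonstrate: error-diffuse a smooth response map (here the gradient magnitude) so that "on" pixels land more densely where the response is strong. A minimal Floyd–Steinberg sketch with a synthetic test image; this illustrates only the sampling idea, not the WαSH pipeline or the paper's weighted α-shapes:

```python
import numpy as np

def dither_samples(density):
    """Floyd-Steinberg error diffusion on a [0, 1] density map.

    Returns a boolean mask whose True pixels are the extracted samples;
    their local density tracks the input map, so stronger image responses
    (e.g., gradient magnitude) receive proportionally more samples.
    """
    d = density.astype(float).copy()
    h, w = d.shape
    out = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            old = d[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = bool(new)
            err = old - new                     # diffuse quantization error
            if x + 1 < w:               d[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     d[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               d[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: d[y + 1, x + 1] += err * 1 / 16
    return out

# Density from a synthetic gradient-magnitude map: a bright square whose
# boundary is the only high-gradient contour.
img = np.zeros((64, 64)); img[16:48, 16:48] = 1.0
gy, gx = np.gradient(img)
mag = np.hypot(gx, gy)
samples = dither_samples(mag / mag.max())
print("samples drawn (concentrated on the contour):", samples.sum())
```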
Facial Expression Recognition and Analysis: A Comparison Study of Feature Descriptors
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.104
Chun Fui Liew, T. Yairi
Facial expression recognition (FER) is a crucial technology and a challenging task for human–computer interaction. Previous methods have used different feature descriptors for FER, and there is a lack of comparative studies. In this paper, we aim to identify the best feature descriptor for FER by empirically evaluating five feature descriptors, namely Gabor, Haar, Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG), and Binary Robust Independent Elementary Features (BRIEF). We examine each feature descriptor with six classification methods, including k-Nearest Neighbors (k-NN), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Adaptive Boosting (AdaBoost), on four distinct facial expression datasets. In addition to test accuracies, we present confusion matrices of FER. We also analyze the effect of combined features and image resolution on FER performance. Our study indicates that the HOG descriptor works best for FER when the image resolution of a detected face is higher than 48×48 pixels.
{"title":"Facial Expression Recognition and Analysis: A Comparison Study of Feature Descriptors","authors":"Chun Fui Liew, T. Yairi","doi":"10.2197/ipsjtcva.7.104","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.104","url":null,"abstract":"Facial expression recognition (FER) is a crucial technology and a challenging task for human–computer interaction. Previous methods have been using different feature descriptors for FER and there is a lack of comparison study. In this paper, we aim to identify the best features descriptor for FER by empirically evaluating five feature descriptors, namely Gabor, Haar, Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG), and Binary Robust Independent Elementary Features (BRIEF) descriptors. We examine each feature descriptor by considering six classification methods, such as k-Nearest Neighbors (k-NN), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Adaptive Boosting (AdaBoost) with four unique facial expression datasets. In addition to test accuracies, we present confusion matrices of FER. We also analyze the effect of combined features and image resolutions on FER performance. Our study indicates that HOG descriptor works the best for FER when image resolution of a detected face is higher than 48×48 pixels.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"17 1","pages":"104-120"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89472112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 34
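One cell of such a comparison grid, HOG features with an SVM classifier, can be sketched with scikit-image and scikit-learn. The random stand-in faces, the 64×64 crop size, and the HOG parameters below are assumptions, not the paper's experimental setup; real experiments would use aligned face crops from the expression datasets:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Stand-in data: random "faces" only exercise the pipeline; accuracy on
# them is chance level by construction.
rng = np.random.default_rng(0)
faces = rng.random((60, 64, 64))
labels = np.repeat(np.arange(6), 10)     # six basic expressions, 10 each

def hog_features(img):
    # 9 orientations, 8x8 cells, 2x2 block normalization: a common HOG
    # configuration; the paper's exact parameters may differ.
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

X = np.stack([hog_features(f) for f in faces])
print("HOG dimensionality:", X.shape[1])
print("3-fold CV accuracy:", cross_val_score(SVC(), X, labels, cv=3).mean())
```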
Block-Propagative Background Subtraction System for UHDTV Videos
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.31
A. Beaugendre, S. Goto
Processing Ultra High Definition TV videos requires a lot of resources in terms of memory and computation time. In this paper we present a block-propagative background subtraction (BPBGS) method which spreads to neighboring blocks if a part of an object is detected on the borders of the current block. This allows us to avoid processing unnecessary areas that do not contain any object, saving memory and computation time. The results show that our method is particularly efficient on sequences in which objects occupy a small portion of the scene, even when there is a lot of background movement. At the same scale, our BPBGS runs much faster than state-of-the-art methods at a similar detection quality.
{"title":"Block-Propagative Background Subtraction System for UHDTV Videos","authors":"A. Beaugendre, S. Goto","doi":"10.2197/ipsjtcva.7.31","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.31","url":null,"abstract":"The process of Ultra High Definition TV videos requires a lot of resources in terms of memory and computation time. In this paper we consider a block-propagation background subtraction (BPBGS) method which spreads to neighboring blocks if a part of an object is detected on the borders of the current block. This allows us to avoid processing unnecessary areas which do not contain any object thus saving memory and computational time. The results show that our method is particularly efficient in sequences where objects occupy a small portion of the scene despite the fact that there are a lot of background movements. At same scale our BPBGS performs much faster than the state-of-art methods for a similar detection quality.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"74 1","pages":"31-34"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90604125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
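A compact sketch of the propagation mechanism: blocks are processed from seed positions, and whenever a block's detection mask touches one of its borders, the neighbor behind that border is queued. The block size, threshold, and plain frame-difference test below are illustrative stand-ins for the paper's actual background model:

```python
import numpy as np
from collections import deque

def bpbgs(frame, background, seeds, block=16, thresh=25):
    """Block-propagative background subtraction (illustrative sketch).

    Only seed blocks are examined at first; a block whose detection mask
    touches a border pushes the neighbor behind that border onto the
    queue, so processing spreads with the object and untouched background
    blocks are never computed.
    """
    h, w = frame.shape
    by, bx = h // block, w // block
    mask = np.zeros((h, w), dtype=bool)
    visited = set(seeds)
    queue = deque(seeds)
    while queue:
        i, j = queue.popleft()
        ys = slice(i * block, (i + 1) * block)
        xs = slice(j * block, (j + 1) * block)
        m = np.abs(frame[ys, xs].astype(int)
                   - background[ys, xs].astype(int)) > thresh
        if not m.any():
            continue
        mask[ys, xs] = m
        # Propagate to the neighbor behind any border the detection touches.
        for (di, dj), edge in [((-1, 0), m[0]), ((1, 0), m[-1]),
                               ((0, -1), m[:, 0]), ((0, 1), m[:, -1])]:
            ni, nj = i + di, j + dj
            if edge.any() and 0 <= ni < by and 0 <= nj < bx \
                    and (ni, nj) not in visited:
                visited.add((ni, nj))
                queue.append((ni, nj))
    return mask

bg = np.zeros((64, 64), dtype=np.uint8)
fr = bg.copy(); fr[20:44, 20:44] = 200        # a bright moving object
print(bpbgs(fr, bg, seeds=[(1, 1)]).sum(), "foreground pixels")
```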
The Maximal Self-dissimilarity Interest Point Detector
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.175
Federico Tombari, L. D. Stefano
We propose a novel interest point detector stemming from the intuition that image patches which are highly dissimilar over a relatively large extent of their surroundings hold the property of being repeatable and distinctive. This concept of contextual self-dissimilarity reverses the key paradigm of recent successful techniques such as the Local Self-Similarity descriptor and the Non-Local Means filter, which build upon the presence of similar rather than dissimilar patches. Moreover, our approach extends to contextual information the local self-dissimilarity notion embedded in established detectors of corner-like interest points, thereby achieving enhanced repeatability, distinctiveness and localization accuracy. As the key principle and machinery of our method are amenable to a variety of data kinds, including multi-channel images and organized 3D measurements, we delineate how to extend the basic formulation in order to deal with range and RGB-D images, such as those provided by consumer depth cameras.
{"title":"The Maximal Self-dissimilarity Interest Point Detector","authors":"Federico Tombari, L. D. Stefano","doi":"10.2197/ipsjtcva.7.175","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.175","url":null,"abstract":"We propose a novel interest point detector stemming from the intuition that image patches which are highly dissimilar over a relatively large extent of their surroundings hold the property of being repeatable and distinctive. This concept of contextual self-dissimilarity reverses the key paradigm of recent successful techniques such as the Local Self-Similarity descriptor and the Non-Local Means filter, which build upon the presence of similar rather than dissimilar patches. Moreover, our approach extends to contextual information the local self-dissimilarity notion embedded in established detectors of corner-like interest points, thereby achieving enhanced repeatability, distinctiveness and localization accuracy. As the key principle and machinery of our method are amenable to a variety of data kinds, including multi-channel images and organized 3D measurements, we delineate how to extend the basic formulation in order to deal with range and RGB-D images, such as those provided by consumer depth cameras.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"28 1","pages":"175-188"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81513428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 2
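The notion of contextual self-dissimilarity can be mocked up with a brute-force score: a patch is interesting only if it differs from every patch in a surrounding ring, i.e., its minimum SSD over the ring offsets is high. The patch and ring sizes and the SSD measure below are assumptions for illustration, not the paper's detector:

```python
import numpy as np

def self_dissimilarity_map(img, patch=5, ring=9):
    """Score each pixel by how dissimilar its patch is to the patches in a
    surrounding ring (minimum SSD over ring offsets): a crude stand-in for
    the paper's contextual self-dissimilarity measure."""
    h, w = img.shape
    r = patch // 2
    score = np.zeros((h, w))
    offsets = [(dy, dx) for dy in (-ring, 0, ring) for dx in (-ring, 0, ring)
               if (dy, dx) != (0, 0)]
    pad = ring + r
    for y in range(pad, h - pad):
        for x in range(pad, w - pad):
            p = img[y - r:y + r + 1, x - r:x + r + 1]
            ssds = [np.sum((p - img[y + dy - r:y + dy + r + 1,
                                    x + dx - r:x + dx + r + 1]) ** 2)
                    for dy, dx in offsets]
            score[y, x] = min(ssds)   # high only if unlike *all* context
    return score

img = np.zeros((48, 48)); img[22:26, 22:26] = 1.0   # an isolated blob
s = self_dissimilarity_map(img)
print("most self-dissimilar point:", np.unravel_index(s.argmax(), s.shape))
```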
Lower Body Pose Estimation in Team Sports Videos Using Label-Grid Classifier Integrated with Tracking-by-Detection
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.18
Masaki Hayashi, Kyoko Oshima, Masamoto Tanabiki, Y. Aoki
We propose a human lower-body pose estimation method for team sports videos that is integrated with a tracking-by-detection technique. The proposed Label-Grid classifier uses the grid histogram feature of the window reported by the tracker and estimates the position of a specific lower-body joint as the class label of a multiclass classifier whose classes correspond to candidate joint positions on the grid. By learning the various player poses and scales of Histogram of Oriented Gradients features within one team sport, our method can estimate poses even if the players appear motion-blurred and at low resolution, without requiring a motion-model regression or part-based model, which are popular vision-based human pose estimation techniques. Moreover, our method can estimate poses with part occlusions and non-upright side poses, which part-detector-based methods find difficult to estimate with only one model. Experimental results show the advantage of our method for side running poses and non-walking poses. The results also show the robustness of our method to the large variety of poses and scales in team sports videos.
{"title":"Lower Body Pose Estimation in Team Sports Videos Using Label-Grid Classifier Integrated with Tracking-by-Detection","authors":"Masaki Hayashi, Kyoko Oshima, Masamoto Tanabiki, Y. Aoki","doi":"10.2197/ipsjtcva.7.18","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.18","url":null,"abstract":"We propose a human lower body pose estimation method for team sport videos, which is integrated with tracking-by-detection technique. The proposed Label-Grid classifier uses the grid histogram feature of the tracked window from the tracker and estimates the lower body joint position of a specific joint as the class label of the multiclass classifiers, whose classes correspond to the candidate joint positions on the grid. By learning various types of player poses and scales of Histogram-of-Oriented Gradients features within one team sport, our method can estimate poses even if the players are motion-blurred and low-resolution images without requiring a motion-model regression or part-based model, which are popular vision-based human pose estimation techniques. Moreover, our method can estimate poses with part-occlusions and non-upright side poses, which part-detector-based methods find it difficult to estimate with only one model. Experimental results show the advantage of our method for side running poses and non-walking poses. The results also show the robustness of our method for a large variety of poses and scales in team sports videos.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"14 1","pages":"18-30"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82481270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
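The Label-Grid idea, turning joint localization into multiclass classification over grid cells of the tracked window, can be sketched with scikit-learn. The mean-intensity grid feature, window size, and synthetic "joint evidence" below are stand-ins for the paper's HOG-based grid histogram and real tracked windows:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

GRID = 4   # the tracked window is divided into a GRID x GRID label grid

def grid_histogram(window):
    """Mean intensity per grid cell: a stand-in for the paper's grid
    histogram of HOG-like responses inside the tracked window."""
    cells = window.reshape(GRID, window.shape[0] // GRID,
                           GRID, window.shape[1] // GRID)
    return cells.mean(axis=(1, 3)).ravel()

# Synthetic training set: each window contains a bright spot at the
# ground-truth joint position; the class label is that cell's index.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(500):
    cell = int(rng.integers(0, GRID * GRID))
    win = rng.random((32, 32)) * 0.2
    cy, cx = divmod(cell, GRID)
    win[cy * 8:(cy + 1) * 8, cx * 8:(cx + 1) * 8] += 1.0   # "joint" evidence
    X.append(grid_histogram(win)); y.append(cell)

clf = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))
# At test time the predicted class is directly a grid cell, i.e., a
# quantized joint position inside the tracker's window.
print("train accuracy:", clf.score(np.array(X), np.array(y)))
```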
Aircraft Detection by Deep Convolutional Neural Networks
Q1 Computer Science · Pub Date: 2015-01-01 · DOI: 10.2197/ipsjtcva.7.10
Xueyun Chen, Shiming Xiang, Cheng-Lin Liu, Chunhong Pan
Features play a crucial role in the performance of classifiers for object detection from high-resolution remote sensing images. In this paper, we implemented two types of deep learning methods, the deep convolutional neural network (DNN) and the deep belief net (DBN), and compared their performance with that of traditional methods (handcrafted features with a shallow classifier) on the task of aircraft detection. These methods learn robust features from a large set of training samples to obtain better performance. The depth of their layers (>6 layers) grants them the ability to extract stable and large-scale features from the image. Our experiments show that both deep learning methods reduce the false alarm rate of the traditional methods (HOG, LBP + SVM) by at least 40%, with the DNN performing slightly better than the DBN. We also fed several differently preprocessed versions of an image simultaneously to one DNN model, and found that this practice noticeably improves the model's performance with no added computational burden.
{"title":"Aircraft Detection by Deep Convolutional Neural Networks","authors":"Xueyun Chen, Shiming Xiang, Cheng-Lin Liu, Chunhong Pan","doi":"10.2197/ipsjtcva.7.10","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.10","url":null,"abstract":"Features play crucial role in the performance of classifier for object detection from high-resolution remote sensing images. In this paper, we implemented two types of deep learning methods, deep convolutional neural network (DNN) and deep belief net (DBN), comparing their performances with that of the traditional methods (handcrafted features with a shallow classifier) in the task of aircraft detection. These methods learn robust features from a large set of training samples to obtain a better performance. The depth of their layers (>6 layers) grants them the ability to extract stable and large-scale features from the image. Our experiments show both deep learning methods reduce at least 40% of the false alarm rate of the traditional methods (HOG, LBP+SVM), and DNN performs a little better than DBN. We also fed some multi-preprocessed images simultaneously to one DNN model, and found that such a practice helps to improve the performance of the model obviously with no extra-computing burden adding.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"5 1","pages":"10-17"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86847670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 18
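A small PyTorch stand-in shows the patch-classification setup such a detector rests on: a convolutional network scoring fixed-size remote-sensing patches as aircraft or background. The architecture and layer sizes below are illustrative assumptions, not the authors' network; the paper's multi-preprocessing trick would correspond to stacking preprocessed versions of a patch as extra input channels:

```python
import torch
import torch.nn as nn

class AircraftNet(nn.Module):
    """A small binary patch classifier in the spirit of the paper's DNN."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 5 * 5, 128), nn.ReLU(),
            nn.Linear(128, 2),             # aircraft / background logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))

net = AircraftNet()
patches = torch.randn(8, 3, 64, 64)        # stand-in remote-sensing patches
print(net(patches).shape)                  # torch.Size([8, 2])
```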