
2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA): Latest Publications

Pain recognition with camera photoplethysmography
Viktor Kessler, Patrick Thiam, Mohammadreza Amirian, F. Schwenker
In recent years, much effort has gone into predicting a participant's heart rate from the video channel with remote photoplethysmography (rPPG), but only a few authors have used it as a biosignal for classification tasks such as stress detection. In this work, we present the rPPG signal as a new modality for pain classification and evaluate the benefit of the three color channels (red, green, blue) of the rPPG signal. In short, the rPPG signal is filtered in multiple frequency ranges to extract the heart rate and the respiration rate as biophysiological signals. Pain is then classified with a Support Vector Machine (SVM) and a Random Forest classifier. The performance is compared to the electrocardiogram (ECG) and respiration signals from a biosignal amplifier and to facial landmark features from the video. The results show that the rPPG signal can be used for pain classification, especially its low-frequency components.
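A minimal sketch of the band-filtering and classification stages described above, assuming a 25 fps camera, common respiration and heart-rate band edges, and simple spectral features; these settings are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

FS = 25.0  # assumed video frame rate (Hz)

def bandpass(signal, low, high, fs=FS, order=3):
    """Zero-phase Butterworth band-pass filter."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, signal)

def band_features(rppg):
    """Spectral features from the respiration and heart-rate bands."""
    feats = []
    for low, high in [(0.1, 0.5), (0.7, 3.5)]:  # assumed band edges (Hz)
        band = bandpass(rppg, low, high)
        f, pxx = welch(band, fs=FS, nperseg=min(256, len(band)))
        feats += [f[np.argmax(pxx)], pxx.max(), band.std()]
    return np.array(feats)

def train_classifiers(rppg_signals, labels):
    """Fit SVM and Random Forest on per-signal band features."""
    X = np.vstack([band_features(s) for s in rppg_signals])
    svm = SVC(kernel="rbf").fit(X, labels)
    forest = RandomForestClassifier(n_estimators=100).fit(X, labels)
    return svm, forest
```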
Citations: 19
A new semi-supervised method for image co-segmentation
Rachida Es-salhi, I. Daoudi, H. Ouardi
Image co-segmentation addresses the problem of simultaneously extracting the common targets from a set of related images. However, designing a robust and efficient co-segmentation algorithm is challenging because of the variety and complexity of objects and backgrounds. In this paper, we propose a new semi-supervised method to extract foreground objects from an image collection. The proposed method is composed of three tasks: 1) object proposal generation, 2) object prior propagation, and 3) foreground extraction. The main idea of this paper is to transfer the segmentation from a subset of training images to test images. Comparison experiments conducted on the public datasets iCoseg and MSRC demonstrate the performance of the proposed method.
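As a toy illustration of the prior-propagation step (task 2), the sketch below transfers foreground priors from a labeled training subset to a test image via global-feature nearest neighbors. The color-histogram descriptor and the mask-averaging scheme are assumptions for illustration, not the paper's actual features or propagation method.

```python
import numpy as np

def color_hist(img, bins=8):
    """Global RGB histogram of an HxWx3 uint8 image, L1-normalized."""
    h, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                          range=((0, 256),) * 3)
    h = h.ravel()
    return h / (h.sum() + 1e-9)

def propagate_prior(test_img, train_imgs, train_masks, k=3):
    """Average the masks of the k most similar training images.

    Assumes all images and masks share one resolution; returns a
    per-pixel foreground probability map in [0, 1].
    """
    q = color_hist(test_img)
    dists = [np.linalg.norm(q - color_hist(t)) for t in train_imgs]
    nearest = np.argsort(dists)[:k]
    return np.mean([train_masks[i].astype(float) for i in nearest], axis=0)
```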
Citations: 1
Enlarging the discriminability of bag-of-words representations with deep convolutional features
D. Manger, D. Willersinn
In this work, we propose an extension of established image retrieval models based on the bag-of-words representation, i.e. models which quantize local features such as SIFT to leverage an inverted file indexing scheme for speedup. Since the quantization of local features impairs their discriminability, the ability to retrieve the database images that show the same object or scene as a given query image decreases as the number of images in the database grows. We address this issue by extending a quantized local feature with information from its local spatial neighborhood, incorporating a representation based on pooled features from deep convolutional neural network layer outputs. Using four public datasets, we evaluate both the discriminability of the representation and its overall performance in a large-scale image retrieval setup.
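The sketch below illustrates the core idea under stated assumptions: a quantized visual word is augmented with a pooled CNN descriptor of its spatial neighborhood before being stored in the inverted file. The feature-map source, window size, and max-pooling choice are assumptions, not the paper's exact design.

```python
import numpy as np

def augment_word(word_id, kp_xy, feat_map, window=7):
    """Attach a pooled CNN neighborhood descriptor to a visual word.

    feat_map: CxHxW CNN layer output; kp_xy: keypoint (x, y) in map coords.
    """
    c, h, w = feat_map.shape
    x, y = kp_xy
    half = window // 2
    patch = feat_map[:, max(0, y - half):min(h, y + half + 1),
                        max(0, x - half):min(w, x + half + 1)]
    pooled = patch.reshape(c, -1).max(axis=1)    # max-pool over the window
    pooled /= np.linalg.norm(pooled) + 1e-9      # L2-normalize
    return word_id, pooled  # pooled vector refines inverted-file matches
```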
Citations: 2
Two-steps perceptual important points estimator in 8-connected curves from handwritten signature
M. A. Ferrer-Ballester, Moisés Díaz, C. Carmona-Duarte
Estimating the salient points of 8-connected curves from handwritten signatures is a difficult task due to their relation to the writer's neuromotor system. This paper addresses the topic by proposing a two-step perceptual important point estimation method: the first step estimates the sharper salient points through a curvature analysis at multiple scales, whereas the second step estimates the smoother salient points by fitting circular shapes between the salient points estimated in step one. In this approach, both the sharper and the smoother salient points represent the set of perceptual important points of an 8-connected signature trajectory. Our validations, conducted on 2,112 signatures from 132 users of the BiosecurID database, focus on i) evaluating the number of estimated perceptual important points; ii) evaluating their locations in the trajectory; and iii) evaluating the accuracy of the signature duration estimated from the number of perceptual important points. The obtained results are encouraging for new developments in handwriting analysis based on this procedure.
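A minimal sketch of step one follows: points whose curvature is a local maximum at every scale are kept as sharp salient points. The scales and threshold are assumed values, and the circle-based second step is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def sharp_salient_points(x, y, scales=(2, 4, 8), thresh=0.05):
    """Indices of curvature local maxima shared across all scales."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    keep = np.ones(len(x), dtype=bool)
    for s in scales:
        xs, ys = gaussian_filter1d(x, s), gaussian_filter1d(y, s)
        dx, dy = np.gradient(xs), np.gradient(ys)
        ddx, ddy = np.gradient(dx), np.gradient(dy)
        # curvature of the smoothed planar curve at scale s
        k = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2 + 1e-9) ** 1.5
        local_max = (k > np.r_[k[1:], 0]) & (k > np.r_[0, k[:-1]]) & (k > thresh)
        keep &= local_max
    return np.flatnonzero(keep)
```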
Citations: 5
A new latent generalized dirichlet allocation model for image classification
Koffi Eddy Ihou, N. Bouguila
In response to the limitations of LDA in topic modeling and large-scale applications, several extensions using flexible priors have been introduced to address the problem of topic correlation. Models such as CTM, PAM, GD-LDA, and LGDA have been able to explore and capture semantic relationships between topics. However, many of these models suffer from incomplete generative processes, which affects inference efficiency. In addition, since these traditional inference techniques carry major limitations, the new approach in this paper, CVB-LGDA, extends the state of the art: it reconciles a complete generative process with a robust inference technique in a topic correlation framework. Its performance in image classification shows its robustness.
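For intuition only, a draw from the generalized Dirichlet prior that underlies LGDA-style models can be built from independent Beta variates via stick breaking; a small sketch follows. This is not the paper's CVB inference procedure.

```python
import numpy as np

def sample_generalized_dirichlet(alpha, beta, rng=None):
    """Draw theta ~ GD(alpha, beta) over K+1 topics from K Beta variates."""
    if rng is None:
        rng = np.random.default_rng()
    remaining, theta = 1.0, []
    for a, b in zip(alpha, beta):
        v = rng.beta(a, b)
        theta.append(remaining * v)   # mass assigned to this topic
        remaining *= 1.0 - v          # mass left for later topics
    theta.append(remaining)           # probabilities sum to 1
    return np.array(theta)
```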
Citations: 12
Local radon descriptors for image search
Morteza Babaie, H. Tizhoosh, Seyed Amin Khatami, M. Shiri
The Radon transform and its inverse are important techniques in medical imaging tasks. Recently, there has been renewed interest in the Radon transform for applications such as content-based medical image retrieval. However, all studies so far have used the Radon transform as a global or quasi-global image descriptor by extracting projections of the whole image or of large sub-images. This paper attempts to show that dense sampling to generate a histogram of local Radon projections has a much higher discrimination capability than the global approach. We introduce the Local Radon Descriptor (LRD) and apply it to the IRMA dataset, which contains 14,410 x-ray images, as well as to the INRIA Holidays dataset with 1,990 images. Our results show a significant improvement in retrieval performance when using LRD versus its global version. We also demonstrate that LRD can deliver results comparable to well-established descriptors like LBP and HOG.
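A rough sketch of a local Radon descriptor in this spirit: densely sampled patches are projected at a few angles and the projections concatenated. The patch size, stride, angles, and normalization are assumptions, not the paper's exact LRD recipe.

```python
import numpy as np
from skimage.transform import radon

def local_radon_descriptor(img, patch=16, stride=8, angles=(0, 45, 90, 135)):
    """Concatenate normalized per-patch Radon projections of a 2-D image."""
    desc = []
    for y in range(0, img.shape[0] - patch + 1, stride):
        for x in range(0, img.shape[1] - patch + 1, stride):
            p = img[y:y + patch, x:x + patch].astype(float)
            # one projection per angle, columns of `proj`
            proj = radon(p, theta=list(angles), circle=False)
            v = proj.ravel()
            desc.append(v / (np.linalg.norm(v) + 1e-9))
    return np.concatenate(desc)
```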
Citations: 18
Convolutional neural networks for histopathology image classification: Training vs. Using pre-trained networks
Brady Kieffer, Morteza Babaie, S. Kalra, H. Tizhoosh
We explore the problem of classification within a medical image dataset based on a feature vector extracted from the deepest layer of pre-trained Convolutional Neural Networks. We have used feature vectors from several pre-trained structures, with and without transfer learning, to evaluate the performance of pre-trained deep features versus CNNs trained from scratch on that specific dataset, as well as the impact of transfer learning with a small number of samples. All experiments are done on the Kimia Path24 dataset, which consists of 27,055 histopathology training patches in 24 tissue texture classes along with 1,325 test patches for evaluation. The results show that pre-trained networks are quite competitive against training from scratch. Moreover, fine-tuning does not seem to add any tangible improvement for VGG16 that would justify additional training, while we observed considerable improvement in retrieval and classification accuracy when we fine-tuned the Inception structure.
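A minimal sketch of the pre-trained deep-feature baseline: the deepest convolutional output of an ImageNet network is pooled into a feature vector and a simple classifier is trained on top. The choice of VGG16 weights, global-average pooling, and an SVM here is an assumption for illustration.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC

weights = models.VGG16_Weights.IMAGENET1K_V1
backbone = models.vgg16(weights=weights).features.eval()
prep = weights.transforms()  # resize/normalize as during pre-training

@torch.no_grad()
def deep_feature(pil_img):
    x = prep(pil_img).unsqueeze(0)           # 1x3xHxW tensor
    fmap = backbone(x)                        # deepest conv feature map
    return fmap.mean(dim=(2, 3)).squeeze(0)   # global-average-pooled vector

# Usage: X = [deep_feature(p).numpy() for p in patches]; SVC().fit(X, y)
```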
Citations: 103
Educational video classification by using a transcript to image transform and supervised learning
Houssem Chatbri, Marlon Oliveira, Kevin McGuinness, S. Little, K. Kameyama, P. Kwan, Alistair Sutherland, N. O’Connor
In this work, we present a method for automatic topic classification of educational videos using a speech transcript transform. Our method works as follows: first, speech recognition is used to generate video transcripts. Then, the transcripts are converted into images using a statistical co-occurrence transformation that we designed. Finally, a classifier is used to produce video category labels for a transcript image input. For our classifiers, we report results using a convolutional neural network (CNN) and a principal component analysis (PCA) model. In order to evaluate our method, we used the Khan Academy on a Stick dataset, which contains 2,545 videos, each labeled with one or two of 13 categories. Experiments show that our method is effective and strongly competitive against other supervised learning-based methods.
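A toy version of a transcript-to-image transform is sketched below: the transcript's most frequent words index a co-occurrence matrix that is then treated as a grayscale image. The vocabulary size and context window are assumptions about the paper's transform.

```python
import numpy as np
from collections import Counter

def transcript_to_image(tokens, vocab_size=64, window=2):
    """Map a tokenized transcript to a normalized co-occurrence 'image'."""
    vocab = [w for w, _ in Counter(tokens).most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    img = np.zeros((vocab_size, vocab_size))
    for i, w in enumerate(tokens):
        if w not in index:
            continue
        for u in tokens[max(0, i - window):i]:  # words in the left context
            if u in index:
                img[index[w], index[u]] += 1
                img[index[u], index[w]] += 1
    return img / (img.max() + 1e-9)  # input for the CNN or PCA classifier
```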
Citations: 5
Bayesian optimization for refining object proposals
A. Rhodes, Jordan M. Witte, B. Jedynak, Melanie Mitchell
We develop a general-purpose algorithm using a Bayesian optimization framework for the efficient refinement of object proposals. While recent research has achieved substantial progress in object localization and related objectives in computer vision, current state-of-the-art object localization procedures remain encumbered by inefficiency and inaccuracy. We present a novel, computationally efficient method for refining inaccurate bounding-box proposals for a target object using Bayesian optimization. Offline, image features from a convolutional neural network are used to train a model that predicts an object proposal's offset distance from a target object. Online, this model is used in a Bayesian active search to improve inaccurate object proposals. In experiments, we compare our approach to a state-of-the-art bounding-box regression method for localization refinement of pedestrian object proposals.
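The sketch below shows the online stage under stated assumptions: Bayesian optimization with a Gaussian-process surrogate and expected improvement searches over box offsets, with a user-supplied `predicted_distance` function standing in for the paper's learned offset predictor. The kernel, candidate sampling, and search ranges are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expected_improvement(mu, sigma, best):
    """EI for minimization: expected drop below the best distance so far."""
    z = (best - mu) / (sigma + 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def refine_box(box, predicted_distance, n_iter=20, rng=np.random.default_rng(0)):
    """Shift/scale `box` = (x, y, w, h) to minimize the predicted distance."""
    X = [np.zeros(4)]                            # offsets evaluated so far
    Y = [predicted_distance(box + X[0])]
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0))
    for _ in range(n_iter):
        gp.fit(np.array(X), np.array(Y))
        cand = rng.uniform(-20, 20, size=(200, 4))   # candidate offsets (px)
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, min(Y)))]
        X.append(x_next)
        Y.append(predicted_distance(box + x_next))
    return box + X[int(np.argmin(Y))]
```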
Citations: 1
Offline handwritten signature verification — Literature review
Luiz G. Hafemann, R. Sabourin, Luiz Oliveira
The area of Handwritten Signature Verification has been broadly researched in the last decades but remains an open research problem. The objective of signature verification systems is to discriminate whether a given signature is genuine (produced by the claimed individual) or a forgery (produced by an impostor). This has proven to be a challenging task, particularly in the offline (static) scenario, which uses images of scanned signatures and where dynamic information about the signing process is not available. Many advancements have been proposed in the literature in the last 5-10 years, most notably the application of Deep Learning methods to learn feature representations from signature images. In this paper, we present how the problem has been handled over the past few decades, analyze the recent advancements in the field, and discuss potential directions for future research.
Citations: 179