
2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA): Latest Publications

Extraction of saliency in images and video: Problems, methods and applications. A survey
J. Benois-Pineau, Mihai Mitrea
Rather than meeting theoretical, methodological and applicative expectations, the impressive number of state-of-the-art saliency-oriented studies raises new fundamental questions about the very nature of this psycho-cognitive process. Such questions encompass fundamental modeling aspects, from the dependency of saliency on the representation format to its potential relationship to other fundamental research areas, such as information theory. The present survey, structured according to three main saliency application fields (visual quality evaluation, watermarking and task-oriented computer vision), is meant to identify the latest trends of research.
DOI: 10.1109/IPTA.2017.8310116 (published 2017-11-28)
Citations: 2
Majorization-minimization algorithms for maximum likelihood estimation of magnetic resonance images
Qianyi Jiang, S. Moussaoui, J. Idier, G. Collewet, Mai Xu
This paper addresses maximum likelihood estimation of images corrupted by Rician noise, with the aim of proposing an efficient optimization method. The application example is the restoration of magnetic resonance images. Starting from the fact that the criterion to minimize is non-convex but unimodal, the main contribution of this work is an optimization scheme based on the majorization-minimization (MM) framework, after introducing a variable change that yields a strictly convex criterion. The resulting descent algorithm is compared to the classical MM descent algorithm, and its performance is assessed using synthetic and real MR images. Finally, by combining these two MM algorithms, two optimization strategies are proposed to improve the numerical efficiency of the image restoration at any signal-to-noise ratio.
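For context, the estimation problem can be sketched with the standard Rician observation model; the surrogate function and the variable change actually used by the authors are not given in the abstract, so only the generic criterion and MM update are shown. Here y_i are the observed magnitudes, x_i the unknown noise-free intensities and σ the noise level.

```latex
% Standard Rician observation model and generic MM iteration (illustrative only;
% the paper's specific surrogate and variable change are not reproduced here).
\begin{align}
  p(y_i \mid x_i) &= \frac{y_i}{\sigma^{2}}
      \exp\!\left(-\frac{y_i^{2} + x_i^{2}}{2\sigma^{2}}\right)
      I_0\!\left(\frac{x_i y_i}{\sigma^{2}}\right), \qquad y_i \ge 0, \\
  J(\mathbf{x}) &= \sum_{i}\left[\frac{x_i^{2}}{2\sigma^{2}}
      - \log I_0\!\left(\frac{x_i y_i}{\sigma^{2}}\right)\right]
      \quad\text{(terms independent of } \mathbf{x}\text{ dropped)}, \\
  \mathbf{x}^{(k+1)} &= \operatorname*{arg\,min}_{\mathbf{x}}
      Q\big(\mathbf{x}\mid\mathbf{x}^{(k)}\big),
      \quad\text{with } Q(\cdot\mid\mathbf{x}^{(k)}) \ge J(\cdot)
      \text{ and } Q(\mathbf{x}^{(k)}\mid\mathbf{x}^{(k)}) = J(\mathbf{x}^{(k)}).
\end{align}
```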
DOI: 10.1109/IPTA.2017.8310150 (published 2017-11-27)
Citations: 2
Dynamic hand gesture recognition based on 3D pattern assembled trajectories
Said Yacine Boulahia, É. Anquetil, F. Multon, R. Kulpa
Over the past few years, advances in commercial 3D sensors have substantially promoted research on dynamic hand gesture recognition. On the other hand, whole-body gesture recognition has also attracted increasing attention since the emergence of Kinect-like sensors. One may notice that both research topics deal with human-made motions and are likely to face similar challenges. In this paper, our aim is thus to evaluate the applicability of an action recognition feature set to model dynamic hand gestures using skeleton data. Furthermore, existing datasets are often composed of pre-segmented gestures that are performed with a single hand only. We therefore collected a more challenging dataset, which contains unsegmented streams of 13 hand gesture classes, performed with either a single hand or two hands. Our approach is first evaluated on an existing dataset, namely the DHG dataset, and then on our collected dataset. Better results compared to previous approaches are reported.
DOI: 10.1109/IPTA.2017.8310146 (published 2017-11-01)
Citations: 41
Effective keyframe extraction from RGB and RGB-D video sequences
Julien Valognes, Maria A. Amer, Niloufar Salehi Dastjerdi
The rapid increase in digital video content demands effective summarization techniques, especially with the creation of RGB-D videos. Keyframe extraction significantly reduces the amount of raw data in a video sequence. In this paper, we present a two-stage (histogram and filtering) keyframe extraction algorithm applicable to RGB and RGB-D videos. In the first stage, RGB and depth histogram similarities of consecutive frames are computed and candidate keyframes are extracted. In the second stage, we filter neighboring candidate keyframes based on the MAD of their Euclidean distance and their MSE. Subjective and objective experimental results show our algorithm effectively extracts keyframes from both RGB and RGB-D videos.
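As an illustration of the two-stage idea (not the authors' implementation), the sketch below keeps frames whose histogram-intersection similarity to the previous frame drops below a threshold, then drops candidates whose MSE to the preceding candidate is abnormally small according to a MAD rule; the bin count, similarity threshold and MAD factor are assumptions.

```python
# Two-stage keyframe selector sketch on grayscale frames.
import numpy as np

def gray_histogram(frame, bins=64):
    """Normalised intensity histogram of a 2D grayscale frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
    return hist / max(hist.sum(), 1)

def candidate_keyframes(frames, sim_threshold=0.85, bins=64):
    """Stage 1: histogram-intersection similarity between consecutive frames."""
    candidates, prev_hist = [0], gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        hist = gray_histogram(frames[i], bins)
        if np.minimum(hist, prev_hist).sum() < sim_threshold:
            candidates.append(i)
        prev_hist = hist
    return candidates

def filter_neighbours(frames, candidates, k=2.0):
    """Stage 2: drop a candidate whose MSE to the preceding candidate is unusually small."""
    if len(candidates) < 2:
        return candidates
    mse = np.array([np.mean((frames[a].astype(float) - frames[b].astype(float)) ** 2)
                    for a, b in zip(candidates[:-1], candidates[1:])])
    mad = np.median(np.abs(mse - np.median(mse))) + 1e-9
    return [candidates[0]] + [c for c, d in zip(candidates[1:], mse)
                              if d > np.median(mse) - k * mad]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.integers(0, 256, size=(30, 120, 160), dtype=np.uint8)   # toy clip
    print(filter_neighbours(video, candidate_keyframes(video)))
```

For RGB-D input, the same stage-1 similarity test would also be applied to a depth histogram of each frame.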
DOI: 10.1109/IPTA.2017.8310120 (published 2017-11-01)
Citations: 6
Digital spotlighting parameter evaluation for SAR imaging
E. Balster, David B. Mundy, Andrew M. Kordik, Kerry L. Hill
In this paper, a synthetic aperture radar (SAR) image formation simulator is used to objectively evaluate parameter selection within the digital spotlighting process. Specifically, recommendations for the filter type and filter order of the low-pass filters used in the range and azimuth decimation processes within the digital spotlighting algorithm are determined to maximize image quality and minimize computational cost. Results show that an FIR low-pass filter with a Taylor (n = 5) window applied provides the highest image quality over a wide range of filter orders and decimation factors. Additionally, a linear relationship between filter length and decimation factor is found.
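To make the recommended configuration concrete, here is a hedged sketch of an anti-aliasing decimation step using a windowed-sinc FIR low-pass shaped by a Taylor (nbar = 5) window, in the spirit of the paper's recommendation; the tap count, sidelobe level and decimation factor are illustrative assumptions, not values from the paper.

```python
# Taylor-windowed FIR low-pass followed by decimation (illustrative sketch).
import numpy as np
from scipy.signal import windows, lfilter

def taylor_lowpass(numtaps, cutoff, nbar=5, sll=30):
    """FIR low-pass: ideal sinc truncated and shaped by a Taylor window.
    cutoff is the normalised cutoff frequency (0 < cutoff < 0.5, cycles/sample)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2.0
    h = 2 * cutoff * np.sinc(2 * cutoff * n)          # ideal low-pass impulse response
    h *= windows.taylor(numtaps, nbar=nbar, sll=sll)  # Taylor window, n-bar = 5
    return h / h.sum()                                # unit DC gain

def filter_and_decimate(signal, decimation):
    """Low-pass then keep every `decimation`-th sample (anti-alias decimation)."""
    h = taylor_lowpass(numtaps=8 * decimation + 1, cutoff=0.5 / decimation)
    return lfilter(h, [1.0], signal)[::decimation]

if __name__ == "__main__":
    t = np.arange(4096)
    phase_history = np.cos(0.02 * np.pi * t) + 0.1 * np.cos(0.9 * np.pi * t)  # toy signal
    print(filter_and_decimate(phase_history, decimation=4).shape)
```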
DOI: 10.1109/IPTA.2017.8310132 (published 2017-11-01)
Citations: 1
Continuous activity understanding based on accumulative pose-context visual patterns
Yan Zhang, Georg Layher, H. Neumann
In application domains such as human-robot interaction and ambient intelligence, it is expected that an intelligent agent can respond to a person's actions efficiently or make predictions while the person's activity is still ongoing. In this paper, we investigate the problem of continuous activity understanding, based on a visual pattern extraction mechanism which fuses decomposed body pose features from estimated 2D skeletons (based on deep learning skeleton inference) and localized appearance-motion features around spatiotemporal interest points (STIPs). Considering that human activities are observed and inferred gradually, we partition the video into snippets, extract the visual pattern accumulatively and infer the activities in an online fashion. We evaluated the proposed method on two benchmark datasets and achieved 92.6% on the KTH dataset and 92.7% on the Rochester Assisted Daily Living dataset in the equilibrated inference states. In parallel, we find that the context information mainly contributed by STIPs is probably more favourable to activity recognition than the pose information, especially in scenarios of daily living activities. In addition, incorporating the visual patterns of activities from early stages to train the classifier can improve the performance of early recognition; however, it can degrade the recognition rate at later stages. To overcome this issue, we propose a mixture model in which the classifier trained with early visual patterns is used in early stages, while the classifier trained without early patterns is used in later stages. The experimental results show that this straightforward approach can improve early recognition while retaining the recognition correctness at later stages.
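A minimal sketch of the snippet-wise online inference with the early/late classifier mixture is given below; the feature accumulation, the classifiers and the switch point are placeholders, since the abstract does not specify them.

```python
# Snippet-wise online inference with an early/late classifier mixture (sketch).
from typing import Callable, Iterable, List, Optional
import numpy as np

def online_activity_inference(snippets: Iterable[np.ndarray],
                              accumulate: Callable[[np.ndarray, Optional[np.ndarray]], np.ndarray],
                              early_clf, late_clf, switch_after: int = 3) -> List[int]:
    """Accumulate visual patterns snippet by snippet and emit a label after each one."""
    pattern, labels = None, []
    for t, snippet in enumerate(snippets):
        pattern = accumulate(snippet, pattern)               # accumulative visual pattern
        clf = early_clf if t < switch_after else late_clf    # mixture of classifiers
        labels.append(int(clf.predict(pattern[None, :])[0]))
    return labels

if __name__ == "__main__":
    class MeanSignClassifier:                                # stand-in for a trained classifier
        def __init__(self, offset): self.offset = offset
        def predict(self, x): return [int(x.mean() + self.offset > 0)]

    def accumulate(snippet, acc):                            # toy pose/appearance descriptor
        feat = snippet.mean(axis=(1, 2))
        return feat if acc is None else 0.9 * acc + 0.1 * feat

    rng = np.random.default_rng(0)
    clip = [rng.standard_normal((4, 32, 32)) for _ in range(6)]   # six video snippets
    print(online_activity_inference(clip, accumulate, MeanSignClassifier(0.1),
                                    MeanSignClassifier(-0.1)))
```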
DOI: 10.1109/IPTA.2017.8310114 (published 2017-11-01)
Citations: 1
Line segmentation for grayscale text images of Khmer palm leaf manuscripts
Dona Valy, M. Verleysen, Kimheng Sok
Text line segmentation is one of the most essential pre-processing steps in character recognition and document analysis. In ancient documents, a variety of deformations caused by aging produce noise which makes the binarization process very challenging. Moreover, due to irregular layout such as skew and fluctuation of text lines, segmenting an ancient manuscript page into lines remains an open problem. In this paper, we propose a novel line segmentation scheme for grayscale images of Khmer ancient documents. First, a stroke width transform is applied to extract connected components from the document page. The number and medial positions of text lines are estimated using a modified piece-wise projection profile technique. Those positions are then modified adaptively according to the curvature of the actual text lines. Finally, a path-finding approach is used to separate touching components and to mark the boundaries of the text lines. Experiments are conducted on a dataset of 110 pages of Khmer palm leaf manuscript images, comparing the robustness of the proposed approach with existing methods from the literature.
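As a rough illustration of the piece-wise projection idea (not the authors' pipeline), the sketch below splits a page into vertical strips and picks peaks of each strip's smoothed horizontal ink profile as candidate medial line positions; the strip count, smoothing width and peak rule are assumptions.

```python
# Piece-wise horizontal projection profile for medial text-line positions (sketch).
import numpy as np

def strip_line_positions(gray_page: np.ndarray, n_strips: int = 8, smooth: int = 15):
    """Return, per vertical strip, row indices of local maxima of the ink profile."""
    ink = 255.0 - gray_page.astype(float)                 # darker pixels = more ink
    strips = np.array_split(ink, n_strips, axis=1)
    kernel = np.ones(smooth) / smooth
    positions = []
    for strip in strips:
        profile = np.convolve(strip.sum(axis=1), kernel, mode="same")
        peaks = [r for r in range(1, len(profile) - 1)
                 if profile[r] >= profile[r - 1] and profile[r] > profile[r + 1]
                 and profile[r] > 0.5 * profile.max()]
        positions.append(peaks)
    return positions

if __name__ == "__main__":
    page = np.full((200, 400), 255, dtype=np.uint8)
    page[40:55, :] = 40                                   # two synthetic text lines
    page[120:138, :] = 60
    print([p[:3] for p in strip_line_positions(page)])
```

Per-strip estimates of this kind could then be linked across strips to follow curved or fluctuating lines, which is the role of the adaptive refinement and path-finding steps described in the abstract.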
DOI: 10.1109/IPTA.2017.8310097 (published 2017-11-01)
Citations: 5
HEVC stream saliency extraction: Synergies between FIT and information theory principles
Marwa Ammar, M. Mitrea, Ismail Boujelbane
The present paper studies the potential synergies between a popular approach to saliency extraction, FIT (feature integration theory), and source coding principles. By combining these two approaches, a new saliency model, extracted directly at the level of HEVC stream syntax elements, is defined. The experiments compare the new model with human saliency captured by eye-tracking devices. They consider a reference corpus of fixation density maps, two objective criteria, two objective measures and 7 state-of-the-art saliency models (3 acting in the pixel domain and 4 in the compressed domain).
DOI: 10.1109/IPTA.2017.8310117 (published 2017-11-01)
Citations: 0
Background modelling, analysis and implementation for thermographic images
Irida Shallari, Qaiser Anwar, Muhammad Imran, M. O’nils
Background subtraction is one of the fundamental steps in the image-processing pipeline for distinguishing foreground from background. Most methods have been investigated on visual images, whose challenges differ from those of thermal images. Thermal sensors are invariant to light changes and reduce privacy concerns. We propose the use of a low-pass IIR filter for background modelling in thermographic imagery, due to its better performance compared to algorithms such as Mixture of Gaussians and K-nearest neighbour, while reducing memory requirements for implementation in embedded architectures. Based on the analysis of four different image datasets, both indoor and outdoor, with and without people present, the learning rate of the filter is set to 3×10⁻³ Hz and the proposed model is implemented on an Artix-7 FPGA.
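The background model itself reduces to a first-order low-pass IIR (running-average) update, B_t = (1 − a)·B_{t−1} + a·I_t, sketched below; reading the reported 3×10⁻³ learning rate as a per-frame coefficient, and the foreground threshold, are assumptions not stated in the abstract.

```python
# First-order IIR (running-average) background model for thermal frames (sketch).
import numpy as np

class IIRBackground:
    def __init__(self, first_frame: np.ndarray, alpha: float = 3e-3):
        self.alpha = alpha
        self.background = first_frame.astype(np.float32)

    def update(self, frame: np.ndarray, fg_threshold: float = 12.0) -> np.ndarray:
        """Update the background estimate and return a boolean foreground mask."""
        frame = frame.astype(np.float32)
        self.background = (1.0 - self.alpha) * self.background + self.alpha * frame
        return np.abs(frame - self.background) > fg_threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(100.0, 2.0, size=(50, 60, 80)).astype(np.float32)
    frames[-1, 20:30, 30:40] += 40.0                 # a warm object appears
    model = IIRBackground(frames[0])
    for f in frames[1:]:
        mask = model.update(f)
    print("foreground pixels in last frame:", int(mask.sum()))
```

The appeal for embedded implementation is that only one background frame needs to be stored, in contrast to the per-pixel model parameters or sample history required by Mixture of Gaussians or K-nearest neighbour approaches.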
DOI: 10.1109/IPTA.2017.8310078 (published 2017-11-01)
Citations: 4
Footnote-based document image classification using 1D convolutional neural networks and histograms
Mohamed Mhiri, Sherif Abuelwafa, Christian Desrosiers, M. Cheriet
Classifying historical document images is a challenging task due to the high variability of their content and the common presence of degradation in these documents. For scholars, footnotes are essential for analyzing and investigating historical documents. In this work, a novel classification method is proposed for detecting and segmenting footnotes in document images. Our proposed method uses horizontal histograms of text lines as inputs to a 1D convolutional neural network (CNN). Experiments on a dataset of historical documents show the proposed method to be effective in dealing with the high variability of footnotes, even when using a small training set. Our method yielded an overall F-measure of 56.36% and a precision of 89.76%, significantly outperforming existing approaches for this task.
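A hedged sketch of the kind of model described, a small 1D CNN over a fixed-length horizontal histogram of a text line, is given below; the layer sizes, histogram length and two-class output are assumptions rather than the authors' architecture.

```python
# Small 1D CNN classifying text-line histograms as footnote vs. non-footnote (sketch).
import torch
import torch.nn as nn

class HistogramCNN(nn.Module):
    def __init__(self, hist_len: int = 256, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Linear(32 * (hist_len // 4), n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, hist_len) horizontal ink histogram of a text-line image
        z = self.features(x.unsqueeze(1))          # -> (batch, 32, hist_len // 4)
        return self.classifier(z.flatten(1))       # class logits

if __name__ == "__main__":
    model = HistogramCNN()
    histograms = torch.rand(4, 256)                # four dummy text-line histograms
    print(model(histograms).shape)                 # torch.Size([4, 2])
```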
DOI: 10.1109/IPTA.2017.8310140 (published 2017-11-01)
Citations: 7