Pub Date: 2017-12-01. DOI: 10.1109/IPTA.2017.8310108
R. Welikala, M. Fraz, M. Habib, S. Daniel-Tong, M. Yates, P. J. Foster, P. Whincup, A. Rudnicka, C. Owen, D. Strachan, S. Barman
The morphometric characteristics of the retinal vascular network have been associated with risk markers of many systemic and vascular diseases. However, analysis of data from large population-based studies is needed to help resolve uncertainties in these associations. QUARTZ (QUantitative Analysis of Retinal vessel Topology and siZe) is a fully automated retinal image analysis system designed to process large numbers of retinal images and to obtain quantitative measures of vessel morphology for use in epidemiological studies. QUARTZ has been used to process retinal images from UK Biobank, a large population-based cohort study. In this paper, we address issues of robustness with respect to processing large datasets and validate QUARTZ using a subset of 4,692 UK Biobank retinal images. Ground truth data produced by human observers for validation have been made available online. Following validation, 135,867 retinal images (68,549 participants) from the UK Biobank study were processed by QUARTZ. Of these images, 71.53% were classified as being of adequate quality, which equated to 80.90% of participants having at least one image of adequate quality. The vessel morphometric data are currently being used in epidemiological studies. The intention of the UK Biobank Eye and Vision Consortium is to include these derived measures in the UK Biobank data archive.
Title: Automated quantification of retinal vessel morphometry in the UK biobank cohort. Published in: 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA).
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310135
Sofiane Mihoubi, B. Mathon, Jean-Baptiste Thomas, O. Losson, L. Macaire
Snapshot multispectral cameras equipped with filter arrays acquire, at video rate, a raw image that represents the radiance of a scene over the electromagnetic spectrum. These cameras require a demosaicing procedure to estimate a multispectral image with full spatio-spectral definition. Such a procedure is based on spectral correlation properties that are sensitive to illumination. In this paper, we first highlight the influence of illumination on demosaicing performance. We then propose camera-, illumination-, and raw image-based normalisations that make demosaicing robust to illumination. Experimental results on state-of-the-art demosaicing algorithms show that such normalisations improve the quality of multispectral images estimated from raw images acquired under various illuminations.
Title: Illumination-robust multispectral demosaicing
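A per-band normalisation of the kind argued for above can be sketched as follows. This is a minimal illustration only: the function name, the mean-based normaliser, and the `pattern` argument describing the filter-array layout are assumptions, not the authors' exact formulation.

```python
import numpy as np

def normalize_raw_mosaic(raw, pattern):
    """Per-band normalisation of a raw MSFA image (illustrative sketch).

    raw     : 2-D array from a multispectral filter array (MSFA) sensor.
    pattern : 2-D array of band indices, tiled over the sensor, telling
              which spectral band each pixel samples.

    Each band is divided by its mean level so that demosaicing sees
    comparable magnitudes regardless of the scene illumination.
    """
    bands = np.tile(pattern,
                    (raw.shape[0] // pattern.shape[0],
                     raw.shape[1] // pattern.shape[1]))
    out = raw.astype(float).copy()
    for b in np.unique(bands):
        mask = bands == b
        mean = out[mask].mean()
        if mean > 0:
            out[mask] /= mean
    return out
```

After this step, every band has unit mean, so an illuminant that boosts one part of the spectrum no longer distorts the inter-band correlations that demosaicing relies on.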
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310143
Pauline Puteaux, W. Puech
With the development of cloud computing, the growth in information technology has led to serious security issues. For this reason, many multimedia files are stored in encrypted form. Methods of reversible data hiding in encrypted images (RDHEI) have been designed to provide authentication and integrity in the encrypted domain. The original image is first encrypted to ensure confidentiality by making the content unreadable. A secret message is then embedded in the encrypted image, without the need for the encryption key or any access to the clear content. The challenge lies in finding the best trade-off between embedding capacity and quality of the reconstructed image. In 2008, Puech et al. suggested using the AES algorithm to encrypt an original image and embedding one bit in each block of 16 pixels (payload = 0.0625 bpp) [12]. During the decryption phase, the original image is reconstructed by measuring the standard deviation within each block. In this paper, we propose an improvement to this method by performing an adaptive local entropy measurement. We achieve a larger payload without altering the recovered image quality. The results obtained surpass most state-of-the-art methods, while offering an improved security level through the use of the AES algorithm, the encryption standard defined by NIST.
Title: Reversible data hiding in encrypted images based on adaptive local entropy analysis
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310130
Thanh Tuan Nguyen, T. Nguyen, F. Bouchara
Dynamic texture (DT) is a challenging problem in computer vision because of the chaotic motion of textures. In this paper we introduce a new dynamic texture operator that considers local structure patterns (LSP) and completed local binary patterns (CLBP) for static images in three orthogonal planes to capture spatio-temporal texture structures. Since the typical local binary pattern (LBP) operator, which uses the center pixel for thresholding, has limitations such as sensitivity to noise and to near-uniform regions, the proposed approach deals with these drawbacks by using global and local texture information for adaptive thresholding, and CLBP for exploiting complementary texture information in the three orthogonal planes. Evaluations on different dynamic texture datasets (UCLA, DynTex, DynTex++) show that our proposal significantly outperforms recent state-of-the-art approaches.
Title: Completed local structure patterns on three orthogonal planes for dynamic texture recognition
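The "three orthogonal planes" idea underlying the descriptor above can be sketched with a plain LBP operator applied to the XY, XT and YT slices of a video volume and the three histograms concatenated. This is a minimal sketch of the classical LBP-TOP scheme the paper builds on, not the proposed CLBP/LSP operator; using only the central slice of each plane is a simplification.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP on a 2-D array: each neighbour >= the
    centre pixel contributes one bit to an 8-bit code."""
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for k, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy,
                1 + dx:img.shape[1] - 1 + dx]
        code |= (n >= c).astype(np.uint8) << k
    return code

def lbp_top_histogram(volume, bins=256):
    """Concatenated LBP histograms over the three orthogonal planes
    (XY, XT, YT) of a (T, H, W) volume: the 'TOP' idea in a nutshell."""
    t, h, w = volume.shape
    planes = [volume[t // 2],          # XY: one frame
              volume[:, h // 2, :],    # XT: one horizontal slice over time
              volume[:, :, w // 2]]    # YT: one vertical slice over time
    hists = [np.bincount(lbp_codes(p).ravel(), minlength=bins)[:bins]
             for p in planes]
    return np.concatenate(hists)
```

The XT and YT histograms are what injects motion information; the paper's contribution replaces the center-pixel thresholding inside `lbp_codes` with an adaptive, noise-robust criterion.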
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310142
Anass Nouri, C. Charrier, O. Lézoray
Many computer graphics applications use visual saliency information to guide processing tasks such as adaptive compression, viewpoint selection, segmentation, etc. However, all these applications rely on a partial estimation of visual saliency, insofar as only the geometric properties of the considered 3D mesh are taken into account, leaving aside the colorimetric ones. As humans, our visual attention is sensitive to both geometric and colorimetric information. Indeed, colorimetric information modifies eye movements while visualizing multimedia content. We propose in this paper an innovative approach for the detection of global saliency that takes into account both geometric and colorimetric features of a 3D mesh, thus simulating the Human Visual System (HVS). For this, we generate two multi-scale saliency maps based on local geometric and colorimetric patch descriptors. These saliency maps are pooled using evidence theory. We show the contribution and the benefit of our proposed global saliency approach for two applications: automatic optimal viewpoint selection and adaptive denoising of 3D colored meshes.
Title: Global visual saliency: Geometric and colorimetric saliency fusion and its applications for 3D colored meshes
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310088
Rizlène Raoui-Outach, Cécile Million-Rousseau, A. Benoît, P. Lambert
As a general rule, data analytics are now mandatory for companies. Scanned document analysis brings additional challenges introduced by paper damage and scanning quality. In an industrial context, this work focuses on the automatic understanding of sales receipts, which enables access to essential and accurate consumption statistics. Given an image acquired with a smartphone, the proposed work mainly focuses on the first steps of the full tool chain, which aims at providing essential information such as the store brand, purchased products and related prices with the highest possible confidence. To reach this confidence level even when scanning is not perfectly controlled, we propose a double-check processing tool chain using Deep Convolutional Neural Networks (DCNNs) on the one hand and more classical image and text processing on the other. The originality of this work lies in this double-check processing and in the joint use of DCNNs for different applications together with text analysis.
Title: Deep learning for automatic sale receipt understanding
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310147
Laurent Cabaret, L. Lacassagne, D. Etiemble
Modern computer architectures are mainly composed of multi-core processors and GPUs. Consequently, solely providing a sequential implementation of algorithms, or comparing algorithm performance without regard to architecture, is no longer pertinent. Today, algorithms have to address parallelism, multithreading and memory topology (private/shared memory, cache or scratchpad, …). Most Connected Component Labeling (CCL) algorithms are sequential, direct and optimized for processors. Few were designed specifically for GPU architectures and none were designed to adapt to different architectures. The most efficient GPU implementations are iterative in order to manage synchronization between processing units, but the number of iterations depends on the image shape and density. This paper describes DLP (Distanceless Label Propagation), an adaptable set of algorithms usable on both GPU and multi-core architectures, and DLP-GPU, an efficient direct CCL algorithm for GPUs based on DLP mechanisms.
Title: Distanceless label propagation: An efficient direct connected component labeling algorithm for GPUs
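For context, the classical "direct" CPU approach that the paper contrasts with iterative GPU propagation is two-pass labeling with union-find: one raster scan assigns provisional labels and records equivalences, a second scan flattens them. The sketch below is that textbook baseline (4-connectivity), not the DLP algorithm itself.

```python
import numpy as np

def label_components(binary):
    """Two-pass 4-connectivity connected-component labeling with
    union-find (the classical direct CPU approach)."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=np.int64)
    parent = [0]  # parent[0] is the background sentinel

    def find(x):
        while parent[x] != x:          # path halving
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    next_label = 1
    # first pass: provisional labels + equivalence recording
    for y in range(h):
        for x in range(w):
            if not binary[y, x]:
                continue
            up = labels[y - 1, x] if y else 0
            left = labels[y, x - 1] if x else 0
            if up == 0 and left == 0:
                parent.append(next_label)
                labels[y, x] = next_label
                next_label += 1
            else:
                cand = [l for l in (up, left) if l]
                labels[y, x] = min(cand)
                if up and left and up != left:
                    union(up, left)
    # second pass: flatten equivalences to root labels
    for y in range(h):
        for x in range(w):
            if labels[y, x]:
                labels[y, x] = find(labels[y, x])
    return labels
```

The sequential dependency between the two scans is exactly what maps poorly to GPUs, which is why GPU CCL work (including DLP-GPU) restructures the labeling into parallel propagation or merging steps instead.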
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310081
Asma Bougrine, R. Harba, R. Canals, R. Lédée, M. Jabloun
The aim of the present study is to propose a new joint segmentation method dedicated to plantar foot thermal images. The proposed method is based on a modified active contour method (Snake) that includes prior shape information, namely an atlas of the plantar foot contour, as an extra term in the Snake energy function. This term guides the Snake toward the targeted contours during the deformation process by calculating a curvature difference between the Snake curve and the atlas curve of the plantar foot surface. The proposed method was validated using a database of 50 plantar foot thermal images. Results showed the proposed method to outperform the classical Snake method and seven other recent methods. The comparison was done using two evaluation metrics, the root-mean-square error (RMSE) and the Dice similarity coefficient (DSC). When compared to ground truth, the proposed method obtained the best average RMSE of 6 pixels and DSC score of 93%.
Title: A joint snake and atlas-based segmentation of plantar foot thermal images
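The curvature-difference prior term described above can be sketched with a discrete curvature proxy on polygonal contours. This is an illustrative reading of the idea, under the assumption that the snake and the atlas contour are sampled with the same number of vertices; function names and the second-difference curvature proxy are ours, not the paper's exact formulation.

```python
import numpy as np

def discrete_curvature(pts):
    """Curvature proxy at each vertex of a closed contour: norm of the
    second difference p[i-1] - 2*p[i] + p[i+1] (pts is (N, 2))."""
    prev = np.roll(pts, 1, axis=0)
    nxt = np.roll(pts, -1, axis=0)
    return np.linalg.norm(prev - 2 * pts + nxt, axis=1)

def atlas_prior_energy(snake_pts, atlas_pts):
    """Extra snake-energy term: squared point-wise difference between
    the snake's curvature profile and the atlas contour's. It is zero
    when the snake matches the atlas shape and grows as it deviates,
    so minimising it pulls the contour toward the expected foot shape."""
    d = discrete_curvature(snake_pts) - discrete_curvature(atlas_pts)
    return float(np.sum(d ** 2))
```

Because second differences cancel constant offsets, the term is translation-invariant: it penalises shape deviation, not position, which is the desired behaviour for a shape prior.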
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310091
B. Mocanu, Ruxandra Tapu, T. Zaharia
In this paper we introduce a novel single object tracker based on two convolutional neural networks (CNNs) trained offline using data from large video repositories. The key principle consists of alternating between tracking using motion information and adjusting the predicted location based on visual similarity. First, we construct a deep regression network architecture able to learn generic relations between an object's appearance models and its associated motion patterns. Then, based on visual similarity constraints, the object's bounding box position, size and shape are continuously updated in order to maximize a patch similarity function designed using a CNN. Finally, a multi-resolution fusion between the outputs of the two CNNs is performed for accurate object localization. The experimental evaluation, performed on challenging datasets proposed in the Visual Object Tracking (VOT) international contest, validates the proposed method when compared with state-of-the-art systems. In terms of computational speed, our tracker runs at 20 fps.
Title: Single object tracking using offline trained deep regression networks
Pub Date: 2017-11-28. DOI: 10.1109/IPTA.2017.8310141
A. Akl, C. Yaacoub, M. Donias, Jean-Pierre Da Costa, C. Germain
Volumetric texture synthesis is mainly used in computer graphics for texturing objects in order to increase the realism of a 3D scene. It is also of particular interest in many application domains, such as studying the three-dimensional internal structure of materials and modelling volumetric data obtained by 3D imaging techniques for medical purposes. Based on a previously proposed 2D structure/texture synthesis algorithm, this paper proposes a two-stage 3D texture synthesis approach in which the volumetric structure layer of the input texture is first synthesized and then used to guide the synthesis of the volumetric texture. Results show that using structural information helps the synthesis of the volumetric texture and can outperform synthesis based only on intensity information.
Title: Two-Stage volumetric texture synthesis based on structural information