Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532483
L. Kerofsky, Yan Ye, Yuwen He
In this paper we review the current status and ongoing development of High Dynamic Range and Wide Color Gamut (HDR/WCG) video compression within MPEG. We review how existing MPEG, ITU-R and SMPTE standards may be used for coding HDR content, and recount the history of an exploratory activity within MPEG investigating technologies for improved compression of HDR/WCG content. We then give an overview of the MPEG Call for Evidence related to HDR/WCG compression technology and of MPEG activities related to HDR/WCG coding, including progress and a snapshot of ongoing core experiments as of December 2015. Finally, we describe the future outlook for this activity.
Title : Recent developments from MPEG in HDR video compression. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 879-883.
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532736
Yun-Fu Liu, Jing-Ming Guo, Yu Cheng
Block truncation coding (BTC) has been considered a highly efficient compression technique for decades, but blocking artifacts are its main issue. Halftoning-based BTC has significantly eased this issue, yet it introduces an apparent impulse-noise artifact. In this study, an improved BTC, termed adaptive dot-diffused BTC (ADBTC), is proposed to further improve the visual quality. The method also provides additional flexibility in choosing the compression ratio, in contrast to the few fixed configurations available to earlier methods. As documented in the experimental results, the proposed method achieves superior image quality under five objective image quality assessment (IQA) methods. As a result, it is a very competitive approach for both high-frame-rate and high-resolution image compression.
Title : Adaptive block truncation coding image compression technique using optimized dot diffusion. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 2137-2141.
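For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of classic block truncation coding in its absolute-moment form (AMBTC). It is not the ADBTC method proposed in the paper, and the function names `ambtc_encode` and `ambtc_decode` are hypothetical, chosen only for illustration. Each block is reduced to a one-bit-per-pixel bitmap plus two representative gray levels.

```python
def ambtc_encode(block):
    """Encode one block (a list of rows of ints) as (low, high, bitmap)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    # One bit per pixel: 1 if the pixel is at or above the block mean.
    bitmap = [[1 if p >= mean else 0 for p in row] for row in block]
    highs = [p for p in pixels if p >= mean]  # never empty: max(pixels) >= mean
    lows = [p for p in pixels if p < mean]    # empty for a flat block
    high = round(sum(highs) / len(highs))
    low = round(sum(lows) / len(lows)) if lows else high
    return low, high, bitmap


def ambtc_decode(low, high, bitmap):
    """Rebuild the block by mapping each bit to one of the two levels."""
    return [[high if bit else low for bit in row] for row in bitmap]


if __name__ == "__main__":
    block = [[10, 10, 200, 200],
             [10, 10, 200, 200],
             [12, 12, 198, 198],
             [12, 12, 198, 198]]
    low, high, bitmap = ambtc_encode(block)
    print(low, high)
    print(ambtc_decode(low, high, bitmap))
```

Because every pixel in a block collapses to one of two levels, block boundaries become visible at high compression, which is the blocking artifact the abstract refers to.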
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533030
Jiaqi Yang, Qian Zhang, Ke Xian, Yang Xiao, Zhiguo Cao
This paper presents a novel local surface descriptor called rotational contour signatures (RCS) for 3D rigid objects. RCS comprises several signatures that characterize the 2D contour information derived from 3D-to-2D projection of the local surface. Our encoding technique is inspired by the observation that, when viewing an object, its contour is an effective and robust cue for representing its shape. To achieve a comprehensive geometry encoding, the local surface is continually rotated in a predefined local reference frame (LRF) so that multi-view information is obtained. Experiments on two publicly available datasets demonstrate the effectiveness and robustness of the proposed descriptor. Further, comparisons with five state-of-the-art descriptors show the superiority of our RCS descriptor.
Title : Rotational contour signatures for robust local surface description. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 3598-3602.
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532514
Rui Zhu, Xiao-Jiao Mao, Qi-Hai Zhu, Ning Li, Yubin Yang
Text detection is a difficult task due to the significant diversity of text appearing in natural scene images. In this paper, we propose a novel text descriptor, SPP-net, extracted by equipping a Convolutional Neural Network (CNN) with spatial pyramid pooling. We first compute the feature maps from the original text lines without any cropping or warping, and then generate fixed-size representations for text discrimination. Experimental results on the ICDAR 2011 and 2013 datasets show that the proposed descriptor outperforms state-of-the-art methods by a noticeable margin in F-measure, owing to its incorporation of multi-scale text information and its flexibility in describing text regions of different sizes and shapes.
Title : Text detection based on convolutional neural networks with spatial pyramid pooling. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 1032-1036.
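The key property mentioned in this abstract, turning feature maps of arbitrary size into a fixed-size representation, can be illustrated with a small sketch of spatial pyramid max pooling on a single-channel map. This is a generic illustration rather than the authors' network: the function name `spatial_pyramid_pool`, the pyramid levels, and the bin-boundary rule are all assumptions.

```python
import math


def spatial_pyramid_pool(fmap, levels=(1, 2)):
    """Max-pool a 2D feature map into sum(n * n for n in levels) values.

    fmap is a list of rows; its height and width may be anything >= 1.
    For each level n, the map is split into an n x n grid of (possibly
    overlapping) bins and the maximum of each bin is kept.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(max(fmap[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


if __name__ == "__main__":
    small = [[1, 2, 3],
             [4, 5, 6]]
    wide = [[1, 2, 3, 4, 5],
            [6, 7, 8, 9, 10],
            [11, 12, 13, 14, 15]]
    # Both inputs yield vectors of the same length despite different sizes.
    print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(wide)))
```

The output length depends only on the chosen levels, which is what lets text lines of different sizes feed the same fixed-size classifier input without cropping or warping.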
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532327
Edward T. Scott, S. Hemami
Traditional quality estimators evaluate an image's resemblance to a reference image. However, quality estimators are not well suited to the similar but somewhat different task of utility estimation, where an image is judged instead by how useful it would be in comparison to a reference in the context of accomplishing some task. Multi-Scale Difference of Gaussian Utility (MS-DGU), a reduced-reference algorithm for image utility estimation, relies on matching image contours across scales tuned to spatial frequencies important for utility estimation. MS-DGU estimates utility with greater accuracy than previous techniques. A fast algorithm for utility-optimized image compression was developed through rate-utility optimization for MS-DGU. By simple scaling of JPEG quantization step sizes according to a “utility factor,” data rates were reduced by an average of 24% (and up to 30%) compared to standard JPEG while maintaining utility.
Title : Image utility estimation using difference-of-Gaussian scale space. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 101-105.
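The "simple scaling of JPEG quantization step sizes" described in this abstract can be pictured with the sketch below. It only shows the mechanical step of multiplying a quantization table by a scalar and clamping to the 1..255 range of 8-bit baseline JPEG tables; how the paper derives its utility factor is not reproduced, and the function name `scale_quant_table` and the sample values are assumptions.

```python
def scale_quant_table(table, factor):
    """Scale every quantization step by `factor`, clamped to 1..255.

    A factor above 1 coarsens quantization (lower data rate); a factor
    below 1 refines it. `factor` stands in for a hypothetical utility
    factor and must be positive.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [[min(255, max(1, int(q * factor + 0.5))) for q in row]
            for row in table]


if __name__ == "__main__":
    # A small excerpt-sized table for illustration; real tables are 8 x 8.
    table = [[16, 11],
             [12, 12]]
    print(scale_quant_table(table, 2.0))
```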
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532928
Hangfan Liu, Xinfeng Zhang, Ruiqin Xiong
Prior knowledge plays an important role in image denoising tasks. This paper utilizes the data of the input image to adaptively model the prior distribution. The proposed scheme is based on the observation that, for a natural image, a matrix consisting of its vectorized non-local similar patches is of low rank. We use a non-convex smooth surrogate for the low-rank regularization, and view the optimization problem from the empirical Bayesian perspective. In this framework, a parameter-free distribution prior is derived from the grouped non-local similar image contents. Experimental results show that the proposed approach is highly competitive with several state-of-the-art denoising methods in PSNR and visual quality.
Title : Content-adaptive low rank regularization for image denoising. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 3091-3095.
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533081
Ye Han, Zhigang Liu, Dah-Jye Lee, Guinan Zhang, Miao Deng
Catenary system maintenance is an important task for the operation of a high-speed railway system. Currently, the inspection of damaged parts in the catenary system is performed manually, which is often slow and unreliable. This paper proposes a method to detect and locate the rod-insulators in images taken of the high-speed railway catenary system. Sub-images containing bar-shaped devices such as cantilevers, struts, rods, and poles are first extracted from the image. Rod-insulators are then recognized and detected in these bar-shaped sub-images using deformable part models and latent SVM. Experimental results show that the proposed method is able to locate rod-insulators accurately in the catenary image for the subsequent defect inspection process. The robustness of this method ensures its performance in different imaging conditions.
Title : High-speed railway rod-insulator detection using segment clustering and deformable part models. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 3852-3856.
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532410
Tien-Dung Mai, T. Ngo, Duy-Dinh Le, D. Duong, Kiem Hoang, S. Satoh
Hierarchical classification is a computationally efficient approach for large-scale image classification. The main challenge of this approach is dealing with error propagation: an irrelevant branching decision made at a parent node cannot be corrected at its child nodes while traversing the tree for classification. This paper presents a novel approach to reduce branching error at a node by taking the node's relationships into account. Given a node on the tree, we model each candidate branch by considering the classification responses of its child nodes and grandchild nodes, and their differences from siblings. A maximum margin classifier is then applied to select the most discriminating candidate. Our proposed approach outperforms related approaches on Caltech-256, SUN-397 and ILSVRC2010-1K.
Title : Using node relationships for hierarchical classification. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 514-518.
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532808
Wei-Feng Ou, Chun-Ling Yang, Wen-Hao Li, Li-Hong Ma
Existing multi-hypothesis (MH) prediction algorithms in compressed video sensing (CVS) are all deployed in the measurement domain, which restricts the flexibility of block partitioning in the reconstruction process and decreases the reconstruction accuracy. To address this issue, this paper proposes a two-stage multi-hypothesis reconstruction (2sMHR) scheme which deploys MH prediction in the measurement domain and the pixel domain successively. Two implementation schemes, a GOP-wise and a frame-wise scheme, are developed for the 2sMHR. Furthermore, a new weighted metric combining the Euclidean distance and the correlation coefficient is designed for the Tikhonov-regularized MH prediction model. Simulation results show that the proposed two-stage MH reconstruction scheme obtains higher reconstruction accuracy than state-of-the-art CVS prediction methods.
Title : A two-stage multi-hypothesis reconstruction scheme in compressed video sensing. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 2494-2498.
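For context, a Tikhonov-regularized MH prediction is commonly posed as a regularized least-squares problem with a closed-form solution. The generic form below is a sketch under assumed notation, not the paper's exact model: $y$ is the measurement vector of the current block, the columns of $H$ are the hypotheses, $w$ holds the combination weights, $\Gamma$ is a weighting matrix that penalizes dissimilar hypotheses, and $\lambda > 0$ is the regularization parameter. The paper's specific weighted metric for $\Gamma$ is not reproduced here.

```latex
\hat{w}
  = \arg\min_{w} \; \lVert y - H w \rVert_2^2 + \lambda \lVert \Gamma w \rVert_2^2
  = \left( H^{\mathsf{T}} H + \lambda \, \Gamma^{\mathsf{T}} \Gamma \right)^{-1} H^{\mathsf{T}} y ,
\qquad
\hat{y} = H \hat{w}
```

The closed form follows from setting the gradient $-2 H^{\mathsf{T}}(y - Hw) + 2\lambda \Gamma^{\mathsf{T}} \Gamma w$ to zero; the inverse exists whenever $H^{\mathsf{T}} H + \lambda \Gamma^{\mathsf{T}} \Gamma$ is positive definite, for example when $\Gamma$ is diagonal with nonzero entries.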
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532986
Yongjian Yu, Jue Wang, S. Acton
We present a histogram-based real-time solution for detecting directly irradiated regions in digital fluoroscopic images. Our method leverages the power of model matching, machine learning and domain knowledge to characterize and segment images using histograms. The input image is automatically identified as containing partial, all, or null direct radiation. The regions with direct radiation are segmented out via global thresholding according to the image characterization. The algorithm involves only one-dimensional processing. In testing, the method achieved a 99.82% detection accuracy on a dataset of 9256 clinical images.
Title : Automatic detection of direct radiation for digital fluoroscopy optimization. Venue : 2016 IEEE International Conference on Image Processing (ICIP), pp. 3379-3383.
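To make the idea of segmenting by global thresholding on a one-dimensional histogram concrete, here is a minimal sketch that uses Otsu's method as a stand-in for the threshold selection. Otsu's method is a swapped-in, generic technique; the paper's model matching and learning-based characterization are not reproduced, and the function names and the 256-bin assumption are illustrative only.

```python
def histogram(pixels, bins=256):
    """Count occurrences of each integer gray level in 0..bins-1."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    return hist


def otsu_threshold(hist):
    """Return the level t maximizing between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    Only the 1D histogram is scanned, never the 2D image.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


if __name__ == "__main__":
    # Synthetic data: mostly dark anatomy plus a small, very bright region.
    pixels = [10] * 50 + [12] * 50 + [250] * 20
    t = otsu_threshold(histogram(pixels))
    bright_mask = [p > t for p in pixels]
    print(t, sum(bright_mask))
```

A plain global threshold like this always splits the histogram in two, so on its own it cannot report that an image contains no direct radiation or only direct radiation; the abstract's three-way classification is what handles those cases.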