Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532945
M. U. Sadiq, J. Simmons, C. Bouman
Computed tomography is increasingly enabling scientists to study physical processes in materials at micron scales. The model-based iterative reconstruction (MBIR) framework provides a powerful method for CT reconstruction by incorporating both a measurement model and a prior model. Classically, the choice of prior has been limited to models enforcing local similarity in the image data. In some materials science problems, however, much more may be known about the underlying physical process being imaged. Moreover, recent work on Plug-and-Play decoupling of the MBIR problem has enabled researchers to look beyond classical prior models, and innovations in data acquisition such as interlaced view sampling have also shown promise for imaging dynamic physical processes. In this paper, we propose an MBIR framework with a physics-based prior model, namely the Cahn-Hilliard equation, which describes the spatiotemporal evolution of binary alloys. After formulating the MBIR cost with the Cahn-Hilliard prior, we use the Plug-and-Play algorithm with ICD optimization to minimize this cost. We apply the method to simulated data acquired with interlaced view sampling. Results show superior reconstruction quality compared to filtered back projection. Though we use the Cahn-Hilliard equation as one instance, the method can easily be extended to any other physics-based prior model for a different set of applications.
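The Plug-and-Play decoupling can be illustrated with a small sketch: the data-fit step inverts the measurement model, while the prior step is any mapping that pulls the iterate toward the model. Here a generic `denoise` callback stands in for the Cahn-Hilliard evolution step, and the penalty `rho` and iteration count are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def pnp_admm(y, A, denoise, rho=1.0, n_iter=20):
    """Plug-and-Play ADMM sketch for min_x 1/2||Ax - y||^2 + (implicit prior)."""
    n = A.shape[1]
    x = np.zeros(n); v = np.zeros(n); u = np.zeros(n)
    Aty = A.T @ y
    inv = np.linalg.inv(A.T @ A + rho * np.eye(n))  # data-fit system, precomputed
    for _ in range(n_iter):
        x = inv @ (Aty + rho * (v - u))   # inversion (measurement-model) step
        v = denoise(x + u)                # prior step: any denoiser / physics model
        u = u + x - v                     # dual (consensus) update
    return x
```

In the paper's setting, the inversion step would run ICD on the tomographic forward model and the prior step would apply a Cahn-Hilliard time step rather than a generic denoiser.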
{"title":"Model based image reconstruction with physics based priors","authors":"M. U. Sadiq, J. Simmons, C. Bouman","doi":"10.1109/ICIP.2016.7532945","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532945","url":null,"abstract":"Computed tomography is increasingly enabling scientists to study physical processes of materials at micron scales. The MBIR framework provides a powerful method for CT reconstruction by incorporating both a measurement model and prior model. Classically, the choice of prior has been limited to models enforcing local similarity in the image data. In some material science problems, however, much more may be known about the underlying physical process being imaged. Moreover, recent work in Plug-And-Play decoupling of the MBIR problem has enabled researchers to look beyond classical prior models, and innovations in methods of data acquisition such as interlaced view sampling have also shown promise for imaging of dynamic physical processes. In this paper, we propose an MBIR framework with a physics based prior model - namely the Cahn-Hilliard equation. The Cahn-Hilliard equation can be used to describe the spatiotemporal evolution of binary alloys. After formulating the MBIR cost with Cahn-Hilliard prior, we use Plug-And-Play algorithm with ICD optimization to minimize this cost. We apply this method to simulated data using the interlaced-view sampling method of data acquisition. Results show superior reconstruction quality compared to the Filtered Back Projection. 
Though we use Cahn-Hilliard equation as one instance, the method can be easily extended to use any other physics-based prior model for a different set of applications.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"153 1","pages":"3176-3179"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86155506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532540
Zhenyu Wu, H. Hu
Up-sampling is a key technique in multimedia processing: higher-resolution video is in constant demand, and pervasive multimedia applications rely on up-sampling to resolve display-resolution mismatches between senders and receivers. The hybrid DCT-Wiener-based interpolation method provides a powerful framework for interpolating video frames by exploiting information in both the spatial and DCT domains, and achieves better objective and visual quality at lower complexity than many well-studied interpolation methods. This paper first analyzes the bottleneck of the hybrid DCT-Wiener-based method, then proposes a trilateral filtering-based hybrid up-sampling algorithm that exploits spatial- and frequency-domain information more deeply. The proposed spatial-domain interpolation scheme is an adaptive Wiener filter with a trilateral filtering enhancement, which overcomes the quarter-pixel shift mismatch of the hybrid DCT-Wiener-based method and estimates detail information much more accurately. Furthermore, a flexible block-size selection mechanism in the frequency domain lets the algorithm retain more high-frequency coefficients. Experiments demonstrate that the proposed algorithm achieves noticeable gains over state-of-the-art methods in both objective and visual quality measurements.
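As a rough illustration of the filtering idea (not the paper's algorithm), a trilateral weight can combine spatial distance, intensity difference, and local-gradient difference. The 1-D sketch below and all of its kernel parameters are assumptions for exposition:

```python
import numpy as np

def trilateral_filter_1d(signal, radius=2, sigma_s=1.0, sigma_r=0.5, sigma_g=0.5):
    """Toy 1-D trilateral filter: each output sample is a weighted average whose
    weights multiply a spatial, a range (intensity), and a gradient kernel."""
    signal = np.asarray(signal, dtype=float)
    grad = np.gradient(signal)
    out = np.empty_like(signal)
    n = len(signal)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        idx = np.arange(lo, hi)
        w = (np.exp(-((idx - i) ** 2) / (2 * sigma_s ** 2))                     # spatial
             * np.exp(-((signal[idx] - signal[i]) ** 2) / (2 * sigma_r ** 2))   # range
             * np.exp(-((grad[idx] - grad[i]) ** 2) / (2 * sigma_g ** 2)))      # gradient
        out[i] = np.sum(w * signal[idx]) / np.sum(w)
    return out
```

The third (gradient) kernel is what distinguishes a trilateral filter from a bilateral one: it keeps smoothing from crossing ramps as well as steps.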
{"title":"Trilateral filtering-based hybrid up-sampling in dual domains for single video frame super resolution","authors":"Zhenyu Wu, H. Hu","doi":"10.1109/ICIP.2016.7532540","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532540","url":null,"abstract":"Up-sampling has been one of the key techniques for multimedia processing. Higher resolution videos are always the pursuit goal of major customers. Pervasive applications of multimedia processing starve for this technique to solve mismatch problems in display resolutions between senders and receivers. The hybrid DCT-Wiener-based interpolation method designs a powerful framework to interpolate video frames by mining the information in both spatial and DCT domain briefly. It can provide better objective as well as visual qualities with low complexity than many existing well studied interpolation methods. This paper presents an analysis about the bottleneck of hybrid DCT-Wiener-based interpolation method firstly. And then, proposes a trilateral filtering-based hybrid up-sampling algorithm in dual domains, which has dug the information in spatial and frequency domains more deeply. The proposed spatial domain interpolation scheme is an adaptive Wiener filter with a trilateral filtering enhancement, which possess capability to overcome the quarter-pixel shift mismatch of hybrid DCT-Wiener-based interpolation method and achieve much more accurate detail information estimation. Furthermore, flexible block size chosen mechanism in frequency domain enables the whole proposed up-sampling algorithm achieve further advantages in high frequency coefficients retaining. 
Experiments have been carried out to demonstrate that the proposed algorithm is able to achieve noticeable gains over state of art methods in both objective and visual qualities measurements.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"15 1","pages":"1160-1164"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84737057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532895
Gongbo Liang, Qi Li, Xiangui Kang
In this paper, we propose a leg-driven physiology framework for pedestrian detection. The framework is introduced to reduce the search space of candidate pedestrian regions. Given a set of vertical line segments, we generate a space of rectangular candidate regions based on a model of body proportions. The framework can be used either with or without learning-based pedestrian detection methods to validate the candidate regions. A symmetry constraint is then applied to each candidate region to decrease the false-positive rate. Experiments comparing the proposed method with the Dalal & Triggs method demonstrate promising results: for example, rectangular regions detected by the proposed method have areas much closer to the ground truth than regions detected by the Dalal & Triggs method.
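The body-proportion idea can be sketched as follows; the specific ratios (legs covering roughly half the body height, a fixed width-to-height aspect) are hypothetical stand-ins, not the paper's calibrated model:

```python
def candidate_from_leg(x_center, y_top, y_bottom, leg_ratio=0.5, aspect=0.41):
    """Generate a rectangular pedestrian candidate from one vertical leg segment.
    A leg segment of height h implies a full body of height h / leg_ratio;
    the box width follows from an assumed aspect ratio (width / height).
    Returns (left, top, width, height)."""
    leg_h = y_bottom - y_top
    body_h = leg_h / leg_ratio          # extrapolate full body height
    body_w = aspect * body_h            # width from aspect-ratio model
    top = y_bottom - body_h             # box is anchored at the feet
    left = x_center - body_w / 2        # centered on the leg segment
    return (left, top, body_w, body_h)
```

Each detected vertical segment thus yields one candidate box directly, which is what shrinks the search space relative to exhaustive sliding windows.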
{"title":"Pedestrian detection via a leg-driven physiology framework","authors":"Gongbo Liang, Qi Li, Xiangui Kang","doi":"10.1109/ICIP.2016.7532895","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532895","url":null,"abstract":"In this paper, we propose a leg-driven physiology framework for pedestrian detection. The framework is introduced to reduce the search space of candidate regions of pedestrians. Given a set of vertical line segments, we can generate a space of rectangular candidate regions, based on a model of body proportions. The proposed framework can be either integrated with or without learning-based pedestrian detection methods to validate the candidate regions. A symmetry constraint is then applied to validate each candidate region to decrease the false positive rate. The experiment demonstrates the promising results of the proposed method by comparing it with Dalal & Triggs method. For example, rectangular regions detected by the proposed method has much similar area to the ground truth than regions detected by Dalal & Triggs method.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"13 1","pages":"2926-2930"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87222277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532321
J. Lainema, M. Hannuksela, V. Vadakital, Emre B. Aksu
The High Efficiency Video Coding (HEVC) standard supports a large range of image representation formats and provides excellent image compression capability. The High Efficiency Image File Format (HEIF) offers a convenient way to encapsulate HEVC-coded images, image sequences and animations together with associated metadata in a single file. This paper discusses the features and functionality of the HEIF file format and compares the compression efficiency of HEVC still image coding with that of JPEG 2000. According to the experimental results, HEVC provides about a 25% bitrate reduction compared to JPEG 2000 at the same objective picture quality.
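A bitrate saving "at equal quality" of the kind reported here can be computed by interpolating each codec's log-rate at a common PSNR. The sketch below is a simplified stand-in for a full Bjøntegaard-delta computation, and the rate-distortion points in the test are made up for illustration:

```python
import numpy as np

def bitrate_saving_at_quality(rate_a, psnr_a, rate_b, psnr_b, psnr_q):
    """Percentage rate saving of codec A over codec B at a target PSNR.
    Rates are interpolated in the log domain (rate-distortion curves are
    closer to linear there); positive result means A is cheaper."""
    la = np.interp(psnr_q, psnr_a, np.log(rate_a))  # log-rate of A at psnr_q
    lb = np.interp(psnr_q, psnr_b, np.log(rate_b))  # log-rate of B at psnr_q
    return (1.0 - np.exp(la - lb)) * 100.0
```

The standard BD-rate metric averages this kind of comparison over an overlapping quality interval using polynomial fits rather than a single interpolated point.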
{"title":"HEVC still image coding and high efficiency image file format","authors":"J. Lainema, M. Hannuksela, V. Vadakital, Emre B. Aksu","doi":"10.1109/ICIP.2016.7532321","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532321","url":null,"abstract":"The High Efficiency Video Coding (HEVC) standard includes support for a large range of image representation formats and provides an excellent image compression capability. The High Efficiency Image File Format (HEIF) offers a convenient way to encapsulate HEVC coded images, image sequences and animations together with associated metadata into a single file. This paper discusses various features and functionalities of the HEIF file format and compares the compression efficiency of HEVC still image coding to that of JPEG 2000. According to the experimental results HEVC provides about 25% bitrate reduction compared to JPEG 2000, while keeping the same objective picture quality.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"2015 1","pages":"71-75"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87946528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532426
Yule Li, Y. Dou, Xinwang Liu, Teng Li
Detecting people's heads in crowded scenes is challenging due to the large variability in clothing and appearance, the small scale of people, and strong partial occlusions. Traditional bottom-up proposal methods and existing region proposal network approaches suffer from either poor recall or low precision. In this paper, we propose to improve both the recall and precision of region-proposal-based head detection by integrating local head information. Specifically, we first use a region proposal network to predict the bounding boxes and corresponding scores of multiple instances in a region. A local head classifier network is then trained to score each bounding box generated by the region proposal model. We then propose an adaptive fusion method that optimally combines the region and local scores into a final score for each candidate bounding box. Furthermore, our fusion model automatically learns the optimal hyper-parameters from data. Our algorithm achieves superior head detection performance on a crowded-scenes data set, significantly outperforming several recent state-of-the-art baselines.
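A convex combination of region and local scores, with the weight selected on held-out data, is a minimal stand-in for the adaptive fusion described above (the paper learns its fusion hyper-parameters; this grid search is only illustrative):

```python
import numpy as np

def fuse_scores(region_scores, local_scores, labels, n_grid=101, thresh=0.5):
    """Pick the convex fusion weight w in [0, 1] that maximizes validation
    accuracy of fused = w * region + (1 - w) * local thresholded at `thresh`.
    Returns (best_weight, best_accuracy)."""
    best_w, best_acc = 0.0, -1.0
    for w in np.linspace(0.0, 1.0, n_grid):
        fused = w * region_scores + (1 - w) * local_scores
        acc = np.mean((fused > thresh) == labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```

When one score stream is clearly more reliable, the learned weight drifts toward it; the fused score then ranks candidate boxes for the final detection decision.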
{"title":"Localized region context and object feature fusion for people head detection","authors":"Yule Li, Y. Dou, Xinwang Liu, Teng Li","doi":"10.1109/ICIP.2016.7532426","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532426","url":null,"abstract":"People head detection in crowded scenes is challenging due to the large variability in clothing and appearance, small scales of people, and strong partial occlusions. Traditional bottom-up proposal methods and existing region proposal network approaches suffer from either poor recall or low precision. In this paper, we propose to improve both the recall and precision of head detection of region proposal models by integrating the local head information. In specific, we first use a region proposal network to predict the bounding boxes and corresponding scores of multiple instances in the region. A local head classifier network is then trained to score the bounding box generated from the region proposal model. After that, we propose an adaptive fusion method by optimally combining both the region and local scores to obtain the final score of each candidate bounding box. Furthermore, our fusion models can automatically learn the optimal hyper-parameters from data. 
Our algorithm achieves superior people head detection performance on the crowded scenes data set, which significantly outperforms several recent state-of-the-art baselines in the literature.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"35 1","pages":"594-598"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88260530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532797
Chen Bai, A. Reibman
First-person videos (FPVs) captured by wearable cameras often contain heavy distortions, including motion blur, rolling-shutter artifacts and rotation. Existing image and video quality estimators are ineffective for this type of video. We develop a method specifically to measure the distortions present in FPVs without using a high-quality reference video. Our local visual information (LVI) algorithm measures motion blur, and we combine homography estimation with a line-angle histogram to measure rolling-shutter artifacts and rotation. Our experiments demonstrate that captured FPVs have dramatically different distortions from traditional source videos. We also show that LVI is responsive to motion blur but insensitive to rotation and shear.
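The premise that motion blur suppresses local image gradients can be checked with a toy no-reference sharpness proxy. This is not the paper's LVI metric; both the blur model and the measure below are assumptions for illustration:

```python
import numpy as np

def gradient_sharpness(frame):
    """No-reference sharpness proxy: mean gradient magnitude of the frame.
    The value drops as blur spreads intensity across neighboring pixels."""
    gy, gx = np.gradient(np.asarray(frame, dtype=float))
    return float(np.mean(np.hypot(gx, gy)))

def box_blur_rows(frame, k=3):
    """Crude horizontal box blur, mimicking motion blur along the x-axis."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, frame)
```

A no-reference measure like this is needed because, as the abstract notes, wearable-camera footage has no pristine reference video to compare against.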
{"title":"Characterizing distortions in first-person videos","authors":"Chen Bai, A. Reibman","doi":"10.1109/ICIP.2016.7532797","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532797","url":null,"abstract":"First-person videos (FPVs) captured by wearable cameras often contain heavy distortions, including motion blur, rolling shutter artifacts and rotation. Existing image and video quality estimators are inefficient for this type of video. We develop a method specifically to measure the distortions present in FPVs, without using a high quality reference video. Our local visual information (LVI) algorithm measures motion blur, and we combine homography estimation with line angle histogram to measure rolling shutter artifacts and rotation. Our experiments demonstrate that captured FPVs have dramatically different distortions compared to traditional source videos. We also show that LVI is responsive to motion blur, but insensitive to rotation and shear.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"58 1","pages":"2440-2444"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86965780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532785
Xun Cai, J. Lim
In intra video coding, intra frames are predicted with intra prediction and the prediction residual is encoded, in many systems with transforms. For example, the Discrete Cosine Transform (DCT) and the Asymmetric Discrete Sine Transform (ADST) are applied to intra prediction residuals in many coding systems. Recent work proposed a set of transforms based on prediction inaccuracy modeling (PIM), developed from the observation that much of the residual non-stationarity is due to the use of an inaccurate prediction parameter; these transforms are effective for the non-stationarity that arises in directional intra prediction residuals. In this paper, we implement the PIM-based transforms in the H.264 intra coding system, using the proposed transform in hybrid with the ADST. Comparing the hybrid transform against the ADST alone, we show that the proposed transform yields a significant bit-rate reduction.
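For reference, the ADST used for intra prediction residuals is commonly realized as the DST-VII, whose orthonormal basis can be constructed directly. This shows the baseline transform the PIM hybrid is compared against, not the PIM transforms themselves:

```python
import numpy as np

def dst7_matrix(n):
    """DST-VII basis matrix (the 'ADST' of intra residual coding).
    Row k is the k-th basis vector; the basis starts near zero at the
    predicted boundary and grows away from it, matching how intra
    prediction error typically increases with distance from the reference."""
    k = np.arange(n)[:, None]   # basis (frequency) index
    m = np.arange(n)[None, :]   # sample index
    return 2.0 / np.sqrt(2 * n + 1) * np.sin(
        np.pi * (2 * m + 1) * (k + 1) / (2 * n + 1))
```

Production codecs use scaled integer approximations of this matrix, but the floating-point form above is orthonormal, so forward and inverse transforms are transposes of each other.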
{"title":"H.264 intra coding with transforms based on prediction inaccuracy modeling","authors":"Xun Cai, J. Lim","doi":"10.1109/ICIP.2016.7532785","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532785","url":null,"abstract":"In intra video coding, intra frames are predicted with intra prediction and the prediction residual signal is encoded. In many transform-based video coding systems, intra prediction residuals are encoded with transforms. For example, the Discrete Cosine Transform (DCT) and the Asymmetric Discrete Sine Transform (ADST) are used for intra prediction residuals in many coding systems. In the recent work, a set of transforms based on prediction inaccuracy modeling (PIM) has been proposed. These transforms are developed based on the observation that much of the residual non-stationarity is due to the use of an inaccurate prediction parameter. These transforms are shown to be effective for non-stationarity that arises in directional intra prediction residuals. In this paper, we implement the transforms based on prediction inaccuracy modeling on the H.264 intra coding system. The proposed transform is used in hybrid with the ADST. We compare the performance of the hybrid transform with the ADST and show that a significant bit-rate reduction is obtained with the proposed transform.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"16 1","pages":"2380-2384"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87048311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7533037
Ilker Buzcu, Aydin Alatan
We propose an enhancement to an existing visual object detection approach that generates candidate windows, improving detection accuracy at no additional computational cost. Hypothesis windows for object detection are obtained from Fisher Vector representations computed over initially obtained superpixels. To obtain new window hypotheses, hierarchical merging of superpixel regions is applied, guided by improvements in an objectness measure; thanks to the additivity of Fisher Vectors, the merging incurs no additional cost. The proposed technique is further improved by concatenating these representations with those of deep networks. Based on simulation results on typical data sets, the approach appears quite promising in its use of handcrafted features that have been largely neglected since the rise of deep learning.
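The additivity that makes hierarchical merging free can be seen in a toy single-Gaussian Fisher statistic: because it is a plain sum over descriptors, the statistic of a merged region equals the sum of its parts. This is a one-component sketch; real Fisher Vectors use a GMM with posterior weighting and normalization:

```python
import numpy as np

def fv_first_order(descs, mu, sigma):
    """Unnormalized first-order Fisher statistic of a region with respect to
    one Gaussian component: a sum of normalized deviations over descriptors.
    Being a sum, it is additive across regions, so merging two superpixels'
    representations costs only a vector addition."""
    return np.sum((np.asarray(descs) - mu) / sigma, axis=0)
```

In the full pipeline, normalization (power and L2) is deferred until after merging, precisely so that the intermediate statistics stay additive.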
{"title":"Fisher-selective search for object detection","authors":"Ilker Buzcu, Aydin Alatan","doi":"10.1109/ICIP.2016.7533037","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533037","url":null,"abstract":"An enhancement to one of the existing visual object detection approaches is proposed for generating candidate windows that improves detection accuracy at no additional computational cost. Hypothesis windows for object detection are obtained based on Fisher Vector representations over initially obtained superpixels. In order to obtain new window hypotheses, hierarchical merging of superpixel regions are applied, depending upon improvements on some objectiveness measures with no additional cost due to additivity of Fisher Vectors. The proposed technique is further improved by concatenating these representations with that of deep networks. Based on the results of the simulations on typical data sets, it can be argued that the approach is quite promising for its use of handcrafted features left to dust due to the rise of deep learning.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"20 1","pages":"3633-3637"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85903620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532678
Xue Li, J. Zhou, Lei Tong, Xun Yu, Jianhui Guo, Chunxia Zhao
Hyperspectral unmixing is an important technique for identifying the constituent spectra in an image and estimating their corresponding fractions. Nonnegative Matrix Factorization (NMF) has recently been widely used for hyperspectral unmixing. However, due to the complex distribution of hyperspectral data, most existing NMF algorithms cannot adequately reflect the intrinsic relationships in the data. In this paper, we propose a novel method, Structured Discriminative Nonnegative Matrix Factorization (SDNMF), to preserve the structural information of hyperspectral data. This is achieved by introducing structured discriminative regularization terms that model both the local affinity and the distant repulsion of observed spectral responses. Moreover, considering that the abundances of most materials are sparse, a sparseness constraint is also introduced into SDNMF. Experimental results on both synthetic and real data validate the effectiveness of the proposed method, which achieves better unmixing performance than several alternative approaches.
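The classical core that SDNMF builds on, multiplicative-update NMF with an optional L1 sparseness penalty on the abundances, can be sketched as follows. The paper's structured discriminative regularizers are not included; `lam` and the update scheme are the textbook baseline:

```python
import numpy as np

def nmf_unmix(X, r, n_iter=500, lam=0.0, eps=1e-9, seed=0):
    """Multiplicative-update NMF: factor X (bands x pixels) into endmembers
    W (bands x r) and abundances H (r x pixels), both nonnegative, with an
    optional L1 penalty `lam` on H encouraging sparse abundances."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + lam + eps)  # abundance update (+L1 shrink)
        W *= (X @ H.T) / (W @ H @ H.T + eps)        # endmember update
    return W, H
```

The multiplicative form keeps both factors nonnegative by construction, which is why it is the usual starting point for unmixing variants that add structural regularizers.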
{"title":"Structured Discriminative Nonnegative Matrix Factorization for hyperspectral unmixing","authors":"Xue Li, J. Zhou, Lei Tong, Xun Yu, Jianhui Guo, Chunxia Zhao","doi":"10.1109/ICIP.2016.7532678","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532678","url":null,"abstract":"Hyperspectral unmixing is an important technique for identifying the constituent spectra and estimating their corresponding fractions in an image. Nonnegative Matrix Factorization (NMF) has recently been widely used for hyperspectral unmixing. However, due to the complex distribution of hyperspectral data, most existing NMF algorithms cannot adequately reflect the intrinsic relationship of the data. In this paper, we propose a novel method, Structured Discriminative Nonnegative Matrix Factorization (SDNMF), to preserve the structural information of hyperspectral data. This is achieved by introducing structured discriminative regularization terms to model both local affinity and distant repulsion of observed spectral responses. Moreover, considering that the abundances of most materials are sparse, a sparseness constraint is also introduced into SDNMF. Experimental results on both synthetic and real data have validated the effectiveness of the proposed method which achieves better unmixing performance than several alternative approaches.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"113 1","pages":"1848-1852"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86225580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2016-09-01 | DOI: 10.1109/ICIP.2016.7532697
Utkarsh Gaur, M. Kourakis, E. Newman-Smith, William C. Smith, B. S. Manjunath
Segmentation is a key component of many biomedical image processing systems. Recently, segmentation methods based on supervised learning, such as deep convolutional networks, have enjoyed immense success on natural and biological image datasets alike. However, these methods require large volumes of data to avoid overfitting, which limits their applicability. In this work, we present a transfer learning mechanism based on active learning that allows us to use pre-trained deep networks to segment new domains with limited labelled data. We introduce a novel optimization criterion that solicits feedback on the most uncertain yet abundant image patterns, keeping an expert in the loop with a minimal amount of guidance. Our experiments demonstrate the effectiveness of the proposed method in improving segmentation performance with very limited labelled data.
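One common way to trade off "most uncertain" against "abundant" is density-weighted uncertainty sampling. The sketch below is a generic stand-in for the paper's criterion, scoring each sample by its prediction entropy times a cosine-similarity density, with `beta` an assumed knob:

```python
import numpy as np

def select_queries(probs, features, k=2, beta=1.0):
    """Rank unlabelled samples by (binary prediction entropy) *
    (mean cosine similarity to the pool)^beta, so the queries sent to the
    expert are both uncertain and representative of abundant patterns.
    Returns indices of the top-k samples."""
    p = np.clip(probs, 1e-9, 1 - 1e-9)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    density = (f @ f.T).mean(axis=1)          # how "typical" each sample is
    score = entropy * density ** beta
    return np.argsort(score)[::-1][:k]
```

Pure uncertainty sampling tends to query outliers; the density factor steers the annotation budget toward patterns that actually recur in the new domain.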
{"title":"Membrane segmentation via active learning with deep networks","authors":"Utkarsh Gaur, M. Kourakis, E. Newman-Smith, William C. Smith, B. S. Manjunath","doi":"10.1109/ICIP.2016.7532697","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532697","url":null,"abstract":"Segmentation is a key component of several bio-medical image processing systems. Recently, segmentation methods based on supervised learning such as deep convolutional networks have enjoyed immense success for natural image datasets and biological datasets alike. These methods require large volumes of data to avoid overfitting which limits their applicability. In this work, we present a transfer learning mechanism based on active learning which allows us to utilize pre-trained deep networks for segmenting new domains with limited labelled data. We introduce a novel optimization criterion to allow feedback on the most uncertain, yet abundant image patterns thus provisioning for an expert in the loop albeit with minimum amount of guidance. Our experiments demonstrate the effectiveness of the proposed method in improving segmentation performance with very limited labelled data.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"81 1","pages":"1943-1947"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83140227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}