
Latest publications from the 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA)

Efficient Coding of Depth Map by Exploiting Temporal Correlation
Shampa Shahriyar, M. Murshed, Mortuza Ali, M. Paul
With the growing demand for 3D and multi-view video content, efficient depth data coding has become a vital issue in the image and video coding area. In this paper, we propose a simple depth coding scheme that uses multiple prediction modes to exploit the temporal correlation of the depth map. Current depth coding techniques mostly depend on intra-coding modes, which cannot take advantage of the temporal redundancy in depth maps or the higher spatial redundancy in inter-predicted depth residuals. Depth maps are characterized by smooth regions with sharp edges that play an important role in the view synthesis process. As depth maps are more sensitive to coding errors, the use of transforms or the approximation of edges by explicit edge modelling has an impact on view synthesis quality. Moreover, lossy compression of the depth map introduces additional geometric distortion into the synthesized view. In this paper, we demonstrate that encoding inter-coded depth block residuals with quantization in the pixel domain is more efficient than intra-coding techniques relying on explicit edge preservation. On standard 3D video sequences, the proposed depth coding achieves superior image quality of synthesized views compared with the new 3D-HEVC standard at depth-map bit rates of 0.25 bpp or higher.
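The core idea of pixel-domain residual coding can be sketched as follows. This is a minimal illustrative example, not the authors' codec: a depth block is inter-predicted from the co-located block of the previous frame, and the residual is uniformly quantized per pixel (the quantization step and all sample values are invented for the sketch).

```python
# Toy sketch: inter-prediction of a depth block followed by uniform
# quantization of the residual in the pixel domain (no transform).
def encode_block(current, reference, step):
    """Quantize the inter-predicted residual (current - reference)."""
    return [round((c - r) / step) for c, r in zip(current, reference)]

def decode_block(indices, reference, step):
    """Reconstruct the block from the quantized residual indices."""
    return [r + q * step for q, r in zip(indices, reference)]

reference = [50, 50, 50, 200, 200]   # co-located block in the previous frame
current   = [52, 51, 50, 203, 199]   # smooth regions, sharp edge preserved
indices = encode_block(current, reference, step=2)
recon = decode_block(indices, reference, step=2)
# reconstruction error is bounded by half the quantization step
assert all(abs(c - r) <= 1 for c, r in zip(current, recon))
```

Because the prediction already captures the (temporally stable) sharp edge, the residual is small and cheap to code, which is the intuition the abstract appeals to.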
DOI: 10.1109/DICTA.2014.7008105
Citations: 12
An Evaluation of Sparseness as a Criterion for Selecting Independent Component Filters, When Applied to Texture Retrieval
Nabeel Mohammed, D. Squire
In this paper, we evaluate the utility of sparseness as a criterion for selecting a subset of independent component filters (ICF). Four sparseness measures were presented more than a decade ago by Le Borgne et al., but have since been ignored for ICF selection. We present our evaluation in the context of texture retrieval. We compare the sparseness-based method with the dispersal-based method, also proposed by Le Borgne et al., and with the clustering-based method previously proposed by us. We show that sparse filters and highly dispersed filters are quite different; in fact, highly dispersed filters tend to have lower sparseness. We also show that sparse filters give better results than highly dispersed filters when applied to texture retrieval. However, the sparseness measures are calculated over filter response energies, making this method susceptible to choosing a redundant filter set. This issue is demonstrated, and we show that ICF selected using our clustering-based method, which chooses a filter set with much lower redundancy, outperforms the sparse filters.
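For concreteness, a sparseness measure over filter response energies can be computed as below. This sketch uses Hoyer's L1/L2 sparseness as an illustrative stand-in; it is not necessarily one of the four measures of Le Borgne et al., and the response-energy vectors are invented.

```python
import math

def hoyer_sparseness(v):
    """Hoyer's sparseness: 1 for a one-hot vector, 0 for a uniform one."""
    n = len(v)
    l1 = sum(abs(x) for x in v)
    l2 = math.sqrt(sum(x * x for x in v))
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

uniform = [1.0, 1.0, 1.0, 1.0]   # dispersed filter response energies
peaked  = [4.0, 0.0, 0.0, 0.0]   # sparse response: one dominant energy
assert hoyer_sparseness(peaked) > hoyer_sparseness(uniform)
```

A measure of this kind ranks filters individually, which is exactly why it can admit a redundant set: two near-identical filters can both score highly.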
DOI: 10.1109/DICTA.2014.7008095
Citations: 0
A Biologically Inspired Facilitation Mechanism Enhances the Detection and Pursuit of Targets of Varying Contrast
Zahra M. Bagheri, S. Wiederman, B. Cazzolato, S. Grainger, D. O’Carroll
Many species of flying insects detect and chase prey or conspecifics within a visually cluttered surround, e.g. for predation, territorial or mating behavior. We modeled such detection and pursuit for small moving targets, and tested it within a closed-loop, virtual reality flight arena. Our model is inspired directly by electrophysiological recordings from 'small target motion detector' (STMD) neurons in the insect brain that are likely to underlie this behavioral task. The front-end uses a variant of a biologically inspired 'elementary' small target motion detector (ESTMD), elaborated to detect targets in natural scenes of both contrast polarities (i.e. both dark and light targets). We also include an additional model for the recently identified physiological 'facilitation' mechanism believed to form the basis for selective attention in insect STMDs, and quantify the improvement this provides for pursuit success and target discriminability over a range of target contrasts.
DOI: 10.1109/DICTA.2014.7008082
Citations: 4
Image Segmentation Using Dictionary Learning and Compressed Random Features
Geoff Bull, Junbin Gao, M. Antolovich
Image segmentation seeks to partition the pixels in images into distinct regions to assist other image processing functions such as object recognition. Over the last few years dictionary learning methods have become very popular for image processing tasks such as denoising, and recently structured low rank dictionary learning has been shown to be capable of promising results for recognition tasks. This paper investigates the suitability of dictionary learning for image segmentation. A structured low rank dictionary learning algorithm is developed to segment images using compressed sensed features from image patches. To enable a supervised learning approach, classes of pixels in images are designated using training scribbles. A classifier is then learned from these training pixels and subsequently used to classify all other pixels in the images to form the segmentations. A number of dictionary learning models are compared together with K-means/nearest neighbour and support vector machine classifiers.
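The supervised step, learning from scribble-labelled pixels and then classifying every remaining pixel, can be sketched with a nearest-neighbour classifier as a stand-in. This is not the paper's structured low-rank dictionary model; the raw RGB features, labels, and pixel values are invented for the sketch.

```python
# Toy sketch: pixels labelled by training scribbles train a classifier
# that then labels every other pixel to form the segmentation.
def nearest_neighbour_label(pixel, scribbles):
    """scribbles: list of (feature_vector, label) pairs from user scribbles."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(scribbles, key=lambda s: dist(s[0], pixel))[1]

scribbles = [((255, 0, 0), "object"), ((0, 0, 255), "background")]
image = [(250, 10, 5), (3, 2, 240), (200, 30, 30)]   # three RGB pixels
segmentation = [nearest_neighbour_label(p, scribbles) for p in image]
assert segmentation == ["object", "background", "object"]
```

The paper's contribution is in the feature and classifier choices plugged into this pipeline (compressed random features, structured low-rank dictionaries, SVMs); the pipeline shape itself is as above.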
DOI: 10.1109/DICTA.2014.7008112
Citations: 2
Discriminative Multi-Task Sparse Learning for Robust Visual Tracking Using Conditional Random Field
B. Bozorgtabar, Roland Göcke
In this paper, we propose a discriminative multi-task sparse learning scheme for object tracking in a particle filter framework. By representing each particle as a linear combination of adaptive dictionary templates, we utilise the correlations among different particles (tasks) to obtain a better representation and a more efficient scheme than learning each task individually. However, this model is completely generative, and the designed tracker may not be robust enough to prevent drifting in the presence of rapid appearance changes. We therefore use a Conditional Random Field (CRF) together with the multi-task sparse model to distinguish object candidates from background particle candidates. In this way, the number of particle samples is reduced significantly while the tracker is made more robust. The proposed algorithm is evaluated on 11 challenging sequences, and the results confirm the effectiveness of the approach, which significantly outperforms state-of-the-art trackers on accuracy measures including centre location error and overlap ratio.
DOI: 10.1109/DICTA.2014.7008102
Citations: 0
A Blind and Robust Video Watermarking Scheme Using Chrominance Embedding
Md. Asikuzzaman, M. Alam, A. Lambert, M. Pickering
Piracy of digital movies is a significant threat for movie studios and producers. Digital video watermarking is an important technique that can be used to protect the content. In existing watermarking algorithms, the robustness of the watermark to several attacks has been improved. However, none of the existing techniques is robust to a combination of the common geometric distortions of scaling, rotation, and cropping with other attacks. In this paper, we propose a blind video watermarking algorithm in which the watermark is embedded into both chrominance channels using a dual-tree complex wavelet transform. Embedding the watermark into the chrominance channels maintains the original video quality, and the dual-tree complex wavelet transform ensures robustness to geometric attacks due to its shift-invariance. The watermark is extracted using information from a single frame, without the original frame, which makes this approach robust to temporal synchronization attacks such as frame dropping and frame-rate change. This approach is also robust to downscaling to arbitrary resolutions, aspect-ratio change, compression, and camcording.
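A toy version of blind embedding and extraction illustrates why no original frame is needed. Here plain chrominance samples stand in for the paper's dual-tree complex wavelet coefficients, and a shared pseudo-random pattern carries one watermark bit; all parameters and values are invented for the sketch.

```python
import random

# Toy additive spread-spectrum sketch: one bit embedded into chrominance
# samples; blind extraction correlates with the shared pattern only.
def embed(chroma, bit, pattern, alpha=5.0):
    sign = 1 if bit else -1
    return [c + sign * alpha * p for c, p in zip(chroma, pattern)]

def extract(chroma, pattern):
    """Blind: no original frame; the pattern alone decides the bit."""
    corr = sum(c * p for c, p in zip(chroma, pattern))
    return 1 if corr > 0 else 0

rng = random.Random(0)
pattern = [rng.choice([-1, 1]) for _ in range(256)]   # shared secret
chroma = [rng.uniform(-10.0, 10.0) for _ in range(256)]  # host samples
assert extract(embed(chroma, 1, pattern), pattern) == 1
assert extract(embed(chroma, 0, pattern), pattern) == 0
```

The host content is (nearly) uncorrelated with the pattern, so the embedded term dominates the correlation sum; the paper achieves its geometric robustness by doing this in a shift-invariant transform domain rather than on raw samples.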
DOI: 10.1109/DICTA.2014.7008083
Citations: 14
Latent Semantic Association for Medical Image Retrieval
Fan Zhang, Yang Song, Sidong Liu, Sonia Pujol, R. Kikinis, D. Feng, Weidong (Tom) Cai
In this work, we propose a Latent Semantic Association Retrieval (LSAR) method to break the bottleneck of low-level-feature-based medical image retrieval. The method constructs high-level semantic correlations among patients based on the low-level feature set extracted from the images. Specifically, a Pair-LDA model is first designed to refine the topic generation process of traditional Latent Dirichlet Allocation (LDA) by generating the topics in a pair-wise context. Then, a latent association, called CCA-Correlation, is extracted to capture the correlations among images in the Pair-LDA topic space based on Canonical Correlation Analysis (CCA). Finally, we calculate the similarity between images using the derived CCA-Correlation model and apply it to medical image retrieval. To evaluate the effectiveness of our method, we conduct retrieval experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) baseline cohort of 331 subjects, and our method achieves a clear improvement over state-of-the-art medical image retrieval methods. LSAR is independent of the problem domain and is thus generally applicable to other medical or general image analysis tasks.
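The final retrieval step, ranking database subjects by similarity to a query in a topic space, can be sketched as follows. Plain cosine similarity stands in here for the learned CCA-Correlation model, and the topic vectors are invented for the sketch.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topic-space representations."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

# topic-space representation of a query subject and two database subjects
query = [0.8, 0.1, 0.1]
database = {"subject_a": [0.7, 0.2, 0.1], "subject_b": [0.1, 0.1, 0.8]}
ranked = sorted(database, key=lambda k: cosine(query, database[k]),
                reverse=True)
assert ranked[0] == "subject_a"   # most similar topic profile ranks first
```

Whatever similarity model is plugged in, retrieval reduces to this kind of ranking over learned representations.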
DOI: 10.1109/DICTA.2014.7008114
Citations: 4
A Robust Framework for 2D Human Pose Tracking with Spatial and Temporal Constraints
Jinglan Tian, Ling Li, Wanquan Liu
We work on the task of 2D articulated human pose tracking in monocular image sequences, an extremely challenging task due to background clutter, variation in body appearance, occlusion, and imaging conditions. Most current approaches only model simple appearance and adjacent body part dependencies, in particular the Gaussian tree-structured priors assumed over body part connections. Such a prior makes the part connections independent of the image evidence, which in turn severely limits accuracy. Building on the successful pictorial structures model, we propose a novel framework combining an image-conditioned model that incorporates higher-order dependencies of multiple body parts. To establish the conditioning variables, we employ effective poselet features. In addition, we introduce a full-body detector as the first step of our framework to reduce the search space for pose tracking. We evaluate our framework on two challenging image sequences and conduct a series of comparison experiments against two other approaches. The results illustrate that the proposed framework outperforms state-of-the-art 2D pose tracking systems.
DOI: 10.1109/DICTA.2014.7008091
Citations: 4
Efficient People Counting with Limited Manual Interferences
Jingsong Xu, Qiang Wu, Jian Zhang, B. Silk, Gia Thuan Ngo, Zhenmin Tang
People counting is a topic with various practical applications. Over the last decade, two general approaches have been proposed to tackle this problem: (a) counting based on individual human detection; (b) counting by measuring the regression relation between crowd density and the number of people. Because the regression-based method avoids explicit people detection, which faces several well-known challenges, it is considered a robust method, particularly in complicated environments. An efficient regression-based method is proposed in this paper, which can be readily adopted into any existing video surveillance system. It uses color-based segmentation to extract foreground regions in images, and a regression is established between the foreground density and the number of people. The method is fast and can deal with lighting-condition changes. Experiments on public datasets and one captured dataset have shown the effectiveness and robustness of the method.
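The regression step can be sketched as a one-variable least-squares fit between foreground density and annotated counts. The closed-form fit and the training values below are illustrative; the paper does not specify this exact regressor.

```python
# Minimal sketch of the regression step: relate foreground density
# (fraction of foreground pixels) to the number of people in the frame.
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

densities = [0.01, 0.02, 0.04, 0.08]   # foreground fraction per frame
counts    = [1.0, 2.0, 4.0, 8.0]       # manually annotated people counts
a, b = fit_line(densities, counts)
estimate = a * 0.05 + b                # predict the count for a new frame
assert abs(estimate - 5.0) < 1e-9
```

Once the line is fitted from a few annotated frames, every new frame only needs segmentation plus one multiply-add, which is what makes the approach fast.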
DOI: 10.1109/DICTA.2014.7008106
Citations: 2
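The pipeline described in the abstract above — foreground extraction followed by a density-to-count regression — can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a simple background-difference segmentation and a least-squares line stand in for whatever color-segmentation and regression models the paper actually uses.

```python
def foreground_density(frame, background, threshold=30):
    """Fraction of pixels whose color differs noticeably from a static
    background model. frame/background are lists of rows of (r, g, b) tuples."""
    total, fg = 0, 0
    for frow, brow in zip(frame, background):
        for (r, g, b), (br, bgr, bb) in zip(frow, brow):
            total += 1
            if abs(r - br) + abs(g - bgr) + abs(b - bb) > threshold:
                fg += 1
    return fg / total


def fit_count_regression(densities, counts):
    """Ordinary least squares fit: count ~ slope * density + intercept."""
    n = len(densities)
    mx, my = sum(densities) / n, sum(counts) / n
    sxx = sum((x - mx) ** 2 for x in densities)
    sxy = sum((x - mx) * (y - my) for x, y in zip(densities, counts))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_count(frame, background, slope, intercept):
    """Predict the number of people in a new frame."""
    return max(0, round(slope * foreground_density(frame, background) + intercept))
```

Training here amounts to collecting a few manually counted frames (the "limited manual interferences" of the title), fitting the line once, and then applying it to every subsequent frame.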
Experimental Study of Unsupervised Feature Learning for HEp-2 Cell Images Clustering
Yan Zhao, Zhimin Gao, Lei Wang, Luping Zhou
Automatic identification of HEp-2 cell images has received increasing research attention. Feature representations play a critical role in achieving good identification performance. Much recent work has focused on supervised feature learning; typical methods include the BoW model (based on hand-crafted features) and deep learning models (which learn hierarchical features). However, the labels required for supervised feature learning are labour-intensive and time-consuming to obtain: they are commonly annotated manually by specialists and are very expensive. In this paper, we therefore focus on unsupervised feature learning. We verify and compare the features of these two typical models by clustering. Experimental results show that the BoW model generally performs better than deep learning models. We also illustrate that BoW and deep learning models have complementary properties.
{"title":"Experimental Study of Unsupervised Feature Learning for HEp-2 Cell Images Clustering","authors":"Yan Zhao, Zhimin Gao, Lei Wang, Luping Zhou","doi":"10.1109/DICTA.2014.7008108","DOIUrl":"https://doi.org/10.1109/DICTA.2014.7008108","url":null,"abstract":"Automatic identification of HEp-2 cell images has received an increasing research attention. Feature representations play a critical role in achieving good identification performance. Much recent work has focused on supervised feature learning. Typical methods consist of BoW model (based on hand-crafted features) and deep learning model (learning hierarchical features). However, these labels used in supervised feature learning are very labour-intensive and time-consuming. They are commonly manually annotated by specialists and very expensive to obtain. In this paper, we follow this fact and focus on unsupervised feature learning. We have verified and compared the features of these two typical models by clustering. Experimental results show the BoW model generally perform better than deep learning models. Also, we illustrate BoW model and deep learning models have complementarity properties.","PeriodicalId":146695,"journal":{"name":"2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131746282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
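The comparison protocol in the abstract above — extract features, cluster them, and judge the representation by how well clusters match the true classes — can be sketched as follows. This is an illustrative stand-in, not the paper's pipeline: a crude intensity histogram plays the role of a BoW descriptor, a tiny k-means does the clustering, and cluster purity scores the result.

```python
from collections import Counter


def histogram_feature(image, bins=8):
    """A crude bag-of-intensities descriptor (stand-in for a learned BoW
    codebook). image is a list of rows of grey values in 0..255."""
    hist = [0] * bins
    for row in image:
        for px in row:
            hist[min(px * bins // 256, bins - 1)] += 1
    n = sum(hist)
    return [h / n for h in hist]


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic init (first k points as centers)."""
    centers = [list(p) for p in points[:k]]

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # Recompute each center as the mean of its cluster (keep old center if empty).
        centers = [
            [sum(dim) / len(c) for dim in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return [nearest(p) for p in points]


def purity(pred, truth):
    """Fraction of samples falling in the majority true class of their cluster."""
    total = 0
    for c in set(pred):
        members = [t for p, t in zip(pred, truth) if p == c]
        total += Counter(members).most_common(1)[0][1]
    return total / len(truth)
```

In the paper's setting, the feature vectors would instead come from a BoW pipeline or the activations of a deep network, and the purity (or a similar external index) of the resulting clusters indicates which representation separates the cell classes better.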
Journal
2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA)