
2015 IEEE International Conference on Multimedia and Expo (ICME): Latest Papers

Learning class-specific pooling shapes for image classification
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177433
Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao
Spatial pyramid (SP) representation is an extension of the bag-of-features model that embeds the spatial layout of local features by pooling feature codes over pre-defined spatial shapes. However, the uniform spatial pooling shapes used in standard SP are ad hoc and lack theoretical motivation, so they generalize poorly to the different distributions of geometric properties across image classes. In this paper, we propose a data-driven approach to adaptively learn class-specific pooling shapes (CSPS). Specifically, we first establish an over-complete set of spatial shapes that provides candidates with more flexible geometric patterns. The optimal subset for each class is then selected by training a linear classifier with a structured sparsity constraint and color distribution cues. To further enhance the robustness of our model, the representations over CSPS are compressed according to shape importance and finally fed to an SVM with a multi-shape matching kernel for classification. Experimental results on three challenging datasets (Caltech-256, Scene-15 and Indoor-67) demonstrate the effectiveness of the proposed method on both object and scene images.
Citations: 2
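The standard spatial pyramid pooling that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of uniform-grid max pooling only, not the CSPS method itself; the function name, the `(x, y, code)` feature layout, the default levels, and the assumption that feature codes are non-negative are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical feature layout: (x, y, code), with x and y normalized to [0, 1).
Feature = Tuple[float, float, Sequence[float]]


def spatial_pyramid_max_pool(features: List[Feature], code_dim: int,
                             levels: Sequence[int] = (1, 2)) -> List[float]:
    """Max-pool feature codes over every cell of a uniform n x n grid per level.

    Assumes non-negative codes, so 0.0 is a safe initial value for the maximum.
    """
    pooled: List[float] = []
    for n in levels:
        cells = [[0.0] * code_dim for _ in range(n * n)]
        for x, y, code in features:
            col = min(int(x * n), n - 1)  # clamp in case x == 1.0
            row = min(int(y * n), n - 1)
            cell = cells[row * n + col]
            for k in range(code_dim):
                cell[k] = max(cell[k], code[k])
        # Concatenate the pooled vectors of all cells at this level.
        for cell in cells:
            pooled.extend(cell)
    return pooled
```

With levels `(1, 2)` and two-dimensional codes, the output has `(1 + 4) * 2 = 10` entries. A learned, class-specific scheme would replace the fixed grid cells with shapes selected per class.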
Shape description using phase-preserving Fourier descriptor
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177425
E. Sokic, S. Konjicija
Contour-based Fourier descriptors are established as a simple and effective shape description method for content-based image retrieval. In order to achieve invariance under rotation and starting point change, most Fourier descriptor implementations disregard the phase of the Fourier coefficients. We introduce a novel method for extracting Fourier descriptors, which preserve the phase of Fourier coefficients and have the desired invariance. We propose specific points, called pseudomirror points, to be used as shape orientation reference. Experimental results indicate that the proposed method significantly outperforms other Fourier descriptor based techniques.
Citations: 4
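The conventional approach that the abstract contrasts itself with, discarding the phase of the Fourier coefficients to gain invariance, can be sketched as below. This shows only that magnitude-based baseline, not the pseudomirror-point method; the function names, the choice of normalizing by the first harmonic, and the number of coefficients kept are assumptions made for illustration.

```python
import cmath
from typing import List


def dft(points: List[complex]) -> List[complex]:
    """Plain O(n^2) discrete Fourier transform of a closed contour."""
    n = len(points)
    return [sum(p * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, p in enumerate(points))
            for k in range(n)]


def magnitude_descriptor(contour: List[complex], n_coeffs: int = 4) -> List[float]:
    """Phase-free Fourier descriptor of a contour given as complex x + iy samples.

    Skipping coefficient 0 removes translation, dividing by |c_1| removes scale,
    and taking magnitudes removes rotation and starting-point dependence.
    """
    coeffs = dft(contour)
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + n_coeffs)]
```

Rotating the contour multiplies every coefficient by the same unit-modulus factor, and shifting the starting point by m samples multiplies coefficient k by exp(2πikm/n), so the magnitudes are unchanged in both cases. The price is that all phase information is lost, which is the limitation the abstract addresses.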
3D ear identification using LC-KSVD and local histograms of surface types
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177475
Lida Li, Lin Zhang, Hongyu Li
In this paper, we propose a novel 3D ear classification scheme that makes use of the label consistent K-SVD (LC-KSVD) framework. As an effective supervised dictionary learning algorithm, LC-KSVD simultaneously learns a compact discriminative dictionary for sparse coding and a multi-class linear classifier. To use LC-KSVD, one key issue is how to extract feature vectors from 3D ear scans. To this end, we propose a scheme based on block-wise statistics. Specifically, we divide a 3D ear ROI into blocks and extract a histogram of surface types from each block; histograms from all blocks are concatenated to form the desired feature vector. Feature vectors extracted in this way are highly discriminative and are robust to minor misalignment. Experimental results demonstrate that the proposed approach achieves much better recognition accuracy than other state-of-the-art methods. More importantly, its computational complexity is extremely low at the classification stage.
Citations: 4
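The block-wise feature extraction described in this abstract (divide the ROI into blocks, histogram the surface types in each block, concatenate) can be sketched as follows. The sketch assumes the ROI has already been converted into a 2D grid of integer surface-type labels; how those labels are computed is not stated in the abstract, and the function name and parameters are hypothetical.

```python
from typing import List


def blockwise_histogram(labels: List[List[int]], block: int,
                        n_types: int) -> List[int]:
    """Concatenate per-block histograms of integer surface-type labels.

    `labels` is assumed to be a rectangular grid with values in [0, n_types).
    Blocks on the lower and right borders may be smaller than `block`.
    """
    rows, cols = len(labels), len(labels[0])
    feature: List[int] = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            hist = [0] * n_types
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, min(c0 + block, cols)):
                    hist[labels[r][c]] += 1
            feature.extend(hist)
    return feature
```

Because each histogram ignores where a label falls inside its block, small shifts of the ROI change the feature vector only slightly, which is consistent with the robustness to misalignment claimed in the abstract.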
Performance evaluation of the 1st and 2nd generation Kinect for multimedia applications
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177380
Simone Zennaro, Matteo Munaro, S. Milani, P. Zanuttigh, A. Bernardi, S. Ghidoni, E. Menegatti
Microsoft Kinect played a key role in the development of consumer depth sensors, being the device that brought depth acquisition to the mass market. Despite the success of this sensor, with the introduction of the second generation Microsoft completely changed the underlying technology from structured light to time-of-flight. This paper compares the data provided by the first- and second-generation Kinect in order to explain what has been gained from the switch of technology. After a detailed analysis of the accuracy of the two sensors under different conditions, two sample applications, 3D reconstruction and people tracking, are presented and used to compare the performance of the two sensors.
Citations: 125
Object tracking using structure-aware binary features
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177407
Haoyu Ren, Ze-Nian Li
Object tracking is one of the most important components in numerous applications of computer vision. In this paper, the target is represented by a series of binary patterns, where each binary pattern consists of several rectangle pairs of variable size and location. To complement traditional binary descriptors, these patterns are extracted in both the intensity domain and the gradient domain. In the tracking process, the RealAdaBoost algorithm is applied frame by frame to select meaningful patterns while considering both discriminative ability and robustness. This is achieved by a penalty term based on the classification margin and structural diversity. As a result, features that describe the target well and are robust to noise are selected. Experimental results on 10 challenging video sequences demonstrate that tracking accuracy is significantly improved compared to traditional binary descriptors. The method also achieves results competitive with commonly used algorithms.
Citations: 1
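One plausible reading of "binary patterns built from rectangle pairs" is sketched below: each pair contributes one bit, set when the first rectangle is brighter on average than the second. The abstract does not specify the comparison rule, so this rule, the `Rect` layout, and the function names are assumptions for illustration, and the boosting-based pattern selection is not shown.

```python
from typing import List, Tuple

# Hypothetical rectangle layout: (top, left, height, width) in pixels.
Rect = Tuple[int, int, int, int]


def rect_mean(img: List[List[float]], rect: Rect) -> float:
    """Mean value of the image inside a rectangle."""
    top, left, h, w = rect
    total = sum(img[r][c]
                for r in range(top, top + h)
                for c in range(left, left + w))
    return total / (h * w)


def binary_pattern(img: List[List[float]],
                   pairs: List[Tuple[Rect, Rect]]) -> int:
    """Pack one bit per rectangle pair; the first pair is the highest bit."""
    code = 0
    for a, b in pairs:
        bit = 1 if rect_mean(img, a) > rect_mean(img, b) else 0
        code = (code << 1) | bit
    return code
```

Running the same function on a gradient-magnitude image instead of raw intensities would give the gradient-domain variant mentioned in the abstract.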
A robust real time system for remote heart rate measurement via camera
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177484
D. N. Tran, Hyukzae Lee, Changick Kim
Heart rate (HR) is an important indicator of human health status. Traditional heart rate measurement methods rely on contact-based sensors or electrodes, which are inconvenient and troublesome for users. Remote sensing of the photoplethysmography (PPG) signal using a video camera provides a promising means of monitoring people's vital signs without any physical contact. However, until recently, most papers addressing this problem have only reported results on videos recorded off-line in well-controlled environments. In this paper, we propose a method to improve HR measurement accuracy under challenging conditions involving factors such as subject movement, complicated facial appearance (e.g., hair, glasses, beards), the subject's distance to the camera, and low illumination. We also build a framework for a real-time measurement system and construct a stable model for recording and displaying results for long-term heart rate monitoring. We tested our system on a challenging dataset and demonstrated that our method not only handles real-time, on-line measurement tasks but also outperforms prior work.
Citations: 27
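A common final step in camera-based heart rate estimation is to pick the dominant frequency of the PPG trace within a plausible heart rate band. The sketch below shows only that step, under the assumption that a one-dimensional trace (for example, a mean skin-pixel intensity per frame) has already been extracted; the band limits of 0.7 to 4.0 Hz, the function name, and the parameters are illustrative assumptions, not details taken from the paper.

```python
import cmath
from typing import List


def estimate_bpm(signal: List[float], fs: float,
                 lo_hz: float = 0.7, hi_hz: float = 4.0) -> float:
    """Return the dominant in-band frequency of `signal`, in beats per minute.

    `fs` is the sampling (frame) rate in Hz. Returns 0.0 if no DFT bin
    falls inside the band [lo_hz, hi_hz].
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component
    best_k, best_power = 0, -1.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if not (lo_hz <= freq <= hi_hz):
            continue
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return 60.0 * best_k * fs / n
```

The frequency resolution is `fs / n`, so a 10-second window at 30 frames per second resolves heart rate in steps of 6 beats per minute; longer windows or interpolation around the peak would refine this.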
EMG based rehabilitation systems - approaches for ALS patients in different stages
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177398
Yu-Lin Wang, A. Su, Tseng-Ying Han, Ching-Lun Lin, Ling-Chi Hsu
Patients suffering from amyotrophic lateral sclerosis (ALS) are encouraged to exercise routinely to prevent or delay the paralysis of muscles. This work proposes an electromyogram (EMG) based ALS rehabilitation system that works through playing computer games. A multi-channel EMG measuring system and control interfaces to computer games were developed. The controls are designed around different muscles according to the symptoms of disability. For ALS patients in the early stage, the EMG electrodes were placed on the forearm to detect finger gestures; for patients in the middle stage, the EMG signals of the upper extremity were used to detect hand gestures and arm movement; for the late stage, the EMG electrodes were placed on the chin to detect facial expressions. A commercial video game as well as a self-modified computer game are used in our rehabilitation systems. We believe that patients prefer to exercise in the form of entertainment.
Citations: 9
GOMES: A group-aware multi-view fusion approach towards real-world image clustering
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177392
Zhe Xue, Guorong Li, Shuhui Wang, Chunjie Zhang, W. Zhang, Qingming Huang
Different features describe different views of visual appearance; multi-view methods can integrate the information contained in each view and improve image clustering performance. Most existing methods assume that the importance of one type of feature is the same for all data. However, the visual appearance of images differs, so the descriptive ability of different features varies from image to image. To solve this problem, we propose a group-aware multi-view fusion approach. Images are partitioned into groups, each consisting of several images that share a similar visual appearance. We assign different weights to evaluate the pairwise similarity between different groups. The clustering results and the fusion weights are then learned by an iterative optimization procedure. Experimental results indicate that our approach achieves promising clustering performance compared with existing methods.
Citations: 19
Compound image compression using lossless and lossy LZMA in HEVC
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177430
Cuiling Lan, Jizheng Xu, Wenjun Zeng, Feng Wu
We present a compound image compression scheme based on the dictionary-based Lempel-Ziv-Markov chain algorithm (LZMA), under the framework of High Efficiency Video Coding (HEVC). By matching strings from the sliding window dictionary, LZMA exploits the repeated patterns in the text and graphics regions of compound images and represents them compactly. To obtain high compression efficiency even for noisy text and graphics content, we have modified LZMA to support both lossless and lossy compression. We develop and treat it as a new intra mode of HEVC. Experimental results show that the proposed scheme achieves significant coding gains for compound image compression. Thanks to the introduction of lossy LZMA, the compression performance for noisy compound images is improved by more than 5 dB in terms of PSNR in comparison with the lossless LZMA scheme.
Citations: 8
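The sliding-window string matching mentioned in this abstract can be illustrated with a toy LZ77-style coder. This is a deliberately simplified stand-in: real LZMA adds range coding and context modeling on top of the match search, and neither the lossy variant nor the HEVC integration is shown. The token layout, window size, and function names are assumptions for the example.

```python
from typing import List, Tuple

# Hypothetical token layout: (offset back into the window, match length, next literal).
Token = Tuple[int, int, str]


def lz77_encode(text: str, window: int = 32) -> List[Token]:
    """Greedy longest-match search over a sliding window of previous symbols."""
    tokens: List[Token] = []
    i = 0
    while i < len(text):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one symbol short of the end so a literal always follows.
            while (i + length < len(text) - 1
                   and text[j + length] == text[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        tokens.append((best_off, best_len, text[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decode(tokens: List[Token]) -> str:
    """Rebuild the text by copying matches from the already-decoded output."""
    out: List[str] = []
    for off, length, literal in tokens:
        start = len(out) - off
        for k in range(length):
            out.append(out[start + k])  # a match may overlap the output it creates
        out.append(literal)
    return "".join(out)
```

Repeated patterns, such as identical glyphs in rendered text, collapse into a few long matches, which is why this family of coders suits the text and graphics regions of compound images.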
3D video coding using motion information and depth map
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177431
Fei Cheng, Jimin Xiao, T. Tillo
In this paper, a motion-information-based 3D video coding method is proposed for the texture-plus-depth 3D video format. The synchronized global motion information of the camcorder is sampled to help the encoder improve its rate-distortion performance. The approach works by projecting temporally previous frames into the position of the current frame using the depth and motion information. These projected frames are added to the reference buffer as virtual reference frames. Because these virtual reference frames are more similar to the current frame than conventional reference frames, the required residual information is reduced. The experimental results demonstrate that the proposed scheme enhances coding performance under various motion conditions, including rotational and translational motion.
Citations: 1
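The idea of projecting a previous frame into the current camera position using depth and motion information can be sketched per pixel with a pinhole camera model. This sketch handles translational camera motion only (the abstract also covers rotation, which would add a rotation matrix before the translation step); the intrinsics `focal`, `cx`, `cy`, the function name, and the sign convention for the translation are assumptions, not details from the paper.

```python
from typing import Tuple


def reproject_pixel(u: float, v: float, depth: float,
                    focal: float, cx: float, cy: float,
                    translation: Tuple[float, float, float]) -> Tuple[float, float]:
    """Map a pixel of the previous frame to its position in the current frame.

    `translation` is the camera displacement between the two frames, expressed
    in the previous camera's coordinates and in the same units as `depth`.
    """
    # Back-project the pixel to a 3D point in the previous camera frame.
    x = (u - cx) * depth / focal
    y = (v - cy) * depth / focal
    z = depth
    # Express the point relative to the moved camera (no rotation assumed).
    tx, ty, tz = translation
    x, y, z = x - tx, y - ty, z - tz
    # Project onto the image plane of the current frame.
    return focal * x / z + cx, focal * y / z + cy
```

Applying this mapping to every pixel of a previous frame yields a warped image of the kind that could serve as a virtual reference frame; occlusions and holes would need separate handling.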