
2015 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

Discontinuous seam cutting for enhanced video stitching
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177506
Jie Hu, Dong-Qing Zhang, H. H. Yu, Chang Wen Chen
Video stitching requires a proper seam cutting technique to decide the boundary of the sub-video volume cropped from the source videos. In theory, approaches such as 3D graph-cuts, which search the entire spatiotemporal volume for a cutting surface, should provide the best results. However, given the tremendous data size of a camera-array video source, the 3D graph-cuts algorithm is extremely resource-demanding and impractical. In this paper, we propose a sequential seam cutting scheme: a dynamic programming algorithm that scans the source videos frame by frame, updates the pixels' spatiotemporal constraints, and gradually builds the cutting surface with low space complexity. The proposed scheme features flexible seam finding conditions based on temporal and spatial coherence as well as salience. Experimental results show that by relaxing the seam continuity constraint, the proposed video stitching can better handle abrupt motion and sharp edges in the source, reduce stitching artifacts, and render enhanced visual quality.
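A minimal sketch of the frame-by-frame idea: a dynamic-programming seam for one frame of the overlap region, with a soft penalty pulling the seam toward the previous frame's seam. This is an illustrative toy in numpy under assumed inputs; the temporal-penalty form is a guess, and the paper's relaxed-continuity (discontinuous seam) mechanism is not shown.

```python
import numpy as np

def frame_seam(cost, prev_seam=None, t_weight=0.1):
    """One-frame vertical seam by dynamic programming.

    cost      : (H, W) per-pixel stitching cost over the overlap region.
    prev_seam : (H,) seam columns from the previous frame; an assumed
                linear penalty encourages temporal coherence.
    """
    H, W = cost.shape
    total = cost.astype(float)
    if prev_seam is not None:
        cols = np.arange(W)
        total += t_weight * np.abs(cols[None, :] - prev_seam[:, None])
    acc = total.copy()
    back = np.zeros((H, W), dtype=int)
    for y in range(1, H):                      # scan rows top to bottom
        for x in range(W):
            lo, hi = max(0, x - 1), min(W, x + 2)
            j = lo + int(np.argmin(acc[y - 1, lo:hi]))
            back[y, x] = j                     # best connected predecessor
            acc[y, x] += acc[y - 1, j]
    seam = np.empty(H, dtype=int)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(H - 2, -1, -1):             # backtrack the cheapest path
        seam[y] = back[y + 1, seam[y + 1]]
    return seam
```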
Cited by: 9
Mirror mirror on the wall… An intelligent multisensory mirror for well-being self-assessment
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177468
Yasmina Andreu, P. Castellano, S. Colantonio, G. Coppini, R. Favilla, D. Germanese, G. Giannakakis, D. Giorgi, M. Larsson, P. Marraccini, M. Martinelli, B. Matuszewski, Matijia Milanic, M. A. Pascali, M. Pediaditis, Giovanni Raccichini, L. Randeberg, O. Salvetti, T. Strömberg
The face reveals the health status of an individual through a combination of physical signs and facial expressions. The SEMEOTICONS project is translating the semeiotic code of the human face into computational descriptors and measures, automatically extracted from videos, images, and 3D scans of the face. SEMEOTICONS is developing a multisensory platform, in the form of a smart mirror, that looks for signs related to cardio-metabolic risk. The goal is to enable users to self-monitor their well-being status over time and improve their lifestyle via tailored user guidance. Building the multisensory mirror requires addressing significant scientific and technological challenges, from touch-less data acquisition to real-time processing and integration of multimodal data.
Cited by: 18
Towards active annotation for detection of numerous and scattered objects
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177524
Hang Su, Hua Yang, Shibao Zheng, Sha Wei, Yu Wang, Shuang Wu
Object detection is an active study area in the field of computer vision and image understanding. In this paper, we propose an active annotation algorithm for the detection of numerous and scattered objects in a view, e.g., hundreds of cells in microscopy images. In particular, object detection is implemented by classifying pixels into specific classes with graph-based semi-supervised learning and grouping neighboring pixels with the same label. Sample or seed selection is conducted based on a novel annotation criterion that minimizes the expected prediction error. The most informative samples are therefore annotated actively and subsequently propagated to the unlabeled samples via a pairwise affinity graph. Experimental results on two real-world datasets validate that the proposed scheme quickly reaches high-quality results and significantly reduces human effort.
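As a toy illustration of the graph-based semi-supervised step, the sketch below propagates a few annotated pixel labels over a pairwise affinity graph (numpy; the kernel, update rule, and parameters are assumptions, and the paper's seed-selection criterion that minimizes expected prediction error is not shown).

```python
import numpy as np

def propagate_labels(X, Y, labeled, sigma=1.0, mu=0.9, iters=100):
    """Label propagation over a pairwise affinity graph.

    X       : (n, d) per-pixel feature vectors.
    Y       : (n, c) one-hot labels; zero rows for unlabeled pixels.
    labeled : (n,) boolean mask of annotated pixels.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))         # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    S = W / W.sum(1, keepdims=True)            # row-stochastic transition
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = mu * S @ F + (1 - mu) * Y          # diffuse, pull toward labels
        F[labeled] = Y[labeled]                # clamp annotated pixels
    return F.argmax(1)                         # predicted class per pixel
```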
Cited by: 0
Locally regularized Anchored Neighborhood Regression for fast Super-Resolution
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177470
Junjun Jiang, Jican Fu, T. Lu, R. Hu, Zhongyuan Wang
The goal of learning-based image Super-Resolution (SR) is to generate a plausible and visually pleasing High-Resolution (HR) image from a given Low-Resolution (LR) input. The problem is dramatically under-constrained, so SR relies on examples or strong image priors to better reconstruct the missing HR image details. This paper addresses the problem of learning the mapping functions (i.e., projection matrices) between LR and HR images based on a dictionary of LR and HR examples. One recently proposed method, Anchored Neighborhood Regression (ANR) [1], provides state-of-the-art quality and is very fast. In this paper, we propose an improved variant of ANR, namely Locally regularized Anchored Neighborhood Regression (LANR), which utilizes locality-constrained regression in place of the ridge regression in ANR. LANR assigns different freedom to each neighbor dictionary atom according to its correlation to the input LR patch, so the learned projection matrices are much more flexible. Experimental results demonstrate that the proposed algorithm is efficient and outperforms state-of-the-art methods, e.g., by 0.1-0.4 dB over ANR in terms of PSNR.
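The core regression fits in a few lines. Below is a hedged numpy sketch of a per-anchor LANR-style projection: ANR's ridge penalty λI is replaced by a locality-adaptive diagonal penalty so that neighbors closer to the anchor get more freedom. The exact penalty weights here are an assumption for illustration, not the authors' formulation.

```python
import numpy as np

def lanr_projection(Nl, Nh, dists, lam=0.1):
    """Projection matrix for one anchor atom.

    Nl, Nh : (d_l, K) and (d_h, K) LR/HR neighbor atoms of the anchor.
    dists  : (K,) neighbor-to-anchor distances; closer atoms receive a
             smaller penalty (assumed normalization).
    ANR would use lam * np.eye(K) in place of the diagonal D below.
    """
    D = np.diag(dists / dists.max())
    P = Nh @ np.linalg.solve(Nl.T @ Nl + lam * D, Nl.T)
    return P                                   # (d_h, d_l)

# at test time, for an LR patch x assigned to this anchor:
# hr_patch = P @ x
```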
Cited by: 5
Two-dimensional digital water art creation on a non-absorbent hydrophilic surface
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177466
Pei-Shan Chen, Sai-Keung Wong, Wen-Chieh Lin
In this paper, we develop a physics-based approach that enables users to use a brush to manipulate water. Users can drip and drag water on a non-absorbent hydrophilic surface to create water artworks. We consider factors such as cohesive force and adhesive force to compute the motion of water. Water drops of different shapes can be formed. We also develop a converter for turning input pictures into water-styled pictures. Our system can be applied in advertisements, movies, games, and education.
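A toy per-particle force model in the spirit described above (cohesion among nearby water particles, adhesion toward the wetted surface). All constants and force forms are illustrative assumptions, not the paper's physics model.

```python
import numpy as np

def water_forces(pos, surface_pts, k_coh=0.5, k_adh=0.8, radius=1.0):
    """Cohesive + adhesive forces on 2D water particles.

    pos         : (n, 2) particle positions on the surface plane.
    surface_pts : (m, 2) samples of the wetted hydrophilic region.
    """
    forces = np.zeros_like(pos, dtype=float)
    for i in range(len(pos)):
        d = pos - pos[i]
        dist = np.linalg.norm(d, axis=1)
        near = (dist > 0) & (dist < radius)
        if near.any():
            # cohesion: pull toward the mean offset of nearby particles
            forces[i] += k_coh * d[near].mean(axis=0)
        # adhesion: pull toward the closest wetted-surface sample
        sd = surface_pts - pos[i]
        forces[i] += k_adh * sd[np.argmin(np.linalg.norm(sd, axis=1))]
    return forces
```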
Cited by: 4
Adaptive integration of depth and color for objectness estimation
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177498
Xiangyang Xu, L. Ge, Tongwei Ren, Gangshan Wu
The goal of objectness estimation is to predict, with high efficiency, a moderate number of proposals covering all possible objects in a given image. Most existing works solve this problem solely in conventional 2D color images. In this paper, we demonstrate that depth information can benefit the estimation as a complementary cue to color. After a detailed analysis of depth characteristics, we present an adaptively integrated description for generic objects that takes full advantage of both depth and color. With the proposed objectness description, ambiguous areas, especially highly textured regions in the original color maps, can be effectively discriminated. Meanwhile, object boundary areas can be further emphasized, which leads to a more powerful objectness description. To evaluate the performance of the proposed approach, we conduct experiments on two challenging datasets. The experimental results show that our objectness description is more powerful and effective than state-of-the-art alternatives.
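A minimal sketch of one plausible adaptive fusion rule: weight the depth cue by a confidence derived from local depth contrast, so depth dominates where it clearly separates a window from its surroundings. The confidence measure below is an assumption for illustration, not the paper's formulation.

```python
import numpy as np

def fused_objectness(color_score, depth_score, depth_patch):
    """Adaptively blend color- and depth-based objectness for one window.

    color_score, depth_score : scalar cues in [0, 1] for the window.
    depth_patch              : depth values inside the candidate window.
    """
    # assumed confidence: normalized depth contrast within the window
    conf = np.std(depth_patch) / (np.mean(depth_patch) + 1e-6)
    conf = float(np.clip(conf, 0.0, 1.0))
    return conf * depth_score + (1.0 - conf) * color_score
```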
Cited by: 8
Multimodal hypergraph learning for microblog sentiment prediction
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177477
Fuhai Chen, Yue Gao, Donglin Cao, R. Ji
Microblog sentiment analysis has attracted extensive research attention in the recent literature. However, most existing works focus mainly on the textual modality, ignoring visual information, which accounts for an ever-increasing share of how users express emotion. In this paper, we propose to employ a hypergraph structure to formulate textual, visual, and emoticon information jointly for sentiment prediction. The constructed hypergraph captures the similarities of tweets on different modalities: each vertex represents a tweet, and each hyperedge is formed by a "centroid" vertex and its k-nearest neighbors on one modality. Then, transductive inference is conducted to learn the relevance scores among tweets for sentiment prediction. In this way, both intra- and inter-modality dependencies are taken into consideration. Experiments conducted on over 6,000 microblog tweets demonstrate the superiority of our method, which reaches 86.77% accuracy, a 7% improvement over state-of-the-art methods.
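The hyperedge construction described above is easy to sketch: for each modality, every tweet acts once as a "centroid" whose hyperedge contains itself and its k nearest neighbors in that modality's feature space. A numpy toy under assumed inputs, with the transductive inference step omitted:

```python
import numpy as np

def build_incidence(features_by_modality, k=5):
    """Incidence matrix H of a multimodal hypergraph.

    features_by_modality : list of (n, d_m) arrays, one per modality
                           (e.g., textual, visual, emoticon features).
    Returns H of shape (n, n * num_modalities); H[v, e] = 1 iff
    vertex v belongs to hyperedge e.
    """
    n = features_by_modality[0].shape[0]
    edges = []
    for X in features_by_modality:
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        for i in range(n):
            col = np.zeros(n)
            col[np.argsort(d2[i])[: k + 1]] = 1.0  # centroid + k-NN
            edges.append(col)
    return np.stack(edges, axis=1)
```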
Cited by: 38
Learning compact binary codes via pairwise correlation reconstruction
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177488
Xiao-Jiao Mao, Yubin Yang, Ning Li
Due to the explosive growth of visual data and the urgent need for more efficient nearest neighbor search methods, hashing has been widely studied in recent years. However, in most available approaches, parameter optimization of the hash function is tightly coupled with the form of the function itself, which makes the optimization difficult and consequently hurts the similarity-preserving performance of the hashing. To address this issue, we propose a novel pairwise correlation reconstruction framework for learning compact binary codes flexibly. First, each data point is projected into a metric space and represented as a vector encoding the underlying local and global structure of the input space. The similarities of the data are then measured by the pairwise correlations of the learned vectors, represented as Euclidean distances. Afterwards, in order to preserve the similarities maximally, the optimal binary codes are learned by reconstructing the pairwise correlations. Experimental results on four commonly used benchmark datasets demonstrate that the proposed method achieves the best nearest neighbor search performance compared with state-of-the-art methods.
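The reconstruction idea lends itself to a tiny relaxed demo: learn real-valued codes whose pairwise squared distances match those of the learned vectors, then binarize by sign. Plain gradient descent on an assumed objective; this is only an illustration of "reconstructing pairwise correlations", not the paper's optimization.

```python
import numpy as np

def sq_dists(Z):
    """All pairwise squared Euclidean distances, shape (n, n)."""
    return ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)

def learn_codes(V, bits=16, lr=1e-3, iters=300, seed=0):
    """Relaxed binary-code learning by pairwise-distance reconstruction.

    V : (n, d) learned real-valued vectors whose pairwise distances
        encode the similarities to preserve.
    """
    rng = np.random.default_rng(seed)
    target = sq_dists(V)
    target = target / target.max() * bits        # bring to the code scale
    B = 0.1 * rng.standard_normal((len(V), bits))
    for _ in range(iters):
        err = sq_dists(B) - target               # (n, n) reconstruction error
        diff = B[:, None, :] - B[None, :, :]     # (n, n, bits)
        B -= lr * 4.0 * (err[:, :, None] * diff).sum(axis=1)
    return (B > 0).astype(np.uint8)              # compact binary codes
```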
Cited by: 0
Improving image fidelity by luma-assisted chroma subsampling
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177387
J. Korhonen
Chroma subsampling is commonly used in digital representations of images and video sequences. The basic rationale behind chroma subsampling is that the human visual system is less sensitive to color variations than to luma variations. Therefore, chroma data can be coded at a lower resolution than luma data without noticeable loss in perceived image quality. In this paper, we compare different upsampling methods for chroma data and show that advanced upsampling schemes can significantly improve the fidelity of the reconstructed image. We also present an adaptive upsampling method that uses full-resolution luma information to assist chroma upsampling. Experimental results show that in the presence of compression noise, the proposed technique steadily outperforms advanced non-assisted upsampling.
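One simple instance of luma-assisted upsampling, as a hedged sketch: reconstruct each full-resolution chroma sample as a blend of nearby low-resolution chroma values, weighted by how close their co-sited luma is to the target pixel's luma (a joint-bilateral-style rule; the paper's adaptive filter may differ). Assumes 4:2:0-style 2x subsampling and 8-bit luma.

```python
import numpy as np

def luma_guided_upsample(chroma_lr, luma_hr, sigma=12.0):
    """Upsample one chroma plane 2x using the full-resolution luma."""
    H, W = luma_hr.shape
    h, w = chroma_lr.shape
    out = np.empty((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            y0, x0 = y // 2, x // 2
            w_sum = v_sum = 0.0
            for dy in (0, 1):                    # 2x2 low-res neighborhood
                for dx in (0, 1):
                    yy, xx = min(y0 + dy, h - 1), min(x0 + dx, w - 1)
                    # co-sited luma of this low-res chroma candidate
                    l_nb = luma_hr[min(2 * yy, H - 1), min(2 * xx, W - 1)]
                    wt = np.exp(-(float(luma_hr[y, x]) - float(l_nb)) ** 2
                                / (2 * sigma ** 2))
                    w_sum += wt
                    v_sum += wt * chroma_lr[yy, xx]
            out[y, x] = v_sum / w_sum
    return out
```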
Cited by: 8
Joint learning for image-based handbag recommendation
Pub Date : 2015-06-01 DOI: 10.1109/ICME.2015.7177520
Yan Wang, Sheng Li, A. Kot
Fashion recommendation helps shoppers to find desirable fashion items, which facilitates online interaction and product promotion. In this paper, we propose a method to recommend handbags to each shopper, based on the handbag images the shopper has clicked. This is performed by Joint learning of attribute Projection and One-class SVM classification (JPO) based on the images of the shopper's preferred handbags. More specifically, for the handbag images clicked by each shopper, we project the original image feature space into an attribute space which is more compact. The projection matrix is learned jointly with a one-class SVM to yield a shopper-specific one-class classifier. The results show that the proposed JPO handbag recommendation performs favorably based on initial subject testing.
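A sketch of the classification half only: project clicked-handbag features into a compact attribute space and fit a one-class SVM there. In the paper the projection is learned jointly with the SVM; here it is assumed given (a PCA stand-in) purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

def fit_shopper_model(X_clicked, k=32):
    """Fit a shopper-specific one-class model on projected features.

    X_clicked : (n, d) image features of the shopper's clicked handbags
                (assumes n >= k). PCA stands in for the learned
                attribute projection of the paper.
    """
    proj = PCA(n_components=k).fit(X_clicked)
    clf = OneClassSVM(kernel="rbf", nu=0.1).fit(proj.transform(X_clicked))
    return proj, clf

def recommend(proj, clf, X_candidates, top=10):
    """Rank candidate handbags by the one-class decision score."""
    scores = clf.decision_function(proj.transform(X_candidates))
    return np.argsort(-scores)[:top]
```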
Cited by: 3