Latest publications: 2015 IEEE International Conference on Multimedia and Expo (ICME)

Improved performance of inverse halftoning algorithms via coupled dictionaries
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177457
P. Freitas, Mylène C. Q. Farias, Aleteia P. F. Araujo
Inverse halftoning techniques are known to introduce visible distortions (typically blurring or noise) into the reconstructed image. To reduce the severity of these distortions, we propose a novel training approach for inverse halftoning algorithms. The proposed technique uses a coupled dictionary (CD) to match distorted and original images via a sparse representation, enforcing similarity between the sparse representations of distorted and non-distorted images. Results show that the proposed technique can improve the performance of different inverse halftoning approaches. Images reconstructed with the proposed approach have higher quality, showing less blur, noise, and chromatic aberration.
Citations: 0
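The coupled-dictionary idea above can be sketched in a few lines: sparse-code a distorted patch over one dictionary, then reuse the same code and support with the paired clean dictionary. This is an illustrative sketch only (a greedy OMP-style coder over hypothetical pre-trained dictionaries), not the authors' implementation.

```python
import numpy as np

def reconstruct_patch(y_distorted, D_dist, D_orig, k=3):
    """Greedy (OMP-style) sparse coding of a distorted patch over the
    distorted dictionary; the same sparse code is then applied to the
    coupled clean dictionary to synthesize the restored patch."""
    y = np.asarray(y_distorted, dtype=float)
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D_dist.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares fit of y on the selected atoms
        coeffs, *_ = np.linalg.lstsq(D_dist[:, support], y, rcond=None)
        residual = y - D_dist[:, support] @ coeffs
    # coupled step: reuse the sparse code with the clean dictionary
    return D_orig[:, support] @ coeffs
```

Because the two dictionaries are trained jointly on paired patches, the code found on the distorted side transfers to the clean side.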
Detecting abnormal behaviors in surveillance videos based on fuzzy clustering and multiple Auto-Encoders
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177459
Zhengying Chen, Yonghong Tian, Wei Zeng, Tiejun Huang
In this paper, we present a novel framework to detect abnormal behaviors in surveillance videos using fuzzy clustering and multiple Auto-Encoders (FMAE). As detecting abnormal behaviors is often treated as an unsupervised task, how to describe normal patterns becomes the key point. Considering that there are many types of normal behaviors in daily life, we use fuzzy clustering to roughly divide the training samples into several clusters, so that each cluster stands for one normal pattern. We then deploy multiple Auto-Encoders to estimate these different types of normal behaviors from weighted samples. When testing on an unknown video, our framework predicts whether it contains abnormal behaviors by summarizing the reconstruction cost across the Auto-Encoders. Since surveillance video typically contains substantial redundancy, an Auto-Encoder is well suited to automatically capturing the common structure of normal video sequences and estimating normal patterns. The experimental results show that our approach achieves good performance on three public video analysis datasets and statistically outperforms state-of-the-art approaches in some scenes.
Citations: 23
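The two ingredients of FMAE (soft cluster assignment for weighting samples, and a per-cluster reconstruction cost for scoring) can be illustrated as follows. The membership formula is standard fuzzy c-means; the Auto-Encoders are abstracted as plain callables, since the abstract does not specify an architecture.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0, eps=1e-9):
    """Fuzzy c-means membership of each sample to each cluster centre:
    the soft assignment used to weight training samples per cluster."""
    # pairwise distances, samples x centers
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def anomaly_score(x, autoencoders):
    """Minimum reconstruction cost over the per-cluster models; a high
    score means the sample fits none of the learned normal patterns."""
    return min(float(np.sum((x - ae(x)) ** 2)) for ae in autoencoders)
```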
Facial expression preserving privacy protection using image melding
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177394
Yuta Nakashima, Tatsuya Koyama, N. Yokoya, N. Babaguchi
An enormous number of images are currently shared through social networking services such as Facebook. These images usually contain the appearance of people and may violate their privacy if published without permission from each person. To remedy this privacy concern, visual privacy protection, such as blurring, is applied to the facial regions of people who have not given permission. However, in addition to degrading image quality, this may spoil the context of the image: if some faces are filtered while others are not, the missing facial expressions make the image difficult to comprehend. This paper proposes an image melding-based method that modifies facial regions in a visually unintrusive way while preserving facial expression. Our experimental results demonstrate that the proposed method can retain facial expression while protecting privacy.
Citations: 21
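As a toy stand-in for the compositing step, the sketch below merely blends a donor face patch into the masked facial region, keeping part of the original appearance (and with it some expression cues). The actual method solves a patch-based image melding optimization, which is beyond a short example; the blending rule and all names here are illustrative assumptions.

```python
import numpy as np

def protect_region(image, donor, mask, alpha=0.8):
    """Toy placeholder for the melding step: inside `mask`, pull the
    target region toward a donor face while keeping (1 - alpha) of the
    original appearance. Operates on a single-channel float image."""
    image = np.asarray(image, dtype=float)
    donor = np.asarray(donor, dtype=float)
    w = alpha * np.asarray(mask, dtype=float)  # per-pixel blend weight
    return (1.0 - w) * image + w * donor
```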
An adaptive PEE-based reversible data hiding scheme exploiting referential prediction-errors
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177512
Fei Peng, Xiaolong Li, Bin Yang
Prediction-error expansion (PEE) is an efficient technique for reversible data hiding (RDH). Instead of expanding the highest histogram bins as in conventional PEE, in this paper, to better utilize image redundancy, we propose a new PEE-based RDH scheme with an adaptive expansion strategy utilizing referential prediction-errors. For each pixel, we first calculate its prediction-error and use a neighboring prediction-error as a reference. The correlation between the prediction-error and its reference is exploited to adaptively select bins for expansion embedding. In addition, to further enhance reversible embedding performance, we apply a pixel selection technique such that pixels located in smooth image areas are embedded first. Experimental results show that the proposed scheme outperforms conventional PEE as well as some state-of-the-art RDH works.
Citations: 2
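The classic PEE embedding/extraction rule that this scheme builds on can be written out exactly: expandable prediction-errors carry one bit in their parity, while larger errors are shifted outward so the mapping stays invertible. The sketch below shows plain PEE with a fixed threshold `T`, not the paper's adaptive, reference-driven bin selection.

```python
def pee_embed(pixel, pred, bit, T=1):
    """Classic prediction-error expansion on a single pixel: errors with
    -T <= e < T are expanded to 2e + bit; larger errors are shifted."""
    e = pixel - pred
    if -T <= e < T:
        return pred + 2 * e + bit                     # expansion
    return pred + e + T if e >= T else pred + e - T   # shift

def pee_extract(stego_pixel, pred, T=1):
    """Inverse of pee_embed: returns (original_pixel, bit_or_None)."""
    e2 = stego_pixel - pred
    if -2 * T <= e2 < 2 * T:
        # >> is floor division, recovering e; & 1 recovers the parity bit
        return pred + (e2 >> 1), e2 & 1
    if e2 >= 2 * T:
        return stego_pixel - T, None
    return stego_pixel + T, None
```

The round trip is exact for every pixel, which is what makes the hiding reversible.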
Real-time face detection in Full HD images exploiting both embedded CPU and GPU
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177522
Chanyoung Oh, Saehanseul Yi, Youngmin Yi
CPU-GPU heterogeneous systems have become a mainstream platform in both the server and embedded domains, with ever-increasing demand for powerful accelerators. In this paper, we present parallelization techniques that exploit both the data and task parallelism of an LBP-based face detection algorithm on an embedded heterogeneous platform. By running tasks in a pipelined parallel fashion on the multicore CPU and offloading a data-parallel task to the GPU, we achieve 29 fps for Full HD inputs on the Tegra K1 platform, which integrates a quad-core Cortex-A15 CPU and a CUDA-capable 192-core GPU. This corresponds to a 5.54x speedup over a sequential version and a 1.69x speedup over a GPU-only implementation.
Citations: 20
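The pipelined task parallelism described above can be sketched with one thread per stage connected by FIFO queues; in the paper the stages run on CPU cores, with the data-parallel detection stage offloaded to the GPU via CUDA. The stage functions here are placeholders.

```python
import queue
import threading

def pipeline(frames, stages):
    """Run each stage function in its own thread, streaming items
    downstream through queues; output order matches input order."""
    qs = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(fn, q_in, q_out):
        while True:
            item = q_in.get()
            if item is None:          # poison pill: forward and exit
                q_out.put(None)
                return
            q_out.put(fn(item))

    threads = [threading.Thread(target=worker, args=(fn, qs[i], qs[i + 1]))
               for i, fn in enumerate(stages)]
    for t in threads:
        t.start()
    for f in frames:
        qs[0].put(f)
    qs[0].put(None)
    out = []
    while (item := qs[-1].get()) is not None:
        out.append(item)
    for t in threads:
        t.join()
    return out
```

With stages of roughly equal cost, throughput approaches one frame per stage-latency, which is the effect the paper exploits for real-time rates.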
Machine learning based rate adaptation with elastic feature selection for HTTP-based streaming
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177418
Yu-Lin Chien, K. Lin, Ming-Syan Chen
Dynamic Adaptive Streaming over HTTP (DASH) has become an increasingly popular application. Video rate adaptation is key to determining the video quality of HTTP-based media streaming. Recent works have proposed several algorithms that allow a DASH client to adapt its video encoding rate to network dynamics. While network conditions are typically affected by many different factors, these algorithms usually consider only a few representative pieces of information, e.g., the predicted available bandwidth or the fullness of the playback buffer. In addition, errors in bandwidth estimation can significantly degrade their performance. Therefore, this paper presents Machine Learning-based Adaptive Streaming over HTTP (MLASH), an elastic framework that exploits a wide range of useful network-related features to train a rate classification model. The distinct properties of MLASH are that its machine learning-based framework can be incorporated with any existing adaptation algorithm and that it utilizes big-data characteristics to improve prediction accuracy. We show via trace-based simulations that machine learning-based adaptation can achieve better performance than traditional adaptation algorithms in terms of their target quality-of-experience (QoE) metrics.
Citations: 22
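A minimal stand-in for the rate classification model is sketched below: a nearest-centroid classifier over network features such as estimated bandwidth and buffer level. The feature set and classifier choice are illustrative assumptions; the framework is explicitly designed to accept any classifier in this role.

```python
import numpy as np

class RateClassifier:
    """Nearest-centroid stand-in for a rate classification model:
    maps a network-feature vector to a target bitrate class."""

    def fit(self, X, rates):
        X = np.asarray(X, dtype=float)
        rates = np.asarray(rates)
        self.rates = sorted(set(rates.tolist()))
        # one centroid per bitrate class, averaged over its samples
        self.centroids = np.array(
            [X[rates == r].mean(axis=0) for r in self.rates])
        return self

    def predict(self, features):
        d = np.linalg.norm(
            self.centroids - np.asarray(features, dtype=float), axis=1)
        return self.rates[int(np.argmin(d))]
```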
Refining graph matching using inherent structure information
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177409
Wenzhao Li, Yi-Zhe Song, A. Cavallaro
We present a graph matching refinement framework that improves the performance of a given graph matching algorithm. Our method synergistically uses the inherent structure information embedded globally in the active association graph, and locally on each individual graph. The combination of such information reveals how consistent each candidate match is with its global and local contexts. In doing so, the proposed method removes most false matches and improves precision. The validation on standard benchmark datasets demonstrates the effectiveness of our method.
Citations: 0
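The local-consistency test can be sketched as: keep a candidate match (i, j) only if enough of i's neighbours in the first graph are matched to neighbours of j in the second. This is a simplified assumption about the consistency measure; the paper combines both local and global structure information.

```python
def refine_matches(matches, adj1, adj2, min_support=1):
    """Keep a candidate match (i, j) only if at least `min_support` of
    i's neighbours (adj1) map to neighbours of j (adj2)."""
    m = dict(matches)
    kept = []
    for i, j in matches:
        # count neighbours of i whose match lands in j's neighbourhood
        support = sum(1 for n in adj1.get(i, ())
                      if m.get(n) in adj2.get(j, ()))
        if support >= min_support:
            kept.append((i, j))
    return kept
```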
Coupled dictionary learning and feature mapping for cross-modal retrieval
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177396
Xing Xu, Atsushi Shimada, R. Taniguchi, Li He
In this paper, we investigate the problem of modeling images and associated text for cross-modal retrieval tasks such as text-to-image search and image-to-text search. To make the data from image and text modalities comparable, previous cross-modal retrieval methods directly learn two projection matrices to map the raw features of the two modalities into a common subspace, in which cross-modal data matching can be performed. However, the different feature representations and correlation structures of different modalities inhibit these methods from efficiently modeling the relationships across modalities through a common subspace. To handle the diversities of different modalities, we first leverage the coupled dictionary learning method to generate homogeneous sparse representations for different modalities by associating and jointly updating their dictionaries. We then use a coupled feature mapping scheme to project the derived sparse representations from different modalities into a common subspace in which cross-modal retrieval can be performed. Experiments on a variety of cross-modal retrieval tasks demonstrate that the proposed method outperforms the state-of-the-art approaches.
Citations: 32
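A drastically simplified version of the feature mapping step is shown below: learn a least-squares linear map between paired representations of the two modalities, then retrieve by nearest neighbour in the target space. The real method jointly learns coupled dictionaries and projections into a common subspace; this sketch assumes the sparse codes are already given.

```python
import numpy as np

def learn_mapping(A, B):
    """Least-squares linear map from image-side codes (rows of A) to the
    paired text-side codes (rows of B)."""
    W, *_ = np.linalg.lstsq(np.asarray(A, float),
                            np.asarray(B, float), rcond=None)
    return W

def retrieve(query_code, W, gallery):
    """Index of the gallery item nearest to the projected query."""
    proj = np.asarray(query_code, float) @ W
    d = np.linalg.norm(np.asarray(gallery, float) - proj, axis=1)
    return int(np.argmin(d))
```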
Graph regularized non-negative local coordinate factorization with pairwise constraints for image representation
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177386
Yangcheng He, Hongtao Lu, Bao-Liang Lu
Chen et al. proposed a non-negative local coordinate factorization algorithm for feature extraction (NLCF) [1], which incorporates a local coordinate constraint into non-negative matrix factorization (NMF). However, NLCF is an unsupervised method that makes no use of prior information about the problem at hand. In this paper, we propose a novel graph regularized non-negative local coordinate factorization with pairwise constraints algorithm (PCGNLCF) for image representation. PCGNLCF incorporates pairwise constraints and the graph Laplacian into NLCF. More specifically, we expect data points with pairwise must-link constraints to have coordinates as similar as possible, and data points with pairwise cannot-link constraints to have coordinates as distinct as possible. Experimental results on several real-world applications show the effectiveness of the proposed method in comparison to state-of-the-art algorithms.
Citations: 1
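How pairwise constraints can be folded into the graph regularizer is easy to illustrate: strengthen the affinity of must-link pairs, zero out cannot-link pairs, and build the Laplacian L = D - A, whose quadratic form penalizes coordinate differences along the retained edges. The weighting scheme below is an assumption for illustration, not the paper's exact formulation.

```python
import numpy as np

def constrained_laplacian(A, must_link, cannot_link, w=1.0):
    """Build the constraint-augmented graph Laplacian L = D - A:
    must-link pairs get extra affinity w, cannot-link pairs get zero."""
    A = np.asarray(A, dtype=float).copy()
    for i, j in must_link:
        A[i, j] = A[j, i] = A[i, j] + w
    for i, j in cannot_link:
        A[i, j] = A[j, i] = 0.0
    D = np.diag(A.sum(axis=1))   # degree matrix
    return D - A
```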
Temporally consistent region-based video exposure correction
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177382
Xuan Dong, Lu Yuan, Weixin Li, A. Yuille
We analyze the problem of temporally consistent video exposure correction. Existing methods usually either fail to evaluate the optimal exposure for every region or cannot produce temporally consistent correction results. In addition, contrast is often lost when detail is not preserved properly during correction. In this paper, we use block-based energy minimization to evaluate temporally consistent exposure, considering 1) maximizing the visibility of all content, 2) keeping the relative difference between neighboring regions, and 3) temporally consistent exposure of corresponding content across frames. Then, based on the Weber contrast definition, we propose a contrast-preserving exposure correction method. Experimental results show that our method enables better temporally consistent exposure evaluation and produces contrast-preserving outputs.
Citations: 5
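The contrast-preservation idea can be illustrated with a base/detail decomposition: apply the exposure gain to a smoothed base layer and add the detail layer back unchanged, so local (Weber-like) contrast survives the correction. The box filter and single global gain below are illustrative assumptions, not the paper's block-based energy minimization.

```python
import numpy as np

def correct_exposure(lum, gain, ksize=3):
    """Gain-adjust a box-filtered base layer and re-add the untouched
    detail layer; `lum` is a 2-D luminance image in [0, 1]."""
    pad = ksize // 2
    padded = np.pad(np.asarray(lum, dtype=float), pad, mode='edge')
    h, w = lum.shape
    base = np.zeros((h, w))
    for dy in range(ksize):            # box filter via shifted sums
        for dx in range(ksize):
            base += padded[dy:dy + h, dx:dx + w]
    base /= ksize * ksize
    detail = lum - base                # high-frequency detail layer
    return np.clip(gain * base + detail, 0.0, 1.0)
```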