
Latest publications: 2015 IEEE International Conference on Multimedia and Expo (ICME)

A case for application-managed cache for browser
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177455
Ashok Anand, Mehrdad Reshadi, Bowei Du, Hariharan Kolam, S. Jaiswal, Aditya Akella
Mobile web usage has increased significantly in the last few years, and there has been much emphasis on providing good web page performance for mobile devices. Client-side caching can play a significant role in delivering good web page performance, but unfortunately, traditional browser caches fall short in various respects, leading to sub-optimal performance. More specifically, web applications have no control over caching, e.g., which resources to cache and how to cache them, leading to ineffective cache utilization. Recently, HTML5 has introduced a number of persistent storage APIs that can provide the required control for web applications. We evaluate these HTML5 storage options on various devices and find that they can also meet the performance criteria of caching; in fact, some of the HTML5 storage APIs, e.g., localStorage, can provide even better performance than the browser cache. Based on these insights, we make a case for an application-managed hierarchical client-side cache, called HCache, that leverages these storage options as backends. We propose a novel API that allows web application developers to intelligently control caching behavior while using these storage options transparently. Our experiments with a prototype show that HCache can improve web page performance by up to 60%.
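The HCache API itself is not given in the abstract; as a rough illustration of the idea, here is a minimal Python sketch of an application-managed two-tier cache with pluggable backends, where an in-memory dict and an on-disk store stand in for browser storage options such as localStorage and IndexedDB. All class and method names are hypothetical.

```python
import os

class MemoryBackend:
    """Small, fast tier (stand-in for HTML5 localStorage)."""
    def __init__(self, capacity=128):
        self.capacity, self.store = capacity, {}

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        if len(self.store) >= self.capacity:
            self.store.pop(next(iter(self.store)))  # evict oldest entry
        self.store[key] = value

class DiskBackend:
    """Larger, slower tier (stand-in for IndexedDB or the File API)."""
    def __init__(self, root="hcache"):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key.replace("/", "_"))

    def get(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def put(self, key, value):
        with open(self._path(key), "wb") as f:
            f.write(value)

class HierarchicalCache:
    """Application-managed cache: the app chooses what to cache and in
    which tier, instead of leaving both decisions to the browser."""
    def __init__(self, tiers):
        self.tiers = tiers  # ordered fastest to slowest

    def get(self, key):
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                for faster in self.tiers[:i]:
                    faster.put(key, value)  # promote hot resources
                return value
        return None

    def put(self, key, value, tier=0):
        self.tiers[tier].put(key, value)
```

For example, HierarchicalCache([MemoryBackend(), DiskBackend()]) serves hot resources from the fast tier and falls back to the larger one, mirroring how a hierarchical client-side cache could layer storage backends.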
Citations: 3
Loss concentration based Controlled Delay: An Active Queue Management algorithm for enhanced Quality of Experience for video telephony
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177444
A. Balasubramanian, Liangping Ma, Gregory Sternberg
This paper presents an Active Queue Management (AQM) algorithm for improving the Quality of Experience (QoE) of video telephony over packet-switched networks. The algorithm exploits the characteristics of the video coding structure and builds on the Controlled Delay (Codel) AQM algorithm recently proposed by Nichols and Jacobson to address the prevalent 'bufferbloat' problem in the current Internet. The proposed algorithm, Loss Concentration based controlled Delay (LC-Codel), maintains the low queuing delay that is essential for video telephony, while using loss concentration to improve video QoE. Simulation results show significant gains in QoE with negligible impact on cross traffic.
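The abstract does not spell out LC-Codel's rules, so the sketch below shows the standard CoDel control law (5 ms target, 100 ms interval, drop spacing shrinking as 1/sqrt(count)) plus a hypothetical loss-concentration hook that steers a scheduled drop toward a packet from a video frame that has already lost a packet. The victim-selection rule and the pkt.frame_id attribute are assumptions, not the paper's algorithm.

```python
import math

TARGET_MS = 5.0      # CoDel's default sojourn-time target
INTERVAL_MS = 100.0  # CoDel's default initial interval

class LcCodel:
    """Simplified CoDel drop controller plus a loss-concentration
    victim rule (the victim rule is a guess at the paper's idea)."""

    def __init__(self):
        self.above_since = None    # when sojourn time first exceeded target
        self.drop_count = 0
        self.next_drop_at = None
        self.damaged_frames = set()

    def should_drop(self, now_ms, sojourn_ms):
        """Called on each dequeue with the head packet's queuing delay."""
        if sojourn_ms < TARGET_MS:
            self.above_since, self.drop_count, self.next_drop_at = None, 0, None
            return False
        if self.above_since is None:
            self.above_since = now_ms
            return False
        if now_ms - self.above_since < INTERVAL_MS:
            return False
        if self.next_drop_at is None or now_ms >= self.next_drop_at:
            self.drop_count += 1
            # CoDel control law: drop spacing shrinks as 1/sqrt(count)
            self.next_drop_at = now_ms + INTERVAL_MS / math.sqrt(self.drop_count)
            return True
        return False

    def pick_victim(self, queue):
        """Loss concentration: prefer a packet whose video frame has
        already lost a packet, so losses stay confined to few frames."""
        for i, pkt in enumerate(queue):
            if pkt.frame_id in self.damaged_frames:
                return i
        return 0  # otherwise fall back to CoDel's head-of-queue drop

    def record_drop(self, pkt):
        self.damaged_frames.add(pkt.frame_id)
```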
Citations: 1
Evaluating music recommendation in a real-world setting: On data splitting and evaluation metrics
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177456
Szu-Yu Chou, Yi-Hsuan Yang, Yu-Ching Lin
Evaluation is important for assessing how well a computer system fulfills a certain user need. In the context of recommendation, researchers usually evaluate a recommender system by holding out a random subset of observed ratings and calculating the accuracy of the system in reproducing those ratings. This evaluation strategy, however, does not consider the fact that in a real-world setting we are actually given the observed ratings of the past and have to predict for the future. There might be new songs, which create the cold-start problem, and the users' musical preferences might change over time. Moreover, user satisfaction with a recommender system may be related to factors other than accuracy. In light of these observations, we propose in this paper a novel evaluation framework that uses various time-based data splitting methods and evaluation metrics to assess the performance of recommender systems. Using millions of listening records collected from a commercial music streaming service, we compare the performance of collaborative filtering (CF) and content-based (CB) models with low-level audio features and semantic audio descriptors. Our evaluation shows that the CB model with semantic descriptors obtains a better trade-off among accuracy, novelty, diversity, freshness and popularity, and can nicely deal with the cold-start problem of new songs.
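As a concrete illustration of the contrast the paper draws (not its exact protocol), here is a minimal Python sketch of a chronological split versus the conventional random hold-out:

```python
import random

def time_based_split(records, train_ratio=0.8):
    """Split interaction records chronologically: train on the past,
    test on the future, as a deployed recommender must."""
    records = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]

def random_split(records, train_ratio=0.8, seed=0):
    """Conventional hold-out baseline: ignores time, so test
    interactions may predate training ones, which never happens
    in deployment."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

The chronological variant naturally exposes cold-start items: songs appearing only in the test period have no training interactions, which a random hold-out largely hides.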
Citations: 15
Undersampled face recognition with one-pass dictionary learning
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177451
Chia-Po Wei, Y. Wang
Undersampled face recognition deals with the problem in which, for each subject to be recognized, only one or a few images are available in the gallery (training) set. Thus, it is very difficult to handle large intra-class variations in face images. In this paper, we propose a one-pass dictionary learning algorithm to derive an auxiliary dictionary from external data consisting of image variants of subjects not of interest (not to be recognized). The proposed algorithm not only allows us to efficiently model intra-class variations such as illumination and expression changes, but also exhibits excellent ability to recognize images corrupted by occlusion. In our experiments, we show that our method performs favorably against existing sparse representation or dictionary learning based approaches. Moreover, our computation time is remarkably less than that of recent dictionary learning based face recognition methods. These results verify both the effectiveness and the efficiency of the proposed algorithm.
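The paper's one-pass learning procedure is not reproduced here; the sketch below only illustrates how an auxiliary variation dictionary enters the classification step of a sparse-representation-style recognizer, with ridge regression standing in for the sparse coding and the auxiliary dictionary assumed to be given.

```python
import numpy as np

def classify_with_auxiliary_dict(y, gallery, labels, aux_dict, lam=0.01):
    """Represent probe y over gallery atoms plus external intra-class
    variation atoms, then classify by per-subject residual. Ridge
    regression stands in for the paper's sparse coding."""
    D = np.hstack([gallery, aux_dict])           # combined dictionary
    n = D.shape[1]
    coef = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    n_g = gallery.shape[1]
    x, e = coef[:n_g], coef[n_g:]                # gallery / variation codes
    best_label, best_res = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        xc = np.where(mask, x, 0.0)              # keep only subject c's atoms
        res = np.linalg.norm(y - gallery @ xc - aux_dict @ e)
        if res < best_res:
            best_label, best_res = c, res
    return best_label
```

The key point is that illumination, expression, or occlusion effects are absorbed by the auxiliary atoms (coefficients e), so the gallery needs only one clean image per subject.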
Citations: 7
Image retargeting by combining fast seam carving with neighboring probability (FSc_Neip) and scaling
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177442
Lifang Wu, Lijuan Wang, Shuang Liu, Qingyang Zheng, Y. Jing, Chang Wen Chen, Bo Yan
No single retargeting approach performs well on all images and all target sizes; therefore, hybrid algorithms are often considered promising alternatives. However, most hybrid schemes are time-consuming. In this paper, we propose a fast hybrid framework in which the Fast Content-Aware Image Distance (FCAID) is used to connect fast seam carving with neighboring probability constraints (FSc_Neip) and scaling. FCAID measures the image distance between the resized image produced by FSc_Neip and the original image. This fast technique is embedded within the FSc_Neip framework. Our hybrid scheme is applied locally in strip regions, which makes the retargeting scheme globally non-homogeneous. Experimental results demonstrate that our approach comprehensively outperforms other state-of-the-art techniques in terms of image quality and computational complexity.
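FSc_Neip's neighboring-probability constraint and the FCAID metric are specific to the paper and not reconstructed here; for context, the following is the classic dynamic-programming step that any seam-carving variant builds on, finding one minimum-energy vertical seam in an energy map.

```python
import numpy as np

def find_vertical_seam(energy):
    """Classic seam-carving step: dynamic programming over an energy
    map, returning the column index of the minimum-energy seam per row."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for i in range(1, h):
        left = np.r_[np.inf, cost[i - 1, :-1]]
        up = cost[i - 1]
        right = np.r_[cost[i - 1, 1:], np.inf]
        cost[i] += np.minimum(np.minimum(left, up), right)
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))          # cheapest bottom cell
    for i in range(h - 2, -1, -1):               # backtrack upwards
        j = seam[i + 1]
        lo, hi = max(0, j - 1), min(w, j + 2)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam
```

Removing one pixel per row along the returned seam narrows the image by one column; a hybrid scheme like the paper's alternates such removals with plain scaling depending on the measured distortion.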
Citations: 1
Seeing through the appearance: Body shape estimation using multi-view clothing images
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177402
Wei-Yi Chang, Y. Wang
We propose a learning-based algorithm for body shape estimation that only requires 2D clothing images taken from multiple views as input. Although our setting is more user-friendly than the use of 3D scanners or depth cameras, it also makes the learning and estimation problems more challenging. In addition to utilizing ground-truth body images for constructing human body models at each view of interest, our work uniquely associates anthropometric measurements (e.g., body height or leg length) across different views. To perform body shape estimation from multi-view clothing images, the proposed algorithm solves an optimization task that recovers the body shape with image and measurement reconstruction guarantees. In the experiments, we show that our proposed method achieves satisfactory estimation results and performs favorably against single-view and other baseline approaches for both body shape and measurement estimation.
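The paper's optimization with image- and measurement-reconstruction terms is not detailed in the abstract; as a loose stand-in, here is a hedged Python sketch that regresses anthropometric measurements from concatenated multi-view image features with ridge regression. All shapes and names are assumptions for illustration only.

```python
import numpy as np

def train_measurement_regressor(view_features, measurements, lam=1.0):
    """Map concatenated multi-view image features to anthropometric
    measurements via ridge regression (a stand-in for the paper's
    optimization). view_features: list of per-view matrices, one row
    per training subject; measurements: (subjects, n_measurements)."""
    X = np.hstack(view_features)                 # (subjects, total features)
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ measurements)

def predict_measurements(W, subject_view_features):
    """Predict measurements (e.g., body height, leg length) for one subject."""
    return np.hstack(subject_view_features) @ W
```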
Citations: 7
Beyond Bag-of-Words: Fast video classification with Fisher Kernel Vector of Locally Aggregated Descriptors
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177489
Ionut Mironica, Ionut Cosmin Duta, B. Ionescu, N. Sebe
In this paper we introduce a new video description framework that replaces the traditional Bag-of-Words with a combination of Fisher Kernels (FK) and the Vector of Locally Aggregated Descriptors (VLAD). The main contributions are: (i) a fast algorithm to densely extract global frame features, which are easier and faster to compute than spatio-temporal local features; (ii) replacing the traditional k-means based vocabulary with a Random Forest approach that allows a significant speedup; and (iii) a modified VLAD and FK representation that replaces the classic Bag-of-Words and obtains better performance. We show that our framework is highly general and does not depend on a particular type of descriptor. It achieves state-of-the-art results in several classification scenarios.
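The paper's modified representation uses a Random Forest vocabulary; the sketch below shows only the standard VLAD aggregation step with given k-means-style centroids (including the usual power and L2 normalization), to make the representation concrete. It is a baseline illustration, not the paper's variant.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """Vector of Locally Aggregated Descriptors: assign each local
    descriptor to its nearest centroid, accumulate residuals per
    centroid, then power- and L2-normalize the flattened vector."""
    k, d = centroids.shape
    # nearest-centroid assignment
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)
    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assign == i]
        if len(members):
            v[i] = (members - centroids[i]).sum(axis=0)
    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))          # power normalization
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

The resulting k*d-dimensional vector is what a linear classifier consumes; swapping the nearest-centroid assignment for a forest of decision trees is what buys the paper its speedup.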
Citations: 7
Keypoint encoding and transmission for improved feature extraction from compressed images
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177388
Jianshu Chao, E. Steinbach, Lexing Xie
In many mobile visual analysis scenarios, compressed images are transmitted over a communication network for analysis at a server. Often, the processing at the server includes some form of feature extraction and matching. Image compression has been shown to have an adverse effect on feature matching performance. To address this issue, we propose to signal the feature keypoints as side information to the server and to extract only the feature descriptors from the compressed images. To this end, we propose an approach to efficiently encode the locations, scales, and orientations of keypoints extracted from the original image. Furthermore, we propose a new approach for selecting relevant yet fragile keypoints as side information for the image, thus further reducing the data volume. We evaluate the performance of our approach using the Stanford mobile augmented reality dataset. Results show that feature matching performance is significantly improved for images at low bitrates.
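The paper's actual coding scheme is not given in the abstract; the following sketch only shows the basic idea of quantizing a keypoint's location, scale, and orientation into a compact fixed-length code. The bit allocations are illustrative assumptions, not the paper's.

```python
def encode_keypoint(x, y, scale, angle_deg, img_w, img_h):
    """Quantize a keypoint's location, scale and orientation into a
    fixed-length 5-byte code (bit allocations are illustrative)."""
    qx = min(int(x / img_w * 1024), 1023)            # 10 bits: x position
    qy = min(int(y / img_h * 1024), 1023)            # 10 bits: y position
    qs = min(int(scale * 4), 255)                    # 8 bits: scale, step 0.25
    qa = int((angle_deg % 360.0) / 360.0 * 64) % 64  # 6 bits: orientation
    code = (qx << 24) | (qy << 14) | (qs << 6) | qa  # 34 bits total
    return code.to_bytes(5, "big")

def decode_keypoint(blob, img_w, img_h):
    code = int.from_bytes(blob, "big")
    qx, qy = (code >> 24) & 1023, (code >> 14) & 1023
    qs, qa = (code >> 6) & 255, code & 63
    return ((qx + 0.5) / 1024 * img_w,   # x (center of quantization bin)
            (qy + 0.5) / 1024 * img_h,   # y
            qs / 4.0,                    # scale
            qa / 64.0 * 360.0)           # orientation in degrees
```

At 5 bytes per keypoint, even several hundred keypoints add only a few kilobytes of side information, which is the trade-off the paper exploits against recomputing keypoints from the degraded compressed image.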
Citations: 4
A method to compute saliency regions in 3D video based on fusion of feature maps
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177474
Lino Ferreira, L. Cruz, P. Assunção
Efficient computation of visually salient regions has been an active research problem in recent years, but for 3D content no definitive solutions exist. This paper presents a computational method to determine salient regions in 3D video, based on the fusion of three feature maps containing perceptually relevant information from the spatial, temporal and depth dimensions. The proposed method follows a bottom-up approach to predict the 3D regions where observers tend to hold their gaze for longer periods. Fusion of the feature maps is combined with a center-bias weighting function to determine the 3D visual saliency map. For validation and performance evaluation, a publicly available database of 3D video sequences and corresponding fixation density maps was used as ground truth. The experimental results show that the proposed method achieves better performance than other state-of-the-art models.
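The paper's specific feature extraction and fusion weights are not reproduced here; below is a minimal Python sketch of the final stage only, normalizing and combining the three maps and applying a Gaussian center-bias weighting. The equal weights and sigma value are illustrative assumptions.

```python
import numpy as np

def fuse_saliency(spatial, temporal, depth,
                  weights=(1/3, 1/3, 1/3), sigma=0.3):
    """Fuse normalized spatial, temporal and depth feature maps and
    apply a Gaussian center-bias weighting (weights/sigma illustrative)."""
    def norm(m):
        m = m.astype(float)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    fused = sum(w * norm(m)
                for w, m in zip(weights, (spatial, temporal, depth)))
    h, w = fused.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # distance from the image center, normalized per axis to [-0.5, 0.5]
    dy, dx = (ys - h / 2) / h, (xs - w / 2) / w
    center_bias = np.exp(-(dx ** 2 + dy ** 2) / (2 * sigma ** 2))
    return norm(fused * center_bias)
```

The center bias reflects the well-documented tendency of observers to fixate near the screen center, which is why fixation-density ground truth rewards it.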
Citations: 10
Active crosstalk reduction system for multiview autostereoscopic displays
Pub Date: 2015-06-01 DOI: 10.1109/ICME.2015.7177519
Philippe Hanhart, C. D. Nolfo, T. Ebrahimi
Multiview autostereoscopic displays are considered the future of 3DTV. However, these displays suffer from a high level of crosstalk, which negatively impacts quality of experience (QoE). In this paper, we propose a system to improve 3D QoE on multiview autostereoscopic displays. First, the display is characterized in terms of its luminance distribution. Then, the luminance profiles are modeled using a limited set of parameters. A Kinect sensor is used to determine the viewer's position in front of the display. Finally, the proposed system performs an intelligent on-the-fly allocation of the output views to minimize perceived crosstalk. User preference among the 2D mode, the standard 3D mode, and the proposed system is evaluated. Results show that picture quality is significantly improved compared to the standard 3D mode, for similar depth perception and visual comfort.
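The paper's luminance-profile model and allocation policy are not given in full in the abstract. The sketch below assumes each display view's luminance profile is available as a callable of horizontal position, assigns the stereo pair to the views that dominate at the tracked eye positions, and lets the remaining views repeat the nearer eye's image so that leakage shows similar content. This is one plausible reading, not the paper's exact algorithm.

```python
import numpy as np

def allocate_views(profiles, x_left_eye, x_right_eye, stereo_pair):
    """Assign the stereo pair to the display views that dominate at the
    tracked eye positions; other views repeat the nearer eye's image so
    that any leakage shows similar content (less perceived crosstalk).
    profiles: list of callables, profiles[v](x) = luminance of view v
    at horizontal position x."""
    left_img, right_img = stereo_pair
    lum_left = np.array([p(x_left_eye) for p in profiles])
    lum_right = np.array([p(x_right_eye) for p in profiles])
    allocation = {}
    for v in range(len(profiles)):
        # default: repeat the image of whichever eye sees view v more
        allocation[v] = left_img if lum_left[v] >= lum_right[v] else right_img
    allocation[int(lum_left.argmax())] = left_img    # view dominating left eye
    allocation[int(lum_right.argmax())] = right_img  # view dominating right eye
    return allocation
```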
Citations: 5