
Latest publications from the 2015 IEEE International Conference on Multimedia and Expo (ICME)

Effectively compressing Near-Duplicate Videos in a joint way
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177385
Hanli Wang, Ming Ma, Tao Tian
With the increasing popularity of social networks, more and more people store and transmit information in visual formats such as images and video. However, this convenience comes at a cost: it places a heavy load on traditional video servers and exposes them to the risk of overloading. Among the huge volume of online videos there are a large number of Near-Duplicate Videos (NDVs). Although many methods have been proposed to detect NDVs, little research has investigated compressing them more effectively than independent compression. In this work, we exploit the data redundancy among NDVs and propose a video coding method to compress them jointly. To make the proposed coding method applicable, a number of pre-processing functions are designed to exploit the correlation of visual information among NDVs and to satisfy the video coding requirements. Experimental results verify that the proposed method effectively compresses NDVs and thus saves video data storage.
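The core idea of joint NDV compression, predicting one near-duplicate from another and storing only a quantized residual instead of coding each video independently, can be sketched as follows. This is a toy illustration of residual coding on one row of pixels, not the authors' actual codec; the frame data and quantization step are made up.

```python
def residual_code(reference, duplicate, q=8):
    """Keep only a coarsely quantized residual between a reference
    frame and its near-duplicate (toy sketch, not the paper's codec)."""
    return [((d - r) // q) * q for r, d in zip(reference, duplicate)]

def residual_decode(reference, code):
    """Reconstruct the duplicate by adding the residual back onto the
    reference and clamping to the valid 8-bit pixel range."""
    return [max(0, min(255, r + c)) for r, c in zip(reference, code)]

# Two "near-duplicate" rows of pixels: same content, slight brightness shift.
ref = [100, 100, 100, 100]
dup = [103, 103, 103, 103]

code = residual_code(ref, dup)
recon = residual_decode(ref, code)
err = max(abs(a - b) for a, b in zip(recon, dup))  # bounded by the step q
```

Because the residual is tiny for near-duplicates, it compresses far better than the raw frame, which is where the storage saving comes from.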
Citations: 5
A flexible platform for QoE-driven delivery of image-rich web applications
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177516
P. Ahammad, R. Gaunker, B. Kennedy, Mehrdad Reshadi, K. Kumar, A. K. Pathan, Hariharan Kolam
The advent of content-rich modern web applications, unreliable network connectivity and device heterogeneity demands flexible web content delivery platforms that can handle high variability along many dimensions, especially for the mobile web. Images account for more than 60% of the content delivered by present-day webpages and strongly influence perceived webpage latency and end-user experience. We present a flexible web delivery platform with a client-cloud architecture and content-aware optimizations to address the problem of delivering image-rich web applications. Our solution makes use of quantitative measures of image perceptual quality, machine learning algorithms, partial caching and opportunistic client-side choices to deliver images on the web efficiently. Using data from the WWW, we experimentally demonstrate that our approach yields significant improvement on various web performance criteria that are critical for maintaining a desirable end-user quality-of-experience (QoE) for image-rich web applications.
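One building block of QoE-driven delivery is choosing, per image, the smallest encoded variant whose perceptual-quality score still clears a threshold. The sketch below assumes hypothetical variant names, byte sizes and SSIM-like quality scores; it is not the paper's platform, just the selection logic.

```python
def pick_variant(variants, min_quality):
    """Return the smallest variant whose perceptual quality meets the
    threshold; fall back to the best-quality variant if none does."""
    ok = [v for v in variants if v["quality"] >= min_quality]
    pool = ok if ok else [max(variants, key=lambda v: v["quality"])]
    return min(pool, key=lambda v: v["bytes"])

# Hypothetical encodings of one hero image at different JPEG qualities.
variants = [
    {"name": "hero_q90.jpg", "bytes": 410_000, "quality": 0.98},
    {"name": "hero_q70.jpg", "bytes": 180_000, "quality": 0.95},
    {"name": "hero_q40.jpg", "bytes": 70_000,  "quality": 0.86},
]
choice = pick_variant(variants, min_quality=0.9)
```

A real system would learn the threshold per device and network condition rather than fixing it, which is where the paper's machine-learning component comes in.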
Citations: 7
The effect of non-linear structures on the usage of hypervideo for physical training
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177378
Katrin Tonndorf, Christian Handschigl, Julian Windscheid, H. Kosch, M. Granitzer
The growing number of elderly people, combined with financial cuts in the health-care sector, leads to an increased demand for computer-supported medical services. New standards like HTML5 allow the creation of hypervideo training applications that run on a variety of end-user devices. In this paper, we evaluate an HTML5 player running an e-health hypervideo that supports pelvic-floor exercises. In an experimental test setting we compared the hypervideo to a primarily linear version with respect to usability and its use for self-controlled training. Our results show that the hypervideo version leads to slightly more usability problems but facilitates more active and individualized training.
Citations: 5
Affect-expressive hand gestures synthesis and animation
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177478
E. Bozkurt, E. Erzin, Y. Yemez
Speech and hand gestures form a composite communicative signal that boosts the naturalness and affectiveness of communication. We present a multimodal framework for the joint analysis of continuous affect, speech prosody and hand gestures, aimed at the automatic synthesis of realistic hand gestures from spontaneous speech using hidden semi-Markov models (HSMMs). To the best of our knowledge, this is the first attempt to synthesize hand gestures using a continuous dimensional affect space, i.e., activation, valence, and dominance. We model the relationships between acoustic features describing speech prosody and hand gestures, with and without the continuous affect information, in speaker-independent configurations, and evaluate the multimodal analysis framework by generating hand-gesture animations and via objective evaluations. Our experimental studies are promising, conveying the role of affect in modeling the dynamics of the speech-gesture relationship.
Citations: 16
Fusion of Time-of-Flight and Phase Shifting for high-resolution and low-latency depth sensing
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177426
Yueyi Zhang, Zhiwei Xiong, Feng Wu
Depth sensors based on Time-of-Flight (ToF) and Phase Shifting (PS) have complementary strengths and weaknesses. ToF provides real-time depth but is limited in resolution and sensitive to noise. PS generates accurate and robust high-resolution depth but requires a number of patterns, which leads to high latency. In this paper, we propose a novel fusion framework that takes advantage of both ToF and PS. The basic idea is to use the coarse depth from ToF to disambiguate the wrapped depth from PS. Specifically, we address two key technical problems: cross-modal calibration and interference-free synchronization between the ToF and PS sensors. Experiments demonstrate that the proposed method generates accurate and robust depth with high resolution and low latency, which benefits numerous applications.
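The disambiguation step described above has a compact form: PS measures depth only modulo an ambiguity period, and the coarse ToF depth selects which period the true depth lies in. A minimal sketch of that idea, with made-up numbers (a 0.5 m ambiguity period and example measurements), not the paper's calibrated pipeline:

```python
def unwrap_depth(wrapped, coarse_tof, period):
    """Resolve the PS ambiguity: pick the integer number of periods k
    that places the precise-but-wrapped PS depth closest to the
    coarse absolute ToF depth."""
    k = round((coarse_tof - wrapped) / period)
    return wrapped + k * period

# PS reads 0.12 m modulo a 0.5 m period; noisy ToF says roughly 1.65 m.
d = unwrap_depth(wrapped=0.12, coarse_tof=1.65, period=0.5)
```

The result keeps the fine precision of PS (0.12 m within the period) while inheriting the absolute scale from ToF (here, three full periods, so 1.62 m), which is why the fusion can be both accurate and low-latency.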
Citations: 9
Energy and area efficient hardware implementation of 4K Main-10 HEVC decoder in Ultra-HD Blu-ray player and TV systems
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177399
Tsu-Ming Liu, Yung-Chang Chang, Chih-Ming Wang, Hue-Min Lin, Chia-Yun Cheng, Chun-Chia Chen, Min-Hao Chiu, Sheng-Jen Wang, P. Chao, Meng-Jye Hu, Fu-Chun Yeh, Shun-Hsiang Chuang, Hsiu-Yi Lin, Ming-Long Wu, Che-Hong Chen, Chia-Lin Ho, Chi-Cheng Ju
A 4K Main-10 HEVC video decoder LSI is fabricated in a 28nm CMOS process. It adopts a block-concealed processor (BcP) to improve visual quality, and a newly designed bandwidth-suppressed processor (BsP) reduces external data accesses by 30% and 45% in playback and gaming scenarios, respectively. It features a fully core scalable (FCS) architecture, which lowers the required working frequency by 65%. A 10-bit compact scheme is proposed that reduces the frame buffer space by 37.5%. Moreover, a multi-standard architecture reduces area by 28%. The decoder achieves a throughput of 530 Mpixels/s, twice that of the state-of-the-art HEVC design [2], at an energy efficiency of 0.2 nJ/pixel, enabling real-time 4K video playback in Ultra-HD Blu-ray player and TV systems.
Citations: 3
Joint Latent Dirichlet Allocation for non-iid social tags
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177490
Jiangchao Yao, Ya Zhang, Zhe Xu, Jun-wei Sun, Jun Zhou, Xiao Gu
Topic models have been widely used to analyze text corpora and have achieved great success in applications including content organization and information retrieval. However, unlike traditional text data, social tags on the web are usually sparse, unordered, and non-iid, i.e., highly dependent on contextual information such as users and objects. Considering these specific characteristics of social tags, we introduce a new model named Joint Latent Dirichlet Allocation (JLDA) to capture the relationships among users, objects, and tags. The model assumes that the latent topics of users and those of objects jointly influence the generation of tags. The latent distributions are then inferred with Gibbs sampling. Experiments on two social-tag data sets demonstrate that the model achieves a lower predictive error and generates more reasonable topics. We also present an interesting application of the model to object recommendation.
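The inference machinery referenced here, collapsed Gibbs sampling, can be shown for plain LDA, the building block that JLDA extends with user and object topic chains. The sketch below is a minimal single-file sampler over a toy vocabulary; it is not the paper's joint sampler, and all data is made up.

```python
import random

random.seed(0)

def gibbs_lda(docs, V, K, alpha=0.1, beta=0.01, iters=200):
    """Collapsed Gibbs sampling for plain LDA: repeatedly resample each
    token's topic from its full conditional given all other assignments."""
    z = [[random.randrange(K) for _ in d] for d in docs]
    ndk = [[0] * K for _ in docs]        # document-topic counts
    nkw = [[0] * V for _ in range(K)]    # topic-word counts
    nk = [0] * K                         # topic totals
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]; ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]
                ndk[d][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
                # full conditional p(z = k | everything else)
                p = [(ndk[d][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + V * beta)
                     for k in range(K)]
                r = random.uniform(0, sum(p)); acc = 0.0
                for k, pk in enumerate(p):
                    acc += pk
                    if r <= acc:
                        t = k
                        break
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1
    return z, ndk

# Two tiny "tag documents" over a 4-word vocabulary.
docs = [[0, 0, 1, 1], [2, 2, 3, 3]]
z, ndk = gibbs_lda(docs, V=4, K=2)
```

JLDA's twist is that the conditional for a tag's topic would additionally condition on the latent topics of the user and the tagged object, reflecting the non-iid structure of social tags.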
Citations: 1
Towards GPU HEVC intra decoding: Seizing fine-grain parallelism
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177515
D. Souza, A. Ilic, N. Roma, L. Sousa
To satisfy the growing demand for real-time video decoding at high frame resolutions, novel GPU parallel algorithms are proposed herein for fully compliant HEVC de-quantization, inverse transform and intra prediction. The proposed algorithms are designed to fully exploit the fine-grain parallelism within these computationally demanding and highly data-dependent modules. Moreover, the proposed approaches allow efficient utilization of GPU computational resources while carefully managing data accesses in the complex GPU memory hierarchy. The experimental results show that real-time processing is achieved for all tested sequences even at the most demanding QP, with average frame rates of 118.6, 89.2 and 49.7 fps for Full HD, 2160p and Ultra HD 4K sequences, respectively.
Citations: 9
Compression of photo collections using geometrical information
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177379
S. Milani, P. Zanuttigh
This paper proposes a novel scheme for the joint compression of photo collections framing the same object or scene. The proposed approach starts by locating corresponding features in the various images and then exploits a Structure-from-Motion algorithm to estimate the geometric relationships between the various images and their viewpoints. It then uses 3D information and warping to predict the images from one another. Furthermore, graph algorithms are used to compute minimum-weight topologies and identify the ordering of the input images that maximizes prediction efficiency. The resulting data are fed to a modified HEVC coder for compression. Experimental results show that the proposed scheme outperforms competing solutions and can be efficiently employed for storing large image collections in the virtual exploration of architectural landmarks or on photo-sharing websites.
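The graph step, picking which image predicts which, amounts to building a minimum-weight spanning tree over pairwise prediction costs. A small sketch using Prim's algorithm on a made-up symmetric cost matrix (the paper's actual cost measure and tree construction may differ):

```python
def prediction_tree(cost):
    """Prim's algorithm over a symmetric matrix of pairwise prediction
    costs: returns (child, parent) edges of a minimum-weight spanning
    tree rooted at image 0, so each image is predicted from its parent."""
    n = len(cost)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # cheapest edge from any tree node to any node still outside
        child, parent = min(
            ((j, i) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: cost[e[1]][e[0]],
        )
        edges.append((child, parent))
        in_tree.add(child)
    return edges

# 3 photos: image 1 is cheap to predict from 0, image 2 from 1.
cost = [[0, 2, 9],
        [2, 0, 1],
        [9, 1, 0]]
edges = prediction_tree(cost)
```

Encoding then proceeds parent-first along the tree, so every image except the root is warped and predicted from an already decoded neighbor.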
Citations: 10
Creative design of color palettes for product packaging
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177443
Ying Li, Anshul Sheopuri
This paper describes our latest work on assisting CPG (Consumer Packaged Goods) companies with their product packaging designs by providing color palettes that are visually appealing, novel and consistent with the desired marketing messages for a particular brand and product. Specifically, we start by mining a large collection of images of different products and brands to learn the colors and color combinations that frequently appear among them. Meanwhile, a color-message graph is constructed to represent the messages conveyed by different colors and to capture the interrelationships among them. Knowledge from both color psychology and information sources such as Thesaurus is extensively exploited here. Given a particular product and brand for which packaging is to be designed, along with the company's desired marketing message, we apply a computational method to generate quintillions of novel color palettes that can be used for the design. This process leverages existing palettes used by the same products of different brands or different products of the same brand, takes in optional color preferences from users, and identifies and utilizes the right colors to convey the desired marketing message. Finally, we rank the palettes based on an assessment of their visual aesthetics, their novelty and the way the different messages of the same palette interact with each other, so as to guide human designers in choosing the right ones. Our initial demonstrations of this work to subject-matter colleagues have received very positive feedback. We are now exploring opportunities to collaborate with them to validate this technology in a controlled experimental setting.
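One ingredient of the ranking, novelty, can be illustrated as distance from the mined corpus of existing palettes. The sketch below scores a candidate by its minimum per-channel RGB distance to any known palette; this toy metric and all the palettes are made up, and the paper additionally scores aesthetics and message consistency.

```python
def palette_novelty(candidate, existing):
    """Minimum summed per-channel RGB distance from the candidate
    palette to any existing palette: higher means more novel."""
    def dist(p, q):
        # compare palettes color-by-color, channel-by-channel
        return sum(abs(a - b) for c1, c2 in zip(p, q) for a, b in zip(c1, c2))
    return min(dist(candidate, e) for e in existing)

# Hypothetical mined palettes (two colors each) and two candidates.
existing = [[(255, 0, 0), (255, 255, 255)],
            [(0, 0, 255), (200, 200, 200)]]
fresh = [(0, 128, 0), (240, 230, 140)]    # unlike anything mined
close = [(250, 5, 5), (250, 250, 250)]    # near the first palette
```

A candidate that nearly duplicates a mined palette scores low, so ranking by this term pushes genuinely new color combinations toward the designer.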
Citations: 1