
2015 IEEE International Conference on Multimedia and Expo (ICME): Latest Papers

Effectively compressing Near-Duplicate Videos in a joint way
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177385
Hanli Wang, Ming Ma, Tao Tian
With the increasing popularity of social networks, more and more people tend to store and transmit information in visual formats such as images and video. However, this convenience comes at a cost: it puts heavy pressure on traditional video servers and exposes them to the risk of overloading. Among the huge number of online videos, there are quite a number of Near-Duplicate Videos (NDVs). Although many methods have been proposed to detect NDVs, little research has investigated compressing these NDVs more effectively than independent compression. In this work, we utilize the data redundancy of NDVs and propose a video coding method to jointly compress NDVs. In order to employ the proposed video coding method, a number of pre-processing functions are designed to explore the correlation of visual information among NDVs and to suit the video coding requirements. Experimental results verify that the proposed video coding method is able to effectively compress NDVs and thus save video data storage.
Citations: 5
A flexible platform for QoE-driven delivery of image-rich web applications
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177516
P. Ahammad, R. Gaunker, B. Kennedy, Mehrdad Reshadi, K. Kumar, A. K. Pathan, Hariharan Kolam
The advent of content-rich modern web applications, unreliable network connectivity and device heterogeneity demand flexible web content delivery platforms that can handle high variability along many dimensions, especially for the mobile web. Images account for more than 60% of the content delivered by present-day webpages and have a strong influence on the perceived webpage latency and end-user experience. We present a flexible web delivery platform with a client-cloud architecture and content-aware optimizations to address the problem of delivering image-rich web applications. Our solution makes use of quantitative measures of image perceptual quality, machine learning algorithms, partial caching and opportunistic client-side choices to efficiently deliver images on the web. Using data from the WWW, we experimentally demonstrate that our approach shows significant improvement on various web performance criteria that are critical for maintaining a desirable end-user quality-of-experience (QoE) for image-rich web applications.
Citations: 7
The effect of non-linear structures on the usage of hypervideo for physical training
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177378
Katrin Tonndorf, Christian Handschigl, Julian Windscheid, H. Kosch, M. Granitzer
The growing number of elderly people, combined with financial cuts in the health care sector, leads to an increased demand for computer-supported medical services. New standards like HTML5 allow the creation of hypervideo training applications that run on a variety of end-user devices. In this paper, we evaluate an HTML5 player running an e-health hypervideo for the support of pelvic floor exercises. In an experimental test setting, we compared the hypervideo to a primarily linear version regarding usability and utilization for self-controlled training. Our results show that the hypervideo version leads to slightly more usability problems but facilitates more active and individual training.
Citations: 5
Affect-expressive hand gestures synthesis and animation
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177478
E. Bozkurt, E. Erzin, Y. Yemez
Speech and hand gestures form a composite communicative signal that boosts the naturalness and affectiveness of communication. We present a multimodal framework for joint analysis of continuous affect, speech prosody and hand gestures, aimed at automatic synthesis of realistic hand gestures from spontaneous speech using hidden semi-Markov models (HSMMs). To the best of our knowledge, this is the first attempt at synthesizing hand gestures using a continuous dimensional affect space, i.e., activation, valence, and dominance. We model relationships between acoustic features describing speech prosody and hand gestures, with and without the continuous affect information, in speaker-independent configurations, and evaluate the multimodal analysis framework by generating hand gesture animations as well as via objective evaluations. Our experimental studies are promising, conveying the role of affect in modeling the dynamics of the speech-gesture relationship.
Citations: 16
A novel method on optimal bit allocation at LCU level for rate control in HEVC
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177445
Shengxi Li, Mai Xu, Zulin Wang
In this paper, we propose a new method, namely the recursive Taylor expansion (RTE) method, for optimally allocating bits to each LCU in the R-λ rate control scheme for HEVC. Specifically, we first set up an optimization formulation for optimal bit allocation. Unfortunately, it is intractable to achieve a closed-form solution for this formulation. We therefore propose an RTE solution to iteratively solve the formulation with a fast convergence speed. Then, an approximate closed-form solution can be obtained. This way, the optimal bit allocation can be achieved at little encoding complexity cost. Finally, the experimental results validate the effectiveness of our method in three aspects: compressed distortion, bit-rate control error, and bit fluctuation.
Citations: 12
Towards GPU HEVC intra decoding: Seizing fine-grain parallelism
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177515
D. Souza, A. Ilic, N. Roma, L. Sousa
To satisfy the growing demands on real-time video decoders for high frame resolutions, novel GPU parallel algorithms are proposed herein for fully compliant HEVC de-quantization, inverse transform and intra prediction. The proposed algorithms are designed to fully exploit and leverage the fine grain parallelism within these computationally demanding and highly data dependent modules. Moreover, the proposed approaches allow the efficient utilization of the GPU computational resources, while carefully managing the data accesses in the complex GPU memory hierarchy. The experimental results show that the real-time processing is achieved for all tested sequences and the most demanding QP, while delivering average fps of 118.6, 89.2 and 49.7 for Full HD, 2160p and Ultra HD 4K sequences, respectively.
Citations: 9
Creative design of color palettes for product packaging
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177443
Ying Li, Anshul Sheopuri
This paper describes our latest work on assisting CPG (Consumer Packaged Goods) companies with their product packaging designs by providing color palettes that are visually appealing, novel and consistent with desired marketing messages for a particular brand and product. Specifically, we start by mining a large collection of images of different products and brands to learn about all the colors and color combinations that frequently appear among them. Meanwhile, a color-message graph is constructed to represent messages conveyed by different colors as well as to capture the interrelationship among them. Knowledge from both color psychology and information sources like Thesaurus is extensively exploited in this case. Now, given a particular product and brand whose packaging is to be designed, along with the company's desired marketing message, we apply a computational method to generate quintillions of novel color palettes that can be used for the design. This process will leverage existing palettes used by the same products of different brands or different products of the same brand, take in optional color preferences from users, and identify and then utilize the right colors to convey the desired marketing message. Finally, we rank the palettes based on an assessment of their visual aesthetics, novelty and the way that different messages of the same palette interact with each other, so as to guide human designers to choose the right ones. Our initial demonstrations of this work to colleagues with subject-matter expertise have received very positive feedback. We are now exploring opportunities to collaborate with them to validate this technology in a controlled experimental setting.
Citations: 1
Content-based music recommendation using underlying music preference structure
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177504
M. Soleymani, Anna Aljanaki, F. Wiering, R. Veltkamp
The cold start problem for new users or items is a great challenge for recommender systems. New items can be positioned within the existing items using a similarity metric to estimate their ratings. However, the calculation of similarity varies by domain and available resources. In this paper, we propose a content-based music recommender system which is based on a set of attributes derived from psychological studies of music preference. These five attributes, namely, Mellow, Unpretentious, Sophisticated, Intense and Contemporary (MUSIC), better describe the underlying factors of music preference compared to music genre. Using 249 songs and hundreds of ratings and attribute scores, we first develop an acoustic content-based attribute detection using auditory modulation features and a regression by sparse representation. We then use the estimated attributes in a cold start recommendation scenario. The proposed content-based recommendation significantly outperforms genre-based and user-based recommendation based on the root-mean-square error. The results demonstrate the effectiveness of these attributes in music preference estimation. Such methods will increase the chance of less popular but interesting songs in the long tail to be listened to.
Citations: 35
Video sharpness prediction based on motion blur analysis
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177424
Jongyoo Kim, Junghwan Kim, Woojae Kim, Jisoo Lee, Sanghoon Lee
For high bit rate video, it is important to acquire the video contents with high resolution, the quality of which may be degraded due to the motion blur from the movement of an object(s) or the camera. However, conventional sharpness assessments are designed to find focal blur caused either by defocusing or by compression distortion targeted for low bit rates. To overcome this limitation, we present a no-reference framework of a visual sharpness assessment (VSA) for high-resolution video based on the motion and scene classification. In the proposed framework, the accuracy of the sharpness estimation can be improved via pooling weighted by the visual perception from the object and camera movements and by the strong influence from the region with the highest sharpness. Based on the motion blur characteristics, the variance and the contrast over the spectral domain are used to quantify the perceived sharpness. Moreover, for the VSA, we extract the highly influential sharper regions and emphasize them by utilizing the scene adaptive pooling.
Citations: 1
Compression of photo collections using geometrical information
Pub Date : 2015-08-06 DOI: 10.1109/ICME.2015.7177379
S. Milani, P. Zanuttigh
This paper proposes a novel scheme for the joint compression of photo collections framing the same object or scene. The proposed approach starts by locating corresponding features in the various images and then exploits a Structure from Motion algorithm to estimate the geometric relationships between the various images and their viewpoints. Then it uses 3D information and warping to predict images one from the other. Furthermore, graph algorithms are used to compute minimum weight topologies and identify the ordering of the input images that maximizes the efficiency of prediction. The obtained data is fed to a modified HEVC coder to perform the compression. Experimental results show that the proposed scheme outperforms competing solutions and can be efficiently employed for the storage of large image collections in the virtual exploration of architectural landmarks or in photo sharing websites.
本文提出了一种针对同一物体或场景的照片集联合压缩的新方案。该方法首先在各种图像中定位相应的特征,然后利用Structure from Motion算法来估计各种图像及其视点之间的几何关系。然后,它使用3D信息和变形来预测不同的图像。此外,图算法用于计算最小权重拓扑和识别输入图像的排序,以最大限度地提高预测效率。得到的数据被馈送到修改后的HEVC编码器执行压缩。实验结果表明,该方案优于其他方案,可以有效地用于建筑地标虚拟探索或照片共享网站中大型图像集合的存储。
Citations: 10