Title: Effectively compressing Near-Duplicate Videos in a joint way
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177385
Hanli Wang, Ming Ma, Tao Tian
With the increasing popularity of social networks, more and more people tend to store and transmit information in visual formats such as images and videos. The cost of this convenience, however, is a shock to traditional video servers, exposing them to the risk of overloading. Among the huge number of online videos, there are a large number of Near-Duplicate Videos (NDVs). Although many methods have been proposed to detect NDVs, little research has investigated compressing them more effectively than independent compression. In this work, we exploit the data redundancy among NDVs and propose a video coding method to compress them jointly. To employ the proposed video coding method, a number of pre-processing functions are designed to explore the correlation of visual information among NDVs and to meet the video coding requirements. Experimental results verify that the proposed video coding method effectively compresses NDVs and thus saves video data storage.
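The abstract gives no implementation detail, but the core intuition, that a near-duplicate costs little to store once a reference copy exists, can be sketched as reference-plus-residual coding. This is a minimal illustration, not the paper's method; the alignment, motion compensation, and entropy coding a real codec needs are all omitted, and the frame arrays are toy stand-ins.

```python
import numpy as np

def joint_encode(reference: np.ndarray, duplicate: np.ndarray) -> np.ndarray:
    """Store the duplicate as a residual against the reference.

    Both inputs are aligned frame stacks of shape (frames, H, W).
    For true near-duplicates the residual is near-zero almost everywhere,
    which is where the inter-video redundancy saving comes from.
    """
    return duplicate.astype(np.int16) - reference.astype(np.int16)

def joint_decode(reference: np.ndarray, residual: np.ndarray) -> np.ndarray:
    return (reference.astype(np.int16) + residual).clip(0, 255).astype(np.uint8)

# Toy example: the "duplicate" is the reference with a small brightness shift.
ref = np.random.randint(0, 256, size=(2, 4, 4), dtype=np.uint8)
dup = (ref.astype(np.int16) + 3).clip(0, 255).astype(np.uint8)
res = joint_encode(ref, dup)
assert np.array_equal(joint_decode(ref, res), dup)
print("mean |residual|:", np.abs(res).mean(), "vs mean raw value:", dup.mean())
```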
{"title":"Effectively compressing Near-Duplicate Videos in a joint way","authors":"Hanli Wang, Ming Ma, Tao Tian","doi":"10.1109/ICME.2015.7177385","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177385","url":null,"abstract":"With the increasing popularity of social network, more and more people tend to store and transmit information in visual format, such as image and video. However, the cost of this convenience brings about a shock to traditional video servers and expose them under the risk of overloading. Among the huge amount of online videos, there are quite a number of Near-Duplicate Videos (NDVs). Although many works have been proposed to detect NDVs, few researches are investigated to compress these NDVs in a more effective way than independent compression. In this work, we utilize the data redundancy of NDVs and propose a video coding method to jointly compress NDVs. In order to employ the proposed video coding method, a number of pre-processing functions are designed to explore the correlation of visual information among NDVs and to suit the video coding requirements. Experimental results verify that the proposed video coding method is able to effectively compress NDVs and thus save video data storage.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129053286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A flexible platform for QoE-driven delivery of image-rich web applications
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177516
P. Ahammad, R. Gaunker, B. Kennedy, Mehrdad Reshadi, K. Kumar, A. K. Pathan, Hariharan Kolam
The advent of content-rich modern web applications, unreliable network connectivity and device heterogeneity demand flexible web content delivery platforms that can handle high variability along many dimensions, especially for the mobile web. Images account for more than 60% of the content delivered by present-day webpages and have a strong influence on perceived webpage latency and end-user experience. We present a flexible web delivery platform with a client-cloud architecture and content-aware optimizations to address the problem of delivering image-rich web applications. Our solution makes use of quantitative measures of image perceptual quality, machine learning algorithms, partial caching and opportunistic client-side choices to efficiently deliver images on the web. Using data from the WWW, we experimentally demonstrate that our approach shows significant improvement on various web performance criteria that are critical for maintaining a desirable end-user quality-of-experience (QoE) for image-rich web applications.
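One content-aware choice of the kind the abstract describes can be sketched as follows: among pre-encoded variants of an image, serve the smallest one whose perceptual-quality score still meets a target. The variant names, sizes and quality scores below are hypothetical stand-ins, not data from the paper.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    url: str
    bytes: int
    quality: float  # e.g. SSIM against the original, in [0, 1]

def pick_variant(variants: list[Variant], min_quality: float = 0.95) -> Variant:
    """Smallest variant that meets the quality target; best-looking as fallback."""
    acceptable = [v for v in variants if v.quality >= min_quality]
    pool = acceptable or sorted(variants, key=lambda v: -v.quality)[:1]
    return min(pool, key=lambda v: v.bytes)

variants = [
    Variant("hero_q90.jpg", 410_000, 0.99),
    Variant("hero_q60.jpg", 180_000, 0.96),
    Variant("hero_q35.jpg", 95_000, 0.90),
]
print(pick_variant(variants).url)  # -> hero_q60.jpg: 56% fewer bytes, quality kept
```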
{"title":"A flexible platform for QoE-driven delivery of image-rich web applications","authors":"P. Ahammad, R. Gaunker, B. Kennedy, Mehrdad Reshadi, K. Kumar, A. K. Pathan, Hariharan Kolam","doi":"10.1109/ICME.2015.7177516","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177516","url":null,"abstract":"The advent of content-rich modern web applications, unreliable network connectivity and device heterogeneity demands flexible web content delivery platforms that can handle the high variability along many dimensions - especially for the mobile web. Images account for more than 60% of the content delivered by present-day webpages and have a strong influence on the perceived webpage latency and end-user experience. We present a flexible web delivery platform with a client-cloud architecture and content-aware optimizations to address the problem of delivering image-rich web applications. Our solution makes use of quantitative measures of image perceptual quality, machine learning algorithms, partial caching and opportunistic client-side choices to efficiently deliver images on the web. Using data from the WWW, we experimentally demonstrate that our approach shows significant improvement on various web performance criteria that are critical for maintaining a desirable end-user quality-of-experience (QoE) for image-rich web applications.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126722484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: The effect of non-linear structures on the usage of hypervideo for physical training
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177378
Katrin Tonndorf, Christian Handschigl, Julian Windscheid, H. Kosch, M. Granitzer
The growing number of elderly people, combined with financial cuts in the health care sector, leads to an increased demand for computer-supported medical services. New standards like HTML5 allow the creation of hypervideo training applications that run on a variety of end-user devices. In this paper, we evaluate an HTML5 player running an e-health hypervideo for the support of pelvic floor exercises. In an experimental setting we compared the hypervideo to a primarily linear version with regard to usability and utilization for self-controlled training. Our results show that the hypervideo version led to slightly more usability problems but facilitated more active and individual training.
{"title":"The effect of non-linear structures on the usage of hypervideo for physical training","authors":"Katrin Tonndorf, Christian Handschigl, Julian Windscheid, H. Kosch, M. Granitzer","doi":"10.1109/ICME.2015.7177378","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177378","url":null,"abstract":"The growing number of elderly people combined with financial cuts in the health care sector lead to an increased demand for computer supported medical services. New standards like HTML5 allow the creation of hypervideo training applications that run on a variety of end user devices. In this paper, we evaluate an HTML5 player running an e-health hypervideo for the support of pelvic floor exercises. In an experimental test setting we compared the hypervideo to a primarily linear version regarding usability and utilization for self-controlled training. Our results show the hypervideo version leads to slightly more usability problems but facilitated a more active and individual training.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121600421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Affect-expressive hand gestures synthesis and animation
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177478
E. Bozkurt, E. Erzin, Y. Yemez
Speech and hand gestures form a composite communicative signal that boosts the naturalness and affectiveness of communication. We present a multimodal framework for the joint analysis of continuous affect, speech prosody and hand gestures, aimed at the automatic synthesis of realistic hand gestures from spontaneous speech using hidden semi-Markov models (HSMMs). To the best of our knowledge, this is the first attempt to synthesize hand gestures using a continuous dimensional affect space, i.e., activation, valence, and dominance. We model the relationships between acoustic features describing speech prosody and hand gestures, with and without the continuous affect information, in speaker-independent configurations, and we evaluate the multimodal analysis framework by generating hand gesture animations as well as through objective evaluations. Our experimental results are promising and convey the role of affect in modeling the dynamics of the speech-gesture relationship.
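What distinguishes an HSMM from a plain HMM is that state durations are modeled explicitly rather than through self-loops, which suits gesture units that persist for many frames. The sketch below only shows that semi-Markov sampling mechanic; the gesture labels, transition matrix and Poisson duration means are invented for illustration, not the paper's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

states = ["rest", "beat", "iconic"]          # hypothetical gesture units
trans = np.array([[0.0, 0.7, 0.3],           # no self-loops: time spent in a
                  [0.5, 0.0, 0.5],           # state comes from an explicit
                  [0.6, 0.4, 0.0]])          # duration model (semi-Markov)
mean_dur = np.array([8.0, 4.0, 6.0])         # Poisson mean duration, in frames

def sample_gesture_track(n_frames: int, start: int = 0) -> list[str]:
    track, s = [], start
    while len(track) < n_frames:
        d = 1 + rng.poisson(mean_dur[s])     # stay d frames in state s
        track.extend([states[s]] * d)
        s = rng.choice(len(states), p=trans[s])  # then jump to another state
    return track[:n_frames]

print(sample_gesture_track(30))
```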
{"title":"Affect-expressive hand gestures synthesis and animation","authors":"E. Bozkurt, E. Erzin, Y. Yemez","doi":"10.1109/ICME.2015.7177478","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177478","url":null,"abstract":"Speech and hand gestures form a composite communicative signal that boosts the naturalness and affectiveness of the communication. We present a multimodal framework for joint analysis of continuous affect, speech prosody and hand gestures towards automatic synthesis of realistic hand gestures from spontaneous speech using the hidden semi-Markov models (HSMMs). To the best of our knowledge, this is the first attempt for synthesizing hand gestures using continuous dimensional affect space, i.e., activation, valence, and dominance. We model relationships between acoustic features describing speech prosody and hand gestures with and without using the continuous affect information in speaker independent configurations and evaluate the multimodal analysis framework by generating hand gesture animations, also via objective evaluations. Our experimental studies are promising, conveying the role of affect for modeling the dynamics of speech-gesture relationship.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"43 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114095616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Fusion of Time-of-Flight and Phase Shifting for high-resolution and low-latency depth sensing
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177426
Yueyi Zhang, Zhiwei Xiong, Feng Wu
Depth sensors based on Time-of-Flight (ToF) and Phase Shifting (PS) have complementary strengths and weaknesses. ToF provides real-time depth but is limited in resolution and sensitive to noise. PS generates accurate and robust depth with high resolution but requires a number of projected patterns, which leads to high latency. In this paper, we propose a novel fusion framework that takes advantage of both ToF and PS. The basic idea is to use the coarse depth from ToF to disambiguate the wrapped depth from PS. Specifically, we address two key technical problems: cross-modal calibration and interference-free synchronization between the ToF and PS sensors. Experiments demonstrate that the proposed method generates accurate and robust depth with high resolution and low latency, which benefits numerous applications.
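The "disambiguation" the abstract mentions is a phase-unwrapping step: PS gives depth only modulo the fringe period, and the coarse ToF estimate selects the right period. A minimal per-pixel sketch of that idea, with an assumed period and toy depth values:

```python
import numpy as np

def unwrap_ps_with_tof(ps_depth, tof_depth, period):
    """Resolve the PS wrapping ambiguity with coarse ToF depth.

    PS is accurate but only known modulo `period`; ToF is absolute but
    noisy. Pick the integer number of periods that brings the PS value
    closest to the ToF estimate. Correct as long as the ToF error stays
    below half a period.
    """
    k = np.round((tof_depth - ps_depth) / period)
    return ps_depth + k * period

period = 0.5                                  # metres, hypothetical
true_depth = np.array([1.23, 2.71, 0.42])
ps = true_depth % period                      # accurate but wrapped
tof = true_depth + np.random.default_rng(1).normal(0, 0.05, 3)  # coarse, noisy
print(unwrap_ps_with_tof(ps, tof, period))    # ~ [1.23, 2.71, 0.42]
```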
{"title":"Fusion of Time-of-Flight and Phase Shifting for high-resolution and low-latency depth sensing","authors":"Yueyi Zhang, Zhiwei Xiong, Feng Wu","doi":"10.1109/ICME.2015.7177426","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177426","url":null,"abstract":"Depth sensors based on Time-of-Flight (ToF) and Phase Shifting (PS) have complementary strengths and weaknesses. ToF can provide real-time depth but limited in resolution and sensitive to noise. PS can generate accurate and robust depth with high resolution but requires a number of patterns that leads to high latency. In this paper, we propose a novel fusion framework to take advantages of both ToF and PS. The basic idea is using the coarse depth from ToF to disambiguate the wrapped depth from PS. Specifically, we address two key technical problems: cross-modal calibration and interference-free synchronization between ToF and PS sensors. Experiments demonstrate that the proposed method generates accurate and robust depth with high resolution and low latency, which is beneficial to tremendous applications.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133643387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Energy and area efficient hardware implementation of 4K Main-10 HEVC decoder in Ultra-HD Blu-ray player and TV systems
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177399
Tsu-Ming Liu, Yung-Chang Chang, Chih-Ming Wang, Hue-Min Lin, Chia-Yun Cheng, Chun-Chia Chen, Min-Hao Chiu, Sheng-Jen Wang, P. Chao, Meng-Jye Hu, Fu-Chun Yeh, Shun-Hsiang Chuang, Hsiu-Yi Lin, Ming-Long Wu, Che-Hong Chen, Chia-Lin Ho, Chi-Cheng Ju
A 4K and Main-10 HEVC video decoder LSI is fabricated in a 28nm CMOS process. It adopts a block-concealed processor (BcP) to improve visual quality, and a newly designed bandwidth-suppressed processor (BsP) reduces external data accesses by 30% and 45% in playback and gaming scenarios, respectively. It features a fully core-scalable (FCS) architecture which lowers the required working frequency by 65%. A 10-bit compact scheme is proposed to reduce the frame buffer space by 37.5%. Moreover, a multi-standard architecture reduces area by 28%. It achieves a throughput of 530 Mpixels/s, two times that of the state-of-the-art HEVC design [2], with an energy efficiency of 0.2 nJ/pixel, enabling real-time 4K video playback in Ultra-HD Blu-ray player and TV systems.
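A quick arithmetic check makes the 530 Mpixels/s figure concrete: real-time 4K playback at 60 fps needs about 498 Mpixels/s, so the claimed throughput covers it with headroom. The frame rates below are assumed for illustration; the abstract itself states only the pixel rate.

```python
# Pixel rate required for real-time Ultra-HD 4K playback vs the 530 Mpixels/s claim.
w, h = 3840, 2160
for fps in (30, 60):
    need = w * h * fps / 1e6          # Mpixels/s required
    print(f"4K@{fps}: need {need:.0f} Mpixels/s, headroom at 530: {530/need:.2f}x")
# 4K@30 needs ~249 Mpixels/s, 4K@60 ~498 -- both within 530 Mpixels/s.
```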
{"title":"Energy and area efficient hardware implementation of 4K Main-10 HEVC decoder in Ultra-HD Blu-ray player and TV systems","authors":"Tsu-Ming Liu, Yung-Chang Chang, Chih-Ming Wang, Hue-Min Lin, Chia-Yun Cheng, Chun-Chia Chen, Min-Hao Chiu, Sheng-Jen Wang, P. Chao, Meng-Jye Hu, Fu-Chun Yeh, Shun-Hsiang Chuang, Hsiu-Yi Lin, Ming-Long Wu, Che-Hong Chen, Chia-Lin Ho, Chi-Cheng Ju","doi":"10.1109/ICME.2015.7177399","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177399","url":null,"abstract":"A 4K and Main-10 HEVC video decoder LSI is fabricated in a 28nm CMOS process. It adopts a block-concealed processor (BcP) to improve the visual quality and a bandwidth-suppressed processor (BsP) is newly designed to reduce 30% and 45% of external data accesses in playback and gaming scenario, respectively. It features fully core scalable (FCS) architecture which lowers the required working frequency by 65%. A 10-bit compact scheme is proposed to reduce the frame buffer space by 37.5%. Moreover, a multi-standard architecture reduces are by 28%. It achieves 530Mpixels/s throughput which is two times larger than the state-of-the-art HEVC design [2] and consumes 0.2nJ/pixel energy efficiency, enabling real-time 4K video playback in Ultra-HD Blu-ray player and TV systems.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133050584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Joint Latent Dirichlet Allocation for non-iid social tags
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177490
Jiangchao Yao, Ya Zhang, Zhe Xu, Jun-wei Sun, Jun Zhou, Xiao Gu
Topic models have been widely used for analyzing text corpora and have achieved great success in applications including content organization and information retrieval. However, unlike traditional text data, social tags on the web are usually few in number, unordered, and non-i.i.d., i.e., they depend strongly on contextual information such as users and objects. Considering these specific characteristics of social tags, we introduce a new model named Joint Latent Dirichlet Allocation (JLDA) to capture the relationships among users, objects, and tags. The model assumes that the latent topics of users and those of objects jointly influence the generation of tags. The latent distributions are then inferred with Gibbs sampling. Experiments on two social tagging data sets demonstrate that the model achieves lower predictive error and generates more reasonable topics. We also present an interesting application of this model to object recommendation.
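The stated generative assumption, that a user topic and an object topic jointly produce each tag, can be sketched as forward sampling. This shows the model's generative direction only (inference by Gibbs sampling is the hard part the paper addresses); all dimensions and Dirichlet priors below are toy values.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical sizes: tags are drawn from a distribution indexed by a
# (user topic, object topic) pair, per the JLDA generative assumption.
n_user_topics, n_obj_topics, n_tags = 3, 4, 20
theta_user = rng.dirichlet(np.ones(n_user_topics))   # one user's topic mixture
theta_obj = rng.dirichlet(np.ones(n_obj_topics))     # one object's topic mixture
phi = rng.dirichlet(np.ones(n_tags), size=(n_user_topics, n_obj_topics))

def generate_tag() -> int:
    zu = rng.choice(n_user_topics, p=theta_user)     # latent user topic
    zo = rng.choice(n_obj_topics, p=theta_obj)       # latent object topic
    return rng.choice(n_tags, p=phi[zu, zo])         # tag from the joint pair

print([generate_tag() for _ in range(5)])
```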
{"title":"Joint Latent Dirichlet Allocation for non-iid social tags","authors":"Jiangchao Yao, Ya Zhang, Zhe Xu, Jun-wei Sun, Jun Zhou, Xiao Gu","doi":"10.1109/ICME.2015.7177490","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177490","url":null,"abstract":"Topic models have been widely used for analyzing text corpora and achieved great success in applications including content organization and information retrieval. However, different from traditional text data, social tags in the web containers are usually of small amounts, unordered, and non-iid, i.e., it is highly dependent on contextual information such as users and objects. Considering the specific characteristics of social tags, we here introduce a new model named Joint Latent Dirichlet Allocation (JLDA) to capture the relationships among users, objects, and tags. The model assumes that the latent topics of users and those of objects jointly influence the generation of tags. The latent distributions is then inferred with Gibbs sampling. Experiments on two social tag data sets have demonstrated that the model achieves a lower predictive error and generates more reasonable topics. We also present an interesting application of this model to object recommendation.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116939995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Towards GPU HEVC intra decoding: Seizing fine-grain parallelism
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177515
D. Souza, A. Ilic, N. Roma, L. Sousa
To satisfy the growing demand for real-time video decoding at high frame resolutions, novel GPU parallel algorithms are proposed herein for fully compliant HEVC de-quantization, inverse transform and intra prediction. The proposed algorithms are designed to fully exploit the fine-grain parallelism within these computationally demanding and highly data-dependent modules. Moreover, the proposed approaches allow efficient utilization of the GPU computational resources while carefully managing data accesses in the complex GPU memory hierarchy. Experimental results show that real-time processing is achieved for all tested sequences even at the most demanding QP, delivering an average of 118.6, 89.2 and 49.7 fps for Full HD, 2160p and Ultra HD 4K sequences, respectively.
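The "highly data dependent" nature of intra prediction comes from each block needing its left and upper neighbours, so blocks along the same anti-diagonal are mutually independent and can be scheduled as one parallel wavefront. This sketch shows only that scheduling order, not the paper's GPU kernels:

```python
# Blocks on the same anti-diagonal have no dependency on each other, so a GPU
# can process one wavefront (diagonal) per kernel launch, left-to-right,
# top-to-bottom across launches.
def wavefronts(blocks_x: int, blocks_y: int):
    for d in range(blocks_x + blocks_y - 1):
        yield [(x, d - x) for x in range(blocks_x) if 0 <= d - x < blocks_y]

for wave in wavefronts(4, 3):
    print(wave)   # every block within a wave can run in parallel
```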
{"title":"Towards GPU HEVC intra decoding: Seizing fine-grain parallelism","authors":"D. Souza, A. Ilic, N. Roma, L. Sousa","doi":"10.1109/ICME.2015.7177515","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177515","url":null,"abstract":"To satisfy the growing demands on real-time video decoders for high frame resolutions, novel GPU parallel algorithms are proposed herein for fully compliant HEVC de-quantization, inverse transform and intra prediction. The proposed algorithms are designed to fully exploit and leverage the fine grain parallelism within these computationally demanding and highly data dependent modules. Moreover, the proposed approaches allow the efficient utilization of the GPU computational resources, while carefully managing the data accesses in the complex GPU memory hierarchy. The experimental results show that the real-time processing is achieved for all tested sequences and the most demanding QP, while delivering average fps of 118.6, 89.2 and 49.7 for Full HD, 2160p and Ultra HD 4K sequences, respectively.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129824292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Compression of photo collections using geometrical information
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177379
S. Milani, P. Zanuttigh
This paper proposes a novel scheme for the joint compression of photo collections framing the same object or scene. The proposed approach starts by locating corresponding features in the various images and then exploits a Structure from Motion algorithm to estimate the geometric relationships between the images and their viewpoints. It then uses the 3D information and warping to predict the images from one another. Furthermore, graph algorithms are used to compute minimum-weight topologies and to identify the ordering of the input images that maximizes prediction efficiency. The resulting data is fed to a modified HEVC coder for compression. Experimental results show that the proposed scheme outperforms competing solutions and can be efficiently employed for storing large image collections in the virtual exploration of architectural landmarks or on photo-sharing websites.
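The minimum-weight topology step can be sketched with Prim's algorithm over a matrix of pairwise prediction costs: the resulting tree says which image predicts which. The cost values are invented, and the matrix is kept symmetric so a spanning tree suffices; with directed (asymmetric) costs a minimum arborescence algorithm would be the right tool.

```python
import numpy as np

def prediction_tree(cost: np.ndarray, root: int = 0):
    """Prim's algorithm: cost[i, j] is the (hypothetical) cost of predicting
    image j from image i. Returns (parent, child) edges: who predicts whom."""
    n = cost.shape[0]
    in_tree, edges = {root}, []
    while len(in_tree) < n:
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: cost[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

cost = np.array([[0, 2, 9, 8],
                 [2, 0, 3, 7],
                 [9, 3, 0, 1],
                 [8, 7, 1, 0]], dtype=float)
print(prediction_tree(cost))   # -> [(0, 1), (1, 2), (2, 3)]: a prediction chain
```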
{"title":"Compression of photo collections using geometrical information","authors":"S. Milani, P. Zanuttigh","doi":"10.1109/ICME.2015.7177379","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177379","url":null,"abstract":"This paper proposes a novel scheme for the joint compression of photo collections framing the same object or scene. The proposed approach starts by locating corresponding features in the various images and then exploits a Structure from Motion algorithm to estimate the geometric relationships between the various images and their viewpoints. Then it uses 3D information and warping to predict images one from the other. Furthermore, graph algorithms are used to compute minimum weight topologies and identify the ordering of the input images that maximizes the efficiency of prediction. The obtained data is fed to a modified HEVC coder to perform the compression. Experimental results show that the proposed scheme outperforms competing solutions and can be efficiently employed for the storage of large image collections in the virtual exploration of architectural landmarks or in photo sharing websites.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127536551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Creative design of color palettes for product packaging
Pub Date: 2015-08-06 | DOI: 10.1109/ICME.2015.7177443
Ying Li, Anshul Sheopuri
This paper describes our latest work on assisting CPG (Consumer Packaged Goods) companies with their product packaging designs by providing color palettes that are visually appealing, novel, and consistent with the marketing messages desired for a particular brand and product. Specifically, we start by mining a large collection of images of different products and brands to learn the colors and color combinations that frequently appear among them. Meanwhile, a color-message graph is constructed to represent the messages conveyed by different colors and to capture the interrelationships among them. Knowledge from color psychology and from information sources such as Thesaurus is extensively exploited in this step. Given a particular product and brand whose packaging is to be designed, along with the company's desired marketing message, we apply a computational method to generate quintillions of novel color palettes that can be used for the design. This process leverages existing palettes used by the same products of different brands or different products of the same brand, takes in optional color preferences from users, and identifies and utilizes the right colors to convey the desired marketing message. Finally, we rank the palettes based on an assessment of their visual aesthetics, their novelty, and the way the different messages of the same palette interact with each other, so as to guide human designers toward the right choices. Our initial demonstrations of this work to subject matter experts have received very positive feedback, and we are now exploring opportunities to collaborate with them to validate this technology in a controlled experimental setting.
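The message-consistency part of the ranking step can be sketched by scoring a palette against a desired message through a color-to-message association map. Everything below is hypothetical: the colors, the association weights, and the simple averaging are stand-ins for the paper's color-message graph and ranking model.

```python
# Hypothetical color -> message association weights (a flattened stand-in
# for the paper's color-message graph).
color_message = {
    "red":   {"energy": 0.9, "trust": 0.1},
    "blue":  {"energy": 0.2, "trust": 0.9},
    "white": {"energy": 0.3, "trust": 0.6},
    "black": {"energy": 0.4, "trust": 0.5},
}

def message_score(palette: list[str], message: str) -> float:
    """Average association of the palette's colors with the desired message."""
    return sum(color_message[c].get(message, 0.0) for c in palette) / len(palette)

candidates = [["red", "black", "white"], ["blue", "white", "black"]]
ranked = sorted(candidates, key=lambda p: -message_score(p, "trust"))
print(ranked[0])   # -> ['blue', 'white', 'black'] scores highest for "trust"
```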
{"title":"Creative design of color palettes for product packaging","authors":"Ying Li, Anshul Sheopuri","doi":"10.1109/ICME.2015.7177443","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177443","url":null,"abstract":"This paper describes our latest work on assisting CPG (Consumer Packaged Goods) companies with their product packaging designs by providing color palettes that are visually appealing, novel and consistent with desired marketing messages for a particular brand and product. Specifically, we start by mining a large collections of images of different products and brands to learn about all the colors and color combinations that frequently appear among them. Meanwhile, a color-message graph is constructed to represent messages conveyed by different colors as well as to capture the interrelationship among them. Knowledge from both color psychology and information sources like Thesaurus are extensively exploited in this case. Now, given a particular product and brand to be designed for its packaging, along with the company's desired marketing message, we apply a computational method to generate quintillions of novel color palettes that can be used for the design. This process will leverage existing palettes used by same products of different brands or different products of the same brand, take in optional color preferences from users, identify then utilize the right colors to convey the desired marketing message. Finally, we rank the palettes based on assessment of their visual aesthetics, novelty and the way that different messages of the same palette interact with each other, so as to guide human designers to choose the right ones. Our initial demonstrations of this work to colleagues of subject matter have received very positive feedback. We are now exploring opportunities to collaborate with them to validate this technology in a controlled experimental setting.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127689348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}