Discontinuous seam cutting for enhanced video stitching
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177506
Jie Hu, Dong-Qing Zhang, H. H. Yu, Chang Wen Chen
Video stitching requires a proper seam cutting technique to decide the boundary of the sub-video volume cropped from the source videos. In theory, approaches such as 3D graph-cuts, which search the entire spatiotemporal volume for a cutting surface, should provide the best results. However, given the tremendous data size of a camera-array video source, the 3D graph-cuts algorithm is extremely resource-demanding and impractical. In this paper, we propose a sequential seam cutting scheme: a dynamic programming algorithm that scans the source videos frame by frame, updates the pixels' spatiotemporal constraints, and gradually builds the cutting surface with low space complexity. The proposed scheme features flexible seam-finding conditions based on temporal and spatial coherence as well as salience. Experimental results show that by relaxing the seam continuity constraint, the proposed video stitching can better handle abrupt motions and sharp edges in the source, reduce stitching artifacts, and render enhanced visual quality.
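For illustration, the frame-by-frame dynamic program can be pictured in a seam-carving style: a per-frame cost map over the overlap region, a small transition window for spatial continuity, and a penalty for deviating from the previous frame's seam for temporal coherence. The sketch below is an assumption-laden toy, not the authors' algorithm; `frame_seam`, the cost map, and `temporal_weight` are illustrative names, and widening the transition window is one way to picture the relaxed continuity constraint.

```python
import numpy as np

def frame_seam(cost, prev_seam=None, temporal_weight=0.5):
    """Find a vertical seam through a per-pixel cost map via dynamic programming.

    cost: (H, W) array, e.g. the color difference between the two overlapping views.
    prev_seam: optional (H,) seam from the previous frame; deviating from it is
    penalized to keep the cutting surface temporally coherent.
    """
    H, W = cost.shape
    total = cost.astype(np.float64)
    if prev_seam is not None:
        cols = np.arange(W)
        # temporal coherence: penalize columns far from last frame's seam
        total = total + temporal_weight * np.abs(cols[None, :] - prev_seam[:, None])

    acc = total.copy()
    back = np.zeros((H, W), dtype=np.int64)
    for y in range(1, H):
        for x in range(W):
            # +-1 transition window keeps the seam continuous; widening it relaxes that constraint
            lo, hi = max(0, x - 1), min(W, x + 2)
            j = int(np.argmin(acc[y - 1, lo:hi])) + lo
            back[y, x] = j
            acc[y, x] = total[y, x] + acc[y - 1, j]

    seam = np.zeros(H, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(H - 2, -1, -1):
        seam[y] = back[y + 1, seam[y + 1]]
    return seam

# Frame-by-frame scan: only the previous seam is kept in memory (low space complexity).
rng = np.random.default_rng(0)
prev = None
for t in range(5):
    cost = rng.random((48, 64))    # stand-in for an overlap-difference map
    prev = frame_seam(cost, prev_seam=prev)
```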
{"title":"Discontinuous seam cutting for enhanced video stitching","authors":"Jie Hu, Dong-Qing Zhang, H. H. Yu, Chang Wen Chen","doi":"10.1109/ICME.2015.7177506","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177506","url":null,"abstract":"Video stitching requires proper seam cutting technique to decide the boundary of the sub video volume cropped from source videos. In theory, approaches such as 3D graph-cuts that search the entire spatiotemporal volume for a cutting surface should provide the best results. However, given the tremendous data size of the camera array video source, the 3D graph-cuts algorithm is extremely resource-demanding and impractical. In this paper, we propose a sequential seam cutting scheme, which is a dynamic programming algorithm that scans the source videos frame-by-frame, updates the pixels' spatiotemporal constraints, and gradually builds the cutting surface in low space complexity. The proposed scheme features flexible seam finding conditions based on temporal and spatial coherence as well as salience. Experimental results show that by relaxing the seam continuity constraint, the proposed video stitching can better handle abrupt motions or sharp edges in the source, reduce stitching artifacts, and render enhanced visual quality.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"171 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116298364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mirror mirror on the wall… An intelligent multisensory mirror for well-being self-assessment
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177468
Yasmina Andreu, P. Castellano, S. Colantonio, G. Coppini, R. Favilla, D. Germanese, G. Giannakakis, D. Giorgi, M. Larsson, P. Marraccini, M. Martinelli, B. Matuszewski, Matijia Milanic, M. A. Pascali, M. Pediaditis, Giovanni Raccichini, L. Randeberg, O. Salvetti, T. Strömberg
The face reveals the health status of an individual through a combination of physical signs and facial expressions. The SEMEOTICONS project is translating the semeiotic code of the human face into computational descriptors and measures, automatically extracted from videos, images, and 3D scans of the face. SEMEOTICONS is developing a multisensory platform, in the form of a smart mirror, that looks for signs related to cardio-metabolic risk. The goal is to enable users to self-monitor their well-being status over time and improve their lifestyle via tailored user guidance. Building the multisensory mirror requires addressing significant scientific and technological challenges, from touch-less data acquisition to real-time processing and integration of multimodal data.
{"title":"Mirror mirror on the wall… An intelligent multisensory mirror for well-being self-assessment","authors":"Yasmina Andreu, P. Castellano, S. Colantonio, G. Coppini, R. Favilla, D. Germanese, G. Giannakakis, D. Giorgi, M. Larsson, P. Marraccini, M. Martinelli, B. Matuszewski, Matijia Milanic, M. A. Pascali, M. Pediaditis, Giovanni Raccichini, L. Randeberg, O. Salvetti, T. Strömberg","doi":"10.1109/ICME.2015.7177468","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177468","url":null,"abstract":"The face reveals the healthy status of an individual, through a combination of physical signs and facial expressions. The project SEMEOTICONS is translating the semeiotic code of the human face into computational descriptors and measures, automatically extracted from videos, images, and 3D scans of the face. SEMEOTICONS is developing a multisensory platform, in the form of a smart mirror, looking for signs related to cardio-metabolic risk. The goal is to enable users to self-monitor their well-being status over time and improve their life-style via tailored user guidance. Building the multisensory mirror requires addressing significant scientific and technological challenges, from touch-less data acquisition, to real-time processing and integration of multimodal data.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"71 1-2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114008886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards active annotation for detection of numerous and scattered objects
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177524
Hang Su, Hua Yang, Shibao Zheng, Sha Wei, Yu Wang, Shuang Wu
Object detection is an active research area in the fields of computer vision and image understanding. In this paper, we propose an active annotation algorithm that addresses the detection of numerous and scattered objects in a view, e.g., hundreds of cells in microscopy images. In particular, object detection is implemented by classifying pixels into specific classes with graph-based semi-supervised learning and grouping neighboring pixels with the same label. Sample (seed) selection is conducted based on a novel annotation criterion that minimizes the expected prediction error. The most informative samples are therefore annotated actively, and their labels are subsequently propagated to the unlabeled samples via a pairwise affinity graph. Experimental results on two real-world datasets validate that the proposed scheme quickly reaches high-quality results and significantly reduces human effort.
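A minimal sketch of the two ingredients named above, propagation over a pairwise affinity graph and active selection of the next sample to annotate, is given below. It assumes a Gaussian-kernel graph and uses prediction entropy as a simple stand-in for the paper's expected-prediction-error criterion; all function names and parameters are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def propagate(X, y, alpha=0.9, sigma=1.0):
    """Graph-based semi-supervised label propagation (closed-form, Zhou et al.-style).

    X: (n, d) pixel/superpixel features; y: (n,) labels in {0, 1, -1}, -1 = unlabeled.
    """
    W = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(1)
    S = W / np.sqrt(np.outer(d, d))              # symmetrically normalized affinity
    Y = np.zeros((len(y), 2))
    for c in (0, 1):
        Y[y == c, c] = 1.0
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    return F / F.sum(1, keepdims=True)           # per-sample class probabilities

def query_next(F, labeled_mask):
    """Pick the most informative unlabeled sample (highest prediction entropy here,
    as a simple stand-in for the expected-prediction-error criterion)."""
    ent = -(F * np.log(F + 1e-12)).sum(1)
    ent[labeled_mask] = -np.inf
    return int(np.argmax(ent))

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.full(60, -1)
y[0], y[1] = 0, 1                                # two annotated seeds
F = propagate(X, y)
print("next sample to annotate:", query_next(F, y >= 0))
```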
{"title":"Towards active annotation for detection of numerous and scattered objects","authors":"Hang Su, Hua Yang, Shibao Zheng, Sha Wei, Yu Wang, Shuang Wu","doi":"10.1109/ICME.2015.7177524","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177524","url":null,"abstract":"Object detection is an active study area in the field of computer vision and image understanding. In this paper, we propose an active annotation algorithm by addressing the detection of numerous and scattered objects in a view, e.g., hundreds of cells in microscopy images. In particular, object detection is implemented by classifying pixels into specific classes with graph-based semi-supervised learning and grouping neighboring pixels with the same label. Sample or seed selection is conducted based on a novel annotation criterion that minimizes the expected prediction error. The most informative samples are therefore annotated actively, which are subsequently propagated to the unlabeled samples via a pairwise affinity graph. Experimental results conducted on two real world datasets validate that our proposed scheme quickly reaches high quality results and reduces human efforts significantly.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127837318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Locally regularized Anchored Neighborhood Regression for fast Super-Resolution
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177470
Junjun Jiang, Jican Fu, T. Lu, R. Hu, Zhongyuan Wang
The goal of learning-based image Super-Resolution (SR) is to generate a plausible and visually pleasing High-Resolution (HR) image from a given Low-Resolution (LR) input. The problem is severely under-constrained and relies on examples or strong image priors to reconstruct the missing HR image details. This paper addresses the problem of learning the mapping functions (i.e., projection matrices) between LR and HR images based on a dictionary of LR and HR examples. One recently proposed method, Anchored Neighborhood Regression (ANR) [1], provides state-of-the-art quality and is very fast. In this paper, we propose an improved variant of ANR, namely Locally regularized Anchored Neighborhood Regression (LANR), which uses locality-constrained regression in place of the ridge regression in ANR. LANR assigns a different degree of freedom to each neighboring dictionary atom according to its correlation with the input LR patch, so the learned projection matrices are much more flexible. Experimental results demonstrate that the proposed algorithm is efficient and outperforms state-of-the-art methods, e.g., by 0.1-0.4 dB in PSNR over ANR.
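The switch from ridge regression to locality-constrained regression can be written as a weighted ridge problem solved once per anchor. The numpy sketch below illustrates that idea under assumed notation (N_l, N_h, and anchor-to-neighbor distances as the locality weights); it is not the authors' exact precomputation.

```python
import numpy as np

def lanr_projection(N_l, N_h, anchor, lam=0.1):
    """Precompute one anchored projection matrix, LANR-style.

    N_l: (K, d_l) LR neighborhood of the anchor atom; N_h: (K, d_h) HR counterparts;
    anchor: (d_l,) the anchor dictionary atom. Plain ANR uses an identity (ridge)
    penalty; here each neighbor is penalized by its distance to the anchor, so
    closer (more correlated) atoms get more freedom.
    """
    d = np.linalg.norm(N_l - anchor, axis=1)              # locality weights
    D2 = np.diag(d ** 2)
    P = N_h.T @ np.linalg.solve(N_l @ N_l.T + lam * D2, N_l)
    return P                                              # (d_h, d_l)

# At test time an LR patch is assigned to its nearest anchor and mapped in one product.
rng = np.random.default_rng(0)
N_l, N_h = rng.normal(size=(40, 25)), rng.normal(size=(40, 100))
anchor = N_l[0]
P = lanr_projection(N_l, N_h, anchor)
x_hr = P @ rng.normal(size=25)                            # reconstructed HR patch features
```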
{"title":"Locally regularized Anchored Neighborhood Regression for fast Super-Resolution","authors":"Junjun Jiang, Jican Fu, T. Lu, R. Hu, Zhongyuan Wang","doi":"10.1109/ICME.2015.7177470","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177470","url":null,"abstract":"The goal of learning-based image Super-Resolution (SR) is to generate a plausible and visually pleasing High-Resolution (HR) image from a given Low-Resolution (LR) input. The problem is dramatically under-constrained, which relies on examples or some strong image priors to better reconstruct the missing HR image details. This paper addresses the problem of learning the mapping functions (i.e. projection matrices) between the LR and HR images based on a dictionary of LR and HR examples. One recently proposed method, Anchored Neighborhood Regression (ANR) [1], provides state-of-the-art quality performance and is very fast. In this paper, we propose an improved variant of ANR, namely Locally regularized Anchored Neighborhood Regression (LANR), which utilizes the locality-constrained regression in place of the ridge regression in ANR. LANR assigns different freedom for each neighbor dictionary atom according to its correlation to the input LR patch, thus the learned projection matrices are much more flexible. Experimental results demonstrate that the proposed algorithm performs efficiently and effectively over state-of-the-art methods, e.g., 0.1-0.4 dB in term of PSNR better than ANR.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127869125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two-dimensional digital water art creation on a non-absorbent hydrophilic surface
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177466
Pei-Shan Chen, Sai-Keung Wong, Wen-Chieh Lin
In this paper, we develop a physics-based approach that enables users to manipulate water with a brush. Users can drip and drag water on a non-absorbent hydrophilic surface to create water artworks. We consider factors such as cohesive and adhesive forces to compute the motion of the water. Water drops with different shapes can be formed. We also develop a converter that turns input pictures into water-styled pictures. Our system can be applied in advertisements, movies, games, and education.
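As a rough illustration of how cohesive and adhesive forces might drive such a simulation, here is a toy 2D particle sketch; the abstract does not specify the model at this level, so every term below is an assumption.

```python
import numpy as np

def step(pos, vel, dt=0.01, cohesion=4.0, adhesion=0.9, radius=0.05):
    """One update of a toy 2D particle model of water on a flat surface.

    Cohesion pulls each droplet particle toward the centroid of its neighbors;
    adhesion is modeled as velocity damping from contact with the surface.
    (A toy sketch only -- not the paper's model.)
    """
    acc = np.zeros_like(pos)
    for i in range(len(pos)):
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbr = (d > 0) & (d < radius)
        if nbr.any():
            acc[i] = cohesion * (pos[nbr].mean(0) - pos[i])
    vel = adhesion * (vel + dt * acc)     # adhesion damps sliding on the surface
    return pos + dt * vel, vel

# Drip a small blob of particles and advance it a few steps.
rng = np.random.default_rng(0)
pos = rng.normal(scale=0.02, size=(50, 2))
vel = np.zeros_like(pos)
for _ in range(100):
    pos, vel = step(pos, vel)
```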
{"title":"Two-dimensional digital water art creation on a non-absorbent hydrophilic surface","authors":"Pei-Shan Chen, Sai-Keung Wong, Wen-Chieh Lin","doi":"10.1109/ICME.2015.7177466","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177466","url":null,"abstract":"In this paper, we develop a physics based approach which enables users to use a brush for manipulating water. The users can drip and drag water on a non-absorbent hydrophilic surface to create water artworks. We consider factors, such as cohesive force and adhesive force, to compute the motion of water. Water drops with different shapes can be formed. We also develop a converter for converting input pictures into water-styled pictures. Our system can be applied in advertisements, movies, games, and education.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"28 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131409096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive integration of depth and color for objectness estimation
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177498
Xiangyang Xu, L. Ge, Tongwei Ren, Gangshan Wu
The goal of objectness estimation is to predict, with high efficiency, a moderate number of proposals covering all possible objects in a given image. Most existing works solve this problem using conventional 2D color images alone. In this paper, we demonstrate that depth information can benefit the estimation as a complementary cue to color. After a detailed analysis of depth characteristics, we present an adaptively integrated objectness description for generic objects that takes full advantage of both depth and color. With the proposed description, ambiguous areas, especially highly textured regions in the original color maps, can be effectively discriminated. Meanwhile, object boundary areas are further emphasized, which leads to a more powerful objectness description. To evaluate the performance of the proposed approach, we conduct experiments on two challenging datasets. The experimental results show that our objectness description is more powerful and effective than state-of-the-art alternatives.
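One way to picture an adaptive integration of the two cues is a per-window weighting that leans on depth where the color map is ambiguous (highly textured). The sketch below is an illustrative assumption, not the paper's formula; `fused_objectness` and its weighting rule are hypothetical.

```python
import numpy as np

def fused_objectness(color_score, depth_score, texture):
    """Adaptively fuse color- and depth-based objectness scores per candidate window.

    color_score, depth_score, texture: (N,) arrays for N candidate windows. The rule
    (trust depth more where the color map is highly textured, i.e. ambiguous) is an
    illustrative assumption.
    """
    t = (texture - texture.min()) / (np.ptp(texture) + 1e-12)   # 0..1 ambiguity
    w_depth = 0.5 + 0.5 * t                                     # more texture -> more depth weight
    return (1.0 - w_depth) * color_score + w_depth * depth_score

rng = np.random.default_rng(0)
scores = fused_objectness(rng.random(1000), rng.random(1000), rng.random(1000))
proposals = np.argsort(scores)[::-1][:200]    # keep a moderate number of top-ranked windows
```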
{"title":"Adaptive integration of depth and color for objectness estimation","authors":"Xiangyang Xu, L. Ge, Tongwei Ren, Gangshan Wu","doi":"10.1109/ICME.2015.7177498","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177498","url":null,"abstract":"The goal of objectness estimation is to predict a moderate number of proposals of all possible objects in a given image with high efficiency. Most existing works solve this problem solely in conventional 2D color images. In this paper, we demonstrate that the depth information could benefit the estimation as a complementary cue to color information. After detailed analysis of depth characteristics, we present an adaptively integrated description for generic objects, which could take full advantages of both depth and color. With the proposed objectness description, the ambiguous area, especially the highly textured regions in original color maps, can be effectively discriminated. Meanwhile, the object boundary areas could be further emphasized, which leads to a more powerful objectness description. To evaluate the performance of the proposed approach, we conduct the experiments on two challenging datasets. The experimental results show that our proposed objectness description is more powerful and effective than state-of-the-art alternatives.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114432254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal hypergraph learning for microblog sentiment prediction
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177477
Fuhai Chen, Yue Gao, Donglin Cao, R. Ji
Microblog sentiment analysis has attracted extensive research attention in the recent literature. However, most existing works focus mainly on the textual modality and ignore visual information, which accounts for an ever-increasing share of how users express emotions. In this paper, we propose to employ a hypergraph structure to formulate textual, visual, and emoticon information jointly for sentiment prediction. The constructed hypergraph captures the similarities of tweets across modalities: each vertex represents a tweet, and each hyperedge is formed by a “centroid” vertex and its k-nearest neighbors in one modality. Transductive inference is then conducted to learn relevance scores among tweets for sentiment prediction. In this way, both intra- and inter-modality dependencies are taken into consideration. Experiments conducted on over 6,000 microblog tweets demonstrate the superiority of our method, which achieves 86.77% accuracy, a 7% improvement over state-of-the-art methods.
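The construction described above maps naturally onto the standard hypergraph-Laplacian formulation: one hyperedge per centroid tweet per modality, followed by transductive scoring. The sketch below follows that standard recipe with assumed names and parameters; it is not the authors' code.

```python
import numpy as np
from scipy.spatial.distance import cdist

def hypergraph_scores(modalities, y, k=5, alpha=0.9):
    """Transductive sentiment scoring on a multimodal hypergraph.

    modalities: list of (n, d_m) feature matrices (e.g. textual, visual, emoticon);
    y: (n,) labels in {+1, -1, 0}, 0 = unlabeled tweet.
    """
    n = len(y)
    edges = []
    for X in modalities:
        D = cdist(X, X)
        for i in range(n):                         # one hyperedge per centroid vertex
            edges.append(np.argsort(D[i])[:k + 1])  # centroid plus its k nearest neighbors
    H = np.zeros((n, len(edges)))
    for j, e in enumerate(edges):
        H[e, j] = 1.0
    dv = H.sum(1)                                  # vertex degrees
    de = H.sum(0)                                  # hyperedge degrees
    Hn = H / np.sqrt(dv)[:, None]
    Theta = Hn @ np.diag(1.0 / de) @ Hn.T          # normalized hypergraph adjacency
    F = np.linalg.solve(np.eye(n) - alpha * Theta, y.astype(float))
    return F                                       # sign(F) is the predicted sentiment

rng = np.random.default_rng(0)
mods = [rng.normal(size=(80, 16)) for _ in range(3)]
y = np.zeros(80)
y[:5], y[5:10] = 1, -1                             # a few labeled tweets
pred = np.sign(hypergraph_scores(mods, y))
```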
{"title":"Multimodal hypergraph learning for microblog sentiment prediction","authors":"Fuhai Chen, Yue Gao, Donglin Cao, R. Ji","doi":"10.1109/ICME.2015.7177477","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177477","url":null,"abstract":"Microblog sentiment analysis has attracted extensive research attention in the recent literature. However, most existing works mainly focus on the textual modality, while ignore the contribution of visual information that contributes ever increasing proportion in expressing user emotions. In this paper, we propose to employ a hypergraph structure to formulate textual, visual and emoticon information jointly for sentiment prediction. The constructed hypergraph captures the similarities of tweets on different modalities where each vertex represents a tweet and the hyperedge is formed by the “centroid” vertex and its k-nearest neighbors on each modality. Then, the transductive inference is conducted to learn the relevance score among tweets for sentiment prediction. In this way, both intra- and inter- modality dependencies are taken into consideration in sentiment prediction. Experiments conducted on over 6,000 microblog tweets demonstrate the superiority of our method by 86.77% accuracy and 7% improvement compared to the state-of-the-art methods.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122016234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning compact binary codes via pairwise correlation reconstruction
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177488
Xiao-Jiao Mao, Yubin Yang, Ning Li
Due to the explosive growth of visual data and the urgent need for more efficient nearest neighbor search methods, hashing has been widely studied in recent years. However, in most available approaches the parameter optimization of the hash function is tightly coupled with the form of the function itself, which makes the optimization difficult and consequently limits the similarity-preserving performance of the hashing. To address this issue, we propose a novel pairwise correlation reconstruction framework for flexibly learning compact binary codes. First, each data point is projected into a metric space and represented as a vector encoding the underlying local and global structure of the input space. The similarities of the data are then measured by the pairwise correlations of the learned vectors, represented as Euclidean distances. Afterwards, in order to preserve the similarities maximally, the optimal binary codes are learned by reconstructing these pairwise correlations. Experimental results on four commonly used benchmark datasets demonstrate that the proposed method achieves the best nearest neighbor search performance compared with state-of-the-art methods.
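As a rough illustration, once the pairwise correlations are available as a similarity matrix, codes whose inner products reconstruct them can be obtained with a simple spectral relaxation followed by sign binarization. The sketch below uses that stand-in optimizer and a Gaussian mapping from Euclidean distances to similarities, both assumptions rather than the paper's method.

```python
import numpy as np
from scipy.spatial.distance import cdist

def pairwise_similarity(V, sigma=1.0):
    """Pairwise correlations of the learned vectors, expressed through Euclidean
    distances and mapped to (0, 1] with a Gaussian kernel (an assumed choice)."""
    return np.exp(-cdist(V, V, "sqeuclidean") / (2 * sigma ** 2))

def binary_codes(S, n_bits=16):
    """Learn codes whose inner products reconstruct the pairwise similarities.
    Spectral relaxation plus sign binarization -- a simple stand-in optimizer."""
    vals, vecs = np.linalg.eigh(S)                         # ascending eigenvalues
    top = vecs[:, -n_bits:] * np.sqrt(np.maximum(vals[-n_bits:], 0.0))
    return np.where(top >= 0, 1, -1).astype(np.int8)

rng = np.random.default_rng(0)
V = rng.normal(size=(200, 32))        # stand-in for the learned metric-space vectors
B = binary_codes(pairwise_similarity(V))
# Hamming distances between rows of B approximate the original neighborhood structure.
```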
{"title":"Learning compact binary codes via pairwise correlation reconstruction","authors":"Xiao-Jiao Mao, Yubin Yang, Ning Li","doi":"10.1109/ICME.2015.7177488","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177488","url":null,"abstract":"Due to the explosive growth of visual data and the raised urgent needs for more efficient nearest neighbor search methods, hashing methods have been widely studied in recent years. However, parameter optimization of the hash function in most available approaches is tightly coupled with the form of the function itself, which makes the optimization difficult and consequently affects the similarity preserving performance of hashing. To address this issue, we propose a novel pairwise correlation reconstruction framework for learning compact binary codes flexibly. Firstly, each data point is projected into a metric space and represented as a vector encoding the underlying local and global structure of the input space. The similarities of the data are then measured by the pairwise correlations of the learned vectors, which are represented as Euclidean distances. Afterwards, in order to preserve the similarities maximally, the optimal binary codes are learned by reconstructing the pairwise correlations. Experimental results are provided and analyzed on four commonly used benchmark datasets to demonstrate that the proposed method achieves the best nearest neighbor search performance comparing with the state-of-the-art methods.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128489495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving image fidelity by luma-assisted chroma subsampling
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177387
J. Korhonen
Chroma subsampling is commonly used in digital representations of images and video sequences. The basic rationale behind chroma subsampling is that the human visual system is less sensitive to color variations than to luma variations. Therefore, chroma data can be coded at a lower resolution than luma data without noticeable loss in perceived image quality. In this paper, we compare different upsampling methods for chroma data and show that advanced upsampling schemes can significantly improve the fidelity of the reconstructed image. We also present an adaptive upsampling method that uses full-resolution luma information to assist chroma upsampling. Experimental results show that, in the presence of compression noise, the proposed technique consistently outperforms advanced non-assisted upsampling.
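A joint-bilateral-style filter is one familiar way to let full-resolution luma guide chroma upsampling; the sketch below uses it purely as an illustrative stand-in for the paper's adaptive scheme, with assumed parameter names and a 4:2:0 layout.

```python
import numpy as np

def luma_guided_upsample(chroma_lr, luma_hr, sigma_s=1.5, sigma_r=0.1, radius=2):
    """Upsample a 4:2:0 chroma plane with a joint-bilateral-style filter guided by
    full-resolution luma (an illustrative stand-in, not the paper's exact method).

    chroma_lr: (H//2, W//2) plane in [0, 1]; luma_hr: (H, W) plane in [0, 1].
    """
    H, W = luma_hr.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ly = min(max(y + dy, 0), H - 1)              # neighbor in luma grid
                    lx = min(max(x + dx, 0), W - 1)
                    cy = min(ly // 2, chroma_lr.shape[0] - 1)    # same neighbor in chroma grid
                    cx = min(lx // 2, chroma_lr.shape[1] - 1)
                    # spatial weight times luma-similarity weight
                    w = (np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2)) *
                         np.exp(-((luma_hr[y, x] - luma_hr[ly, lx]) ** 2) / (2 * sigma_r ** 2)))
                    num += w * chroma_lr[cy, cx]
                    den += w
            out[y, x] = num / den
    return out

rng = np.random.default_rng(0)
chroma_up = luma_guided_upsample(rng.random((8, 8)), rng.random((16, 16)))
```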
{"title":"Improving image fidelity by luma-assisted chroma subsampling","authors":"J. Korhonen","doi":"10.1109/ICME.2015.7177387","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177387","url":null,"abstract":"Chroma subsampling is commonly used for digital representations of images and video sequences. The basic rationale behind chroma subsampling is that the human visual system is less sensitive to color variations than luma variations. Therefore, chroma data can be coded in lower resolution than luma data, without noticeable loss in perceived image quality. In this paper, we compare different upsampling methods for chroma data and show that by using advanced upsampling schemes the fidelity of the reconstructed image can be significantly improved. We also present an adaptive upsampling method that uses full resolution luma information to assist chroma upsampling. Experimental results show that in the presence of compression noise, the proposed technique steadily outperforms advanced non-assisted upsampling.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122371953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint learning for image-based handbag recommendation
Pub Date: 2015-06-01 | DOI: 10.1109/ICME.2015.7177520
Yan Wang, Sheng Li, A. Kot
Fashion recommendation helps shoppers find desirable fashion items, which facilitates online interaction and product promotion. In this paper, we propose a method to recommend handbags to each shopper based on the handbag images the shopper has clicked. This is performed by Joint learning of attribute Projection and One-class SVM classification (JPO) on the images of the shopper's preferred handbags. More specifically, for the handbag images clicked by each shopper, we project the original image feature space into a more compact attribute space. The projection matrix is learned jointly with a one-class SVM to yield a shopper-specific one-class classifier. The results show that the proposed JPO handbag recommendation performs favorably in initial subject testing.
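A per-shopper pipeline in this spirit can be sketched as a compact projection followed by a one-class SVM. The code below is a sequential stand-in (PCA plus scikit-learn's OneClassSVM) rather than the joint optimization the paper describes; all names and parameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

def fit_shopper_model(clicked_feats, n_attrs=8):
    """Per-shopper recommender sketch: project image features into a compact
    attribute-like space, then fit a one-class SVM on the shopper's clicked bags.
    (Sequential PCA + OC-SVM stand-in; the paper learns the two jointly.)"""
    proj = PCA(n_components=n_attrs).fit(clicked_feats)
    clf = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(proj.transform(clicked_feats))
    return proj, clf

def rank_candidates(proj, clf, candidate_feats, top_k=10):
    """Higher decision-function values mean more similar to the shopper's preferences."""
    scores = clf.decision_function(proj.transform(candidate_feats))
    return np.argsort(scores)[::-1][:top_k]

rng = np.random.default_rng(0)
clicked = rng.normal(size=(30, 128))     # image features of bags the shopper clicked
catalog = rng.normal(size=(500, 128))    # features of candidate bags
proj, clf = fit_shopper_model(clicked)
recommended = rank_candidates(proj, clf, catalog)
```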
{"title":"Joint learning for image-based handbag recommendation","authors":"Yan Wang, Sheng Li, A. Kot","doi":"10.1109/ICME.2015.7177520","DOIUrl":"https://doi.org/10.1109/ICME.2015.7177520","url":null,"abstract":"Fashion recommendation helps shoppers to find desirable fashion items, which facilitates online interaction and product promotion. In this paper, we propose a method to recommend handbags to each shopper, based on the handbag images the shopper has clicked. This is performed by Joint learning of attribute Projection and One-class SVM classification (JPO) based on the images of the shopper's preferred handbags. More specifically, for the handbag images clicked by each shopper, we project the original image feature space into an attribute space which is more compact. The projection matrix is learned jointly with a one-class SVM to yield a shopper-specific one-class classifier. The results show that the proposed JPO handbag recommendation performs favorably based on initial subject testing.","PeriodicalId":146271,"journal":{"name":"2015 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130707449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}