Cross-modal learning for saliency prediction in mobile environment
Dakai Ren, X. Wen, Xiao-Yang Liu, Shuai Huang, Jiazhong Chen
Proceedings of the 2nd ACM International Conference on Multimedia in Asia, 2021-03-07
DOI: 10.1145/3444685.3446304
Existing research shows that viewing conditions significantly affect visual perception when media are viewed on mobile screens. This raises two issues for visual saliency that we need to address: how existing saliency models perform under mobile conditions, and how to account for mobile conditions when designing a saliency model. To investigate the performance of saliency models in a mobile environment, we collect eye fixations under four typical mobile viewing conditions as mobile ground truth. To account for mobile conditions in model design, we treat viewing factors and visual stimuli as two modalities and propose a cross-modal deep learning architecture for visual attention prediction. Experimental results demonstrate that a model incorporating mobile viewing factors generally outperforms models without such consideration.
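The abstract describes fusing a viewing-factor modality with a visual-stimulus modality in one network. As a minimal illustration of that idea only (not the authors' actual architecture), the sketch below encodes each modality with its own small layer, fuses the encodings by concatenation, and maps the joint representation to a saliency score. All dimensions, the specific viewing factors, and the plain-Python layers are assumptions for the sake of the example; a real model would use a CNN backbone for the image branch.

```python
import math
import random

random.seed(0)

def linear(x, w, b):
    # Dense layer: y_j = sum_i x_i * w[i][j] + b[j]
    return [sum(xi * wij for xi, wij in zip(x, col)) + bj
            for col, bj in zip(zip(*w), b)]

def relu(x):
    return [max(0.0, v) for v in x]

def init(n_in, n_out):
    # Random Gaussian weights, zero biases (illustrative initialization).
    w = [[random.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_out)]
         for _ in range(n_in)]
    return w, [0.0] * n_out

# Hypothetical modalities: a 4-d viewing-factor vector (e.g. screen size,
# ambient luminance, viewing distance, motion state, each normalized to a
# comparable scale) and an 8-d stub for the image features a CNN backbone
# would normally produce from the visual stimulus.
viewing_factors = [0.55, 0.30, 0.35, 1.0]
image_features = [random.random() for _ in range(8)]

# Encode each modality separately, then fuse by concatenation (late fusion).
w_v, b_v = init(4, 8)
w_i, b_i = init(8, 8)
enc_v = relu(linear(viewing_factors, w_v, b_v))
enc_i = relu(linear(image_features, w_i, b_i))
fused = enc_v + enc_i  # 16-d joint cross-modal representation

# Map the fused features to a saliency score in (0, 1) via a sigmoid.
w_o, b_o = init(16, 1)
saliency = 1.0 / (1.0 + math.exp(-linear(fused, w_o, b_o)[0]))
print(len(fused), 0.0 < saliency < 1.0)
```

In a full model this per-location score would be produced densely over the image to form a saliency map, and the branches would be trained jointly on the mobile eye-fixation ground truth described above.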