{"title":"预训练CNN模型在视觉显著性预测中的性能评价","authors":"Bashir Ghariba, M. Shehata, Peter F. McGuire","doi":"10.1109/CCECE47787.2020.9255692","DOIUrl":null,"url":null,"abstract":"Human Visual System (HVS) has the ability to focus on specific parts of the scene, rather than the whole scene. This phenomenon is one of the most active research topics in the computer vision and neuroscience fields. Recently, deep learning models have been used for visual saliency prediction. In this paper, we investigate the performance of five state-of-the-art deep neural networks (VGG-16, ResNet-50, Xception, InceptionResNet-v2, and MobileNet-v2) for the task of visual saliency prediction. In this paper, we train five deep learning models over the SALICON dataset and then use the trained models to predict visual saliency maps using four standard datasets, namely: TORONTO, MIT300, MIT1003, and DUT-OMRON. The results indicate that the ResNet-50 model outperforms the other four and provides a visual saliency map that is very close to human performance.","PeriodicalId":296506,"journal":{"name":"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Performance Evaluation of Pre-Trained CNN Models for Visual Saliency Prediction\",\"authors\":\"Bashir Ghariba, M. Shehata, Peter F. McGuire\",\"doi\":\"10.1109/CCECE47787.2020.9255692\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human Visual System (HVS) has the ability to focus on specific parts of the scene, rather than the whole scene. This phenomenon is one of the most active research topics in the computer vision and neuroscience fields. Recently, deep learning models have been used for visual saliency prediction. In this paper, we investigate the performance of five state-of-the-art deep neural networks (VGG-16, ResNet-50, Xception, InceptionResNet-v2, and MobileNet-v2) for the task of visual saliency prediction. In this paper, we train five deep learning models over the SALICON dataset and then use the trained models to predict visual saliency maps using four standard datasets, namely: TORONTO, MIT300, MIT1003, and DUT-OMRON. 
The results indicate that the ResNet-50 model outperforms the other four and provides a visual saliency map that is very close to human performance.\",\"PeriodicalId\":296506,\"journal\":{\"name\":\"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCECE47787.2020.9255692\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCECE47787.2020.9255692","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Performance Evaluation of Pre-Trained CNN Models for Visual Saliency Prediction
The Human Visual System (HVS) has the ability to focus on specific parts of a scene rather than the whole scene. This phenomenon is one of the most active research topics in the computer vision and neuroscience fields. Recently, deep learning models have been used for visual saliency prediction. In this paper, we investigate the performance of five state-of-the-art deep neural networks (VGG-16, ResNet-50, Xception, InceptionResNet-v2, and MobileNet-v2) on the task of visual saliency prediction. We train the five models on the SALICON dataset and then use the trained models to predict visual saliency maps on four standard benchmark datasets: TORONTO, MIT300, MIT1003, and DUT-OMRON. The results indicate that the ResNet-50 model outperforms the other four and produces saliency maps that are very close to human performance.
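To make the general workflow concrete, the following is a minimal sketch of how an ImageNet-pretrained backbone such as ResNet-50 can be repurposed for saliency prediction in Keras/TensorFlow. The decoder head (a 1x1 convolution followed by bilinear upsampling) and the training call are illustrative assumptions, not the authors' exact architecture or settings.

# Minimal sketch: a pre-trained ResNet-50 backbone adapted for saliency
# prediction. The decoder head and training setup are illustrative only;
# the paper does not specify this exact architecture.
import tensorflow as tf

def build_saliency_model(input_shape=(224, 224, 3)):
    # ImageNet-pretrained encoder, without the classification head
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", input_shape=input_shape)
    x = backbone.output                                         # 7x7x2048 feature map
    x = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)   # per-location saliency score
    # Upsample back to the input resolution to obtain a dense saliency map
    x = tf.keras.layers.UpSampling2D(size=32, interpolation="bilinear")(x)
    return tf.keras.Model(backbone.input, x)

model = build_saliency_model()
# Training on SALICON would pair images with ground-truth fixation maps, e.g.:
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(salicon_images, salicon_fixation_maps, epochs=10, batch_size=8)

The same recipe applies to the other backbones (VGG-16, Xception, InceptionResNet-v2, MobileNet-v2) by swapping the encoder; evaluation on TORONTO, MIT300, MIT1003, and DUT-OMRON then compares the predicted maps against recorded human fixations.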