Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning

2020 20th International Conference on Control, Automation and Systems (ICCAS), Pub Date: 2020-10-13, DOI: 10.23919/ICCAS50221.2020.9268201

D. Shim, H. Kim

Pages: 155-160

Citations: 3

Abstract

Previous studies on image classification have mainly focused on network performance rather than on real-time operation or model compression. We propose the Gaussian Deep Recurrent visual Attention Model (GDRAM), a lightweight deep neural network based on reinforcement learning for large-scale image classification that outperforms a conventional CNN (Convolutional Neural Network) that takes the entire image as input. Inspired by the biological visual recognition process, our model mimics the stochastic location of the retina using a Gaussian distribution. We evaluate the model on the Large Cluttered MNIST, Large CIFAR-10, and Large CIFAR-100 datasets, in which images are resized to 128 in both width and height. A PyTorch implementation of Gaussian RAM and its pretrained model are available at https://github.com/dsshim0125/gaussian-ram
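The idea of a stochastic glimpse location can be illustrated with a minimal sketch. It is not the authors' implementation: the glimpse size, the normalized [-1, 1] coordinate convention, the clamping step, and every function name below are assumptions made for illustration, and the recurrent network and reinforcement learning training are left out. The sketch only shows how a location drawn from a Gaussian distribution could select a small patch of a 128 x 128 image instead of the whole image.

```python
import random

IMAGE_SIZE = 128   # images are resized to 128 x 128 in the abstract
GLIMPSE_SIZE = 16  # hypothetical patch size, not taken from the paper


def sample_glimpse_center(mean, std, rng):
    """Draw a glimpse center from an isotropic Gaussian in [-1, 1] coordinates."""
    x = rng.gauss(mean[0], std)
    y = rng.gauss(mean[1], std)
    # Clamp so the sampled location stays inside the normalized image frame.
    return (max(-1.0, min(1.0, x)), max(-1.0, min(1.0, y)))


def to_pixel_box(center, image_size=IMAGE_SIZE, glimpse_size=GLIMPSE_SIZE):
    """Convert a normalized center into a (left, top, right, bottom) pixel box."""
    half = glimpse_size // 2
    # Map [-1, 1] onto pixel indices [0, image_size - 1].
    cx = int(round((center[0] + 1.0) / 2.0 * (image_size - 1)))
    cy = int(round((center[1] + 1.0) / 2.0 * (image_size - 1)))
    # Shift the box so it lies fully inside the image.
    left = min(max(cx - half, 0), image_size - glimpse_size)
    top = min(max(cy - half, 0), image_size - glimpse_size)
    return (left, top, left + glimpse_size, top + glimpse_size)


def extract_glimpse(image, box):
    """Crop a patch from an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


rng = random.Random(0)
# Synthetic grayscale image used only to exercise the functions.
image = [[(r * IMAGE_SIZE + c) % 256 for c in range(IMAGE_SIZE)]
         for r in range(IMAGE_SIZE)]
center = sample_glimpse_center(mean=(0.0, 0.0), std=0.3, rng=rng)
patch = extract_glimpse(image, to_pixel_box(center))
print(len(patch), len(patch[0]))  # 16 16
```

In this sketch each glimpse covers 16 x 16 = 256 pixels out of 128 x 128 = 16384, which suggests why processing glimpses rather than the full image can reduce computation. In a full model the Gaussian mean would presumably come from a learned network rather than a fixed value.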
