Title: Hybrid CNNs: A Rotation Equivariant Framework for High Resolution Spherical Images
Authors: Wei Yu, Daren Zha, Nan Mu, Tianshu Fu
Published in: Proceedings of the 2019 3rd International Conference on Digital Signal Processing
Publication date: 2019-02-24
DOI: 10.1145/3316551.3316573
URL: https://doi.org/10.1145/3316551.3316573
Citation count: 0
Abstract
With the prevalence of virtual reality, augmented reality, and autonomous robots, the high-resolution spherical images they produce make it non-trivial to apply standard convolutional neural networks (CNNs), which have proven powerful on perspective images. The classic way to use CNNs on spherical images is to project the spherical images onto a plane and learn from the planar images with conventional CNNs. However, the distortion introduced by projecting spherical images onto a plane undermines projection-based models. Moreover, these models are not robust to rotations, which are the basic transformations of spherical images. Another type of solution, based on spherical harmonics and recently proposed by Cohen et al. [1], is rotation equivariant but cannot handle high-resolution spherical images because of its high computational cost. To process high-resolution spherical images, we propose Hybrid CNNs. Our framework is both computationally efficient and rotation equivariant, using two kinds of convolution operations defined in this paper. We compare our method with several baseline models on two classification tasks. The experimental results demonstrate the computational efficiency and rotation equivariance of Hybrid CNNs.
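To make the notion of rotation equivariance concrete, the following is a minimal sketch and not the convolution operations defined in the paper. It assumes a spherical image stored on a latitude-longitude grid, where a rotation about the polar axis is a circular shift of the columns, and it checks that a circular convolution along longitude gives the same result whether the rotation is applied before or after filtering. The function names, the grid size, and the kernel are hypothetical, chosen only for illustration; rotations about other axes are not covered by this simple case.

```python
def roll(row, shift):
    """Circularly shift a list to the right by `shift` positions."""
    n = len(row)
    return [row[(i - shift) % n] for i in range(n)]


def rotate_about_polar_axis(image, shift):
    """Rotate a latitude-longitude grid about the polar axis by `shift` columns."""
    return [roll(row, shift) for row in image]


def circular_conv_longitude(image, kernel):
    """Circularly convolve every latitude row along the longitude axis."""
    half = len(kernel) // 2
    result = []
    for row in image:
        n = len(row)
        # Indices wrap modulo n because longitude 360 degrees meets 0 degrees.
        result.append([
            sum(w * row[(j - (k - half)) % n] for k, w in enumerate(kernel))
            for j in range(n)
        ])
    return result


# Hypothetical 8 x 16 integer-valued grid and a small smoothing kernel.
image = [[(3 * r + 7 * c) % 11 for c in range(16)] for r in range(8)]
kernel = [1, 2, 1]

rotate_then_filter = circular_conv_longitude(rotate_about_polar_axis(image, 5), kernel)
filter_then_rotate = rotate_about_polar_axis(circular_conv_longitude(image, kernel), 5)

# Equivariance: filtering a rotated image equals rotating the filtered image.
equivariant = rotate_then_filter == filter_then_rotate
print(equivariant)
```

The two orders agree here only because the filter wraps around in longitude and the rotation is about the polar axis. A general 3D rotation moves pixels across latitudes, where a planar filter on the projected image no longer commutes with the rotation, which is the limitation of projection-based models that the abstract points to.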