360° videos have become widespread with the development of virtual reality technology, creating a demand to determine the most visually attractive objects in them, i.e., 360° video saliency prediction (VSP). While generative models, e.g., variational autoencoders and autoregressive models, have proved effective in handling spatio-temporal data, applying them to 360° VSP remains challenging due to severe distortion and feature-alignment inconsistency. In this study, we propose a novel spatio-temporal consistency generative network for 360° VSP. A dual-stream encoder-decoder architecture processes the forward and backward frame sequences of 360° videos simultaneously. Moreover, a deep autoregressive module, termed axial-attention-based spherical ConvLSTM, is designed in the encoder to memorize features with global-range spatial and temporal dependencies. Finally, motivated by the bias phenomenon in human viewing behavior, a temporal-convolutional Gaussian prior module is introduced to further improve saliency-prediction accuracy. Extensive experiments comparing our model with state-of-the-art competitors demonstrate that it achieves the best performance on the PVS-HM and VR-Eyetracking databases.
"Predicting 360° Video Saliency: A ConvLSTM Encoder-Decoder Network With Spatio-Temporal Consistency"
Zhaolin Wan; Han Qin; Ruiqin Xiong; Zhiyang Li; Xiaopeng Fan; Debin Zhao
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 2, pp. 311-322.
Pub Date: 2024-03-18 | DOI: 10.1109/JETCAS.2024.3377096
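The recurrent core the abstract builds on is the ConvLSTM, which replaces the matrix multiplications of a standard LSTM with convolutions so that the memory preserves spatial structure. Below is a minimal NumPy sketch of one ConvLSTM cell rolled over a short frame sequence; it is illustrative only — the paper's axial-attention spherical ConvLSTM adds attention and spherical convolutions on top of this basic recurrence, and all shapes, kernel sizes, and names here are assumptions for illustration.

```python
import numpy as np

def conv2d_same(x, k):
    """'Same'-padded 2D cross-correlation of an (H, W) map with a (kh, kw) kernel."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, Wx, Wh, b):
    """One ConvLSTM step. x, h, c are (H, W) maps; Wx, Wh hold one kernel
    per gate (input, forget, output, candidate); b holds one bias per gate."""
    gates = [conv2d_same(x, Wx[g]) + conv2d_same(h, Wh[g]) + b[g]
             for g in range(4)]
    i, f, o = sigmoid(gates[0]), sigmoid(gates[1]), sigmoid(gates[2])
    g = np.tanh(gates[3])
    c_new = f * c + i * g          # spatial memory update
    h_new = o * np.tanh(c_new)     # hidden state passed on to the next step
    return h_new, c_new

rng = np.random.default_rng(0)
H, W = 8, 16                       # tiny equirectangular-like grid (illustrative)
h, c = np.zeros((H, W)), np.zeros((H, W))
Wx = rng.normal(0, 0.1, (4, 3, 3))
Wh = rng.normal(0, 0.1, (4, 3, 3))
b = np.zeros(4)
for t in range(3):                 # roll the cell over a short frame sequence
    x = rng.normal(size=(H, W))
    h, c = convlstm_step(x, h, c, Wx, Wh, b)
print(h.shape)  # (8, 16)
```

Because the gates are convolutions, every memory cell aggregates only a local neighborhood per step; that locality is what the paper's axial attention is meant to overcome, extending the dependencies to global range along each spherical axis.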
Pub Date: 2024-03-13 | DOI: 10.1109/JETCAS.2024.3364895
"IEEE Circuits and Systems Society"
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 1, pp. C3-C3.
Pub Date: 2024-03-13 | DOI: 10.1109/JETCAS.2024.3364895
Pub Date: 2024-03-13 | DOI: 10.1109/JETCAS.2023.3335798
Wen-Hsiao Peng
The IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS) is a periodical sponsored by the IEEE Circuits and Systems Society (CASS). Since its advent about a decade ago, JETCAS has published quarterly special issues on emerging and selected topics covering the entire field of interest of the CASS. Particular emphasis has been placed on emerging areas that are expected to grow over time in scientific and professional importance. For example, the special issues published in the last two years touched upon Industry X.0 applications, unconventional computing techniques, memristive circuits and systems, quantum computation, processing-in-memory machine learning, and highly-renewable-penetrated power systems. Some of these special issues have become valuable references for many forefront technology developments within and beyond the CASS. Thanks to the strong leadership of Prof. Ho Ching (Herbert) Iu, the outgoing Editor-in-Chief, and the remarkable work of his editorial board, JETCAS is now one of the leading journals in the CASS, with an impact factor between 4.6 and 5.8 from 2022 to 2023. Its LinkedIn profile page ( https://bit.ly/3FLIBFs