High-Speed Coding Unit Depth Identifications Using CU-VGG Deep Learning Architectures
Hari Pattimi, B. K. N. Srinivasarao
Arabian Journal for Science and Engineering, vol. 49, no. 12, pp. 16287-16298. Published 2024-04-01. DOI: 10.1007/s13369-024-08928-4
https://link.springer.com/article/10.1007/s13369-024-08928-4
Quadtree partitioning accounts for much of the complexity of High Efficiency Video Coding (HEVC/H.265): it recursively divides coding tree units (CTUs) into coding units (CUs), and determining the CU partition depth through rate-distortion optimisation is computationally expensive. This article proposes a system based on a deep learning architecture that determines the CU partition depth in less time for HEVC intra-prediction. The system reduces computational complexity by removing the rate-distortion optimisation. It comprises two main blocks: a pre-processing block and a deep learning block. During pre-processing, the spatial resolution of the input data is drastically reduced, which lets the neural network adapt quickly to the input sample and extract more meaningful features. The paper proposes two distinct deep learning architectures, CU-VGG16 and CU-VGG19. Pre-processed coding units (16 \(\times\) 16) are the input to each architecture, and the corresponding CU depths (0, 1, 2, 3) are the output. To compare the depth-prediction accuracy of the two models, the authors created a database of content at varying resolutions. Performance was evaluated by replacing the CU partition block of traditional HEVC with the proposed systems and comparing bit rate and encoding time against traditional HEVC. The results show that the CU-VGG16 and CU-VGG19 designs speed up coding unit partitioning by 87.15% and 87.70%, respectively, compared with standard HEVC.
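The relationship between partition depth, CU size, and the 16 \(\times\) 16 network input can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 64 \(\times\) 64 CTU size, the block-averaging downsampler, and the names `cu_size_for_depth`, `downsample_to_16`, and `predicted_depth` are all assumptions introduced here for illustration; the abstract does not say how the pre-processing reduces resolution.

```python
# Illustrative sketch only; names and pre-processing method are assumptions.

CTU_SIZE = 64  # assumption: the commonly used 64x64 HEVC CTU size
TARGET = 16    # network input side length, as stated in the abstract


def cu_size_for_depth(depth, ctu_size=CTU_SIZE):
    """Each quadtree split halves the CU side: depth 0 -> 64, 1 -> 32, 2 -> 16, 3 -> 8."""
    if depth not in (0, 1, 2, 3):
        raise ValueError("depth must be 0, 1, 2 or 3")
    return ctu_size >> depth


def downsample_to_16(block):
    """Reduce a square block (list of rows) to 16x16 by averaging equal tiles.

    Block averaging is an assumed stand-in for the paper's pre-processing step.
    """
    n = len(block)
    if n == 0 or n % TARGET != 0 or any(len(row) != n for row in block):
        raise ValueError("block must be square with a side that is a multiple of 16")
    k = n // TARGET  # side length of each averaged tile
    out = []
    for r in range(TARGET):
        row = []
        for c in range(TARGET):
            total = sum(block[r * k + i][c * k + j]
                        for i in range(k) for j in range(k))
            row.append(total / (k * k))
        out.append(row)
    return out


def predicted_depth(scores):
    """Pick the depth class (0..3) with the highest classifier score."""
    if len(scores) != 4:
        raise ValueError("expected one score per depth class 0..3")
    return max(range(4), key=lambda d: scores[d])
```

In an encoder, a classifier's four output scores would replace the exhaustive rate-distortion search over depths. Here `predicted_depth` only shows the final argmax step, not the CU-VGG networks themselves.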
About the journal:
King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE).
AJSE, which has been published by KFUPM since 1975, is a recognized national, regional and international journal that provides a great opportunity for the dissemination of research advances from the Kingdom of Saudi Arabia, MENA and the world.