Javier Usón, J. Cabrera, Daniel Corregidor, Narciso García
2022 IEEE 12th International Conference on Consumer Electronics (ICCE-Berlin), published 2022-09-02. DOI: 10.1109/ICCE-Berlin56473.2022.9937087
Analysing Foreground Segmentation in Deep Learning Based Depth Estimation on Free-Viewpoint Video Systems
Volumetric video acquisition systems enable realistic virtual experiences such as Free-Viewpoint Video (FVV). Stereo matching is a well-known way of obtaining this volumetric information as depth images by calculating the disparity between two stereo color images. In these applications, the background of the captured scene is static, so foreground information is much more valuable. We propose adding foreground segmentation to help learning-based algorithms, such as deep learning models, improve on previously obtained results. We used the Detectron2 framework to model foreground segmentation by detecting people. Additionally, we built a large stereo dataset focused on FVV systems. Finally, we modified a successful state-of-the-art deep learning model, CREStereo, to add foreground segmentation and trained it with supervision to estimate disparity, obtaining promising results.
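To make the relation between disparity, depth, and foreground emphasis concrete, the following is a minimal sketch and not the method of the paper. It assumes a rectified stereo pair, for which depth follows from disparity as Z = f * B / d, and it illustrates one possible way to make foreground pixels count more when scoring a disparity estimate. The function names, the `fg_weight` parameter, and the weighting scheme itself are hypothetical; the abstract does not state how the segmentation is incorporated into CREStereo.

```python
import math


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (metres): Z = f * B / d.

    Assumes a rectified stereo pair. Pixels with non-positive disparity
    have no finite depth and are mapped to infinity.
    """
    return [
        [focal_px * baseline_m / d if d > 0 else math.inf for d in row]
        for row in disparity
    ]


def foreground_weighted_l1(pred, target, fg_mask, fg_weight=4.0):
    """Weighted mean absolute disparity error.

    Foreground pixels (truthy in fg_mask) get weight fg_weight and
    background pixels get weight 1. This weighting is an illustrative
    assumption, not the loss used by the authors.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, m_row in zip(pred, target, fg_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            w = fg_weight if m else 1.0
            total += w * abs(p - t)
            weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("empty input: no pixels to evaluate")
    return total / weight_sum
```

In practice the mask would come from a person detector such as Detectron2, and the same idea would be applied to tensors inside a training loop rather than to nested lists.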