Voxel and deep learning based depth complementation for transparent objects
Jiaqi Li, Shuhuan Wen, Di Lu, Linxiang Li, Hong Zhang
Pattern Recognition Letters, Volume 193, Pages 14-20. Published 2025-07-01 (Epub 2025-04-15). DOI: 10.1016/j.patrec.2025.04.003
https://www.sciencedirect.com/science/article/pii/S0167865525001369
Citation count: 0
Abstract
To address the missing depth values of transparent objects in the depth channel captured by an RGB-D camera, a voxel-based deep learning depth-completion algorithm for transparent objects is proposed. The image is mapped to 3D voxel space, the valid point cloud is computed from the input depth map, and the occupied voxels are obtained by a boundary test. Using the camera ray directions, the occupied voxels are then filtered to keep only those that a camera ray intersects. From the image features of the RGB image and the valid points in the intersecting voxels, computed from the point cloud, a multi-layer perceptron predicts the missing depth values of the object, and the depth is then optimized under a surface normal consistency constraint. The proposed algorithm achieves improvements of 12.55%, 0.6%, and 1.63% over ClearGrasp in the metrics $\delta_{1.05}$, $\delta_{1.10}$, and $\delta_{1.25}$, respectively.
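The first steps of the pipeline and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the convention that a depth of zero marks a missing reading, and the simple grid binning used in place of the paper's boundary test are all assumptions, and every function name is hypothetical. The accuracy function uses the threshold definition common in depth-completion work (the fraction of valid pixels with max(pred/gt, gt/pred) below the threshold), which is assumed to be what the δ metrics denote here.

```python
import math


def backproject(depth, fx, fy, cx, cy):
    """Back-project valid depth pixels to camera-frame 3D points (pinhole model).

    Assumption: a depth of 0 marks a missing reading, as is typical for
    transparent surfaces in RGB-D data.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def occupied_voxels(points, voxel_size):
    """Return the set of integer voxel indices that contain at least one point.

    Assumption: plain grid binning stands in for the paper's boundary test,
    whose details the abstract does not give.
    """
    return {tuple(math.floor(c / voxel_size) for c in p) for p in points}


def delta_accuracy(pred, gt, threshold):
    """Fraction of pixels with valid ground truth where
    max(pred/gt, gt/pred) < threshold. Non-positive predictions count as misses."""
    hits = 0
    total = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g <= 0:
                continue  # no ground truth at this pixel
            total += 1
            if p > 0 and max(p / g, g / p) < threshold:
                hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    depth = [[0.0, 1.0], [2.0, 0.0]]  # toy 2x2 depth map with two holes
    pts = backproject(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
    print(occupied_voxels(pts, voxel_size=0.5))
    pred = [[1.0, 1.2], [2.0, 0.0]]
    gt = [[1.0, 1.0], [2.0, 0.0]]
    for t in (1.05, 1.10, 1.25):
        print(t, delta_accuracy(pred, gt, t))
```

The ray-voxel intersection filter, the multi-layer perceptron, and the surface normal optimization are left out because the abstract does not specify them in enough detail to sketch faithfully.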
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.