ThermalGaussian: Thermal 3D Gaussian Splatting
Rongfeng Lu, Hangyu Chen, Zunjie Zhu, Yuhang Qin, Ming Lu, Le Zhang, Chenggang Yan, Anke Xue
arXiv - CS - Computer Vision and Pattern Recognition, 2024-09-11. DOI: arxiv-2409.07200 (https://doi.org/arxiv-2409.07200)

Thermography is especially valuable for the military and other users of
surveillance cameras. Several recent methods based on Neural Radiance Fields
(NeRF) have been proposed to reconstruct thermal scenes in 3D from a set of
thermal and RGB images. However, 3D Gaussian splatting (3DGS) has become the
prevailing alternative to NeRF because of its rapid training and real-time
rendering. In this work, we
propose ThermalGaussian, the first thermal 3DGS approach capable of rendering
high-quality images in RGB and thermal modalities. We first calibrate the RGB
camera and the thermal camera to ensure that both modalities are accurately
aligned. Subsequently, we use the registered images to learn the multimodal 3D
Gaussians. To prevent overfitting to any single modality, we introduce
several multimodal regularization constraints. We also develop smoothing
constraints tailored to the physical characteristics of the thermal modality.
In addition, we contribute a real-world dataset named RGBT-Scenes, captured with a
handheld thermal-infrared camera, to facilitate future research on thermal
scene reconstruction. We conduct comprehensive experiments to show that
ThermalGaussian achieves photorealistic rendering of thermal images and
improves the rendering quality of RGB images. With the proposed multimodal
regularization constraints, we also reduce the model's storage cost by 90%.
The code and dataset will be released.
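
The abstract does not give the form of the multimodal regularization or the thermal smoothing constraints, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a weighted sum of per-modality L1 reconstruction errors plus a total-variation penalty on the rendered thermal image (a generic smoothness prior swapped in for illustration). The function names, the weights `w_thermal` and `w_smooth`, and the use of single-channel images for both modalities are all hypothetical simplifications.

```python
from typing import List

# Assumption: each image is single-channel, stored as rows of pixel values.
Image = List[List[float]]


def l1_loss(rendered: Image, target: Image) -> float:
    """Mean absolute error between two images of the same size."""
    total = 0.0
    count = 0
    for row_r, row_t in zip(rendered, target):
        for r, t in zip(row_r, row_t):
            total += abs(r - t)
            count += 1
    return total / count


def total_variation(img: Image) -> float:
    """Mean absolute difference between horizontally and vertically adjacent pixels."""
    h, w = len(img), len(img[0])
    total = 0.0
    count = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # right neighbour
                total += abs(img[y][x + 1] - img[y][x])
                count += 1
            if y + 1 < h:  # lower neighbour
                total += abs(img[y + 1][x] - img[y][x])
                count += 1
    return total / count if count else 0.0


def multimodal_loss(
    rgb_rendered: Image,
    rgb_target: Image,
    thermal_rendered: Image,
    thermal_target: Image,
    w_thermal: float = 0.5,   # hypothetical balance between the two modalities
    w_smooth: float = 0.1,    # hypothetical weight of the thermal smoothness prior
) -> float:
    """Weighted two-modality reconstruction loss with a thermal smoothness term."""
    rgb_term = l1_loss(rgb_rendered, rgb_target)
    thermal_term = l1_loss(thermal_rendered, thermal_target)
    smooth_term = total_variation(thermal_rendered)
    return (1.0 - w_thermal) * rgb_term + w_thermal * thermal_term + w_smooth * smooth_term
```

Balancing the two reconstruction terms keeps either modality from dominating the shared Gaussians, while the smoothness term reflects the expectation that temperature tends to vary gradually across a surface. In a real 3DGS pipeline these terms would be computed on rendered tensors inside the training loop rather than on nested lists.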