SALENet: Structure-Aware Lighting Estimations From a Single Image for Indoor Environments

IEEE transactions on image processing : a publication of the IEEE Signal Processing Society | IF: 13.7 | Pub Date: 2024-12-11 | DOI: 10.1109/TIP.2024.3512381

Junhong Zhao; Bing Xue; Mengjie Zhang

Volume 33, pages 6806-6820. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10794602/

Citations: 0

Abstract

High Dynamic Range (HDR) lighting plays a pivotal role in modern augmented and mixed-reality (AR/MR) applications, enabling immersive experiences through realistic object insertion and dynamic relighting. However, acquiring precise HDR environment maps remains cost-prohibitive and impractical with standard devices. To bridge this gap, this paper introduces SALENet, a novel deep network that estimates global lighting conditions from a single image, reducing the need for resource-intensive acquisition methods. In contrast to earlier studies, we focus on exploring the inherent structural relationships within the lighting distribution. We design a hierarchical transformer-based neural network architecture with a proposed cross-attention mechanism between lighting source representations at different resolutions, optimizing the spatial distribution of lighting sources simultaneously for enhanced consistency. To further improve accuracy, we propose a structure-based contrastive learning method that selects positive-negative pairs based on lighting distribution similarity. By combining hierarchical transformers with structure-based contrastive learning, our framework significantly improves lighting prediction accuracy, enabling high-fidelity AR/MR applications to achieve immersive and realistic lighting effects cost-effectively.
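The abstract does not say how positive-negative pairs are chosen beyond "lighting distribution similarity". The following is a minimal sketch of that idea, not the authors' implementation. It assumes each training sample has a coarse lighting map flattened to a list of intensities, uses cosine similarity with a fixed threshold as the similarity test, and scores embeddings with an InfoNCE-style loss. The names `select_pairs`, `contrastive_loss`, `pos_threshold`, and `temperature` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_pairs(anchor_idx, light_maps, pos_threshold=0.9):
    """Split all other samples into positives and negatives for one anchor.

    Assumption: a sample counts as a positive when its lighting map is at
    least `pos_threshold` similar to the anchor's; the paper may use a
    different similarity measure or selection rule.
    """
    positives, negatives = [], []
    for j, light_map in enumerate(light_maps):
        if j == anchor_idx:
            continue
        sim = cosine_similarity(light_maps[anchor_idx], light_map)
        if sim >= pos_threshold:
            positives.append(j)
        else:
            negatives.append(j)
    return positives, negatives


def contrastive_loss(anchor_idx, embeddings, positives, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor, averaged over its positives."""
    if not positives:
        return 0.0
    anchor = embeddings[anchor_idx]
    neg_sum = sum(
        math.exp(cosine_similarity(anchor, embeddings[n]) / temperature)
        for n in negatives
    )
    total = 0.0
    for p in positives:
        pos_term = math.exp(cosine_similarity(anchor, embeddings[p]) / temperature)
        total += -math.log(pos_term / (pos_term + neg_sum))
    return total / len(positives)


# Toy lighting maps: samples 0 and 1 are lit from the same side, sample 2 is not.
light_maps = [
    [1.0, 0.0, 0.0, 0.2],
    [0.9, 0.1, 0.0, 0.2],
    [0.0, 0.0, 1.0, 0.1],
]
positives, negatives = select_pairs(0, light_maps)

# Embeddings that agree with the lighting structure should give a lower loss.
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
loss_aligned = contrastive_loss(0, aligned, positives, negatives)
loss_misaligned = contrastive_loss(0, misaligned, positives, negatives)
```

In this toy setup, embeddings that place samples with similar lighting close together receive a lower loss than embeddings that do not, which is the behaviour a structure-based pair selection is meant to encourage.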




Source journal

IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

Self-citation rate

0.00%

Publication count