Mobile-PBR: A 28-nm Energy-Efficient Rendering Processor for Photorealistic Augmented Reality With Inverse Rendering and Background Clustering

IF 5.6 | Zone 1, Engineering & Technology | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Journal of Solid-State Circuits | Pub Date: 2024-11-05 | DOI: 10.1109/JSSC.2024.3484212
Shiyu Guo;Yuhao Ju;Xi Chen;Sachin S. Sapatnekar;Jie Gu
IEEE Journal of Solid-State Circuits, vol. 60, no. 1, pp. 125-135. Publication type: Journal Article. URL: https://ieeexplore.ieee.org/document/10744559/
Citations: 0

Abstract

This work presents a low-power physical-based ray-tracing (PBRT) rendering processor for photorealistic augmented reality (AR) rendering applications on mobile devices, referred to as mobile physical-based renderer (Mobile-PBR). By introducing inverse rendering (IR) and background clustering, Mobile-PBR enables complicated photorealistic lighting effects such as reflection, refraction, and shadow with minimum resources on mobile edge devices. The key features of this work include: 1) an ASIC rendering processor that embeds an end-to-end ray-tracing (RT) solution with IR for AR on mobile devices; 2) a reconfigurable mixed-precision processing element (PE) design supporting diverse computing tasks for both IR and RT modes; 3) background clustered field of view (FOV)-focused 3-D construction reducing conventional background scene complexity from $O(n \log n)$ to $O(1)$; 4) scalable partitioning scheme for complex 3-D objects with an average of $13\times$ speed up on test scenes; and 5) use of global RT scheduler (GRTS) and global memory access controller (GMAC) to overcome the challenges of irregular memory access pattern and varied PE runtime with overall $684\times$ speed up compared with the baseline design. A 28-nm test chip was fabricated demonstrating 500- and 1418-frames/s/W power efficiency in IR and RT modes, respectively, achieving $28.8\times$ and $3.95\times$ higher RT rendering efficiency compared with existing ASIC solutions, and having an average performance of 25.8 frames/s on various testing scenes, enabling real-time physical-based RT rendering on mobile edge devices.
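The efficiency figures in the abstract can be read as energy per frame, since frames/s/W is dimensionally the same as frames per joule. The short sketch below only performs that unit conversion on the two reported numbers; the function and variable names are illustrative and do not come from the paper, and no operating power or frame rate is inferred beyond what the abstract states.

```python
def energy_per_frame_mj(frames_per_s_per_w: float) -> float:
    """Convert an efficiency in frames/s/W to energy per frame in millijoules."""
    if frames_per_s_per_w <= 0:
        raise ValueError("efficiency must be positive")
    # frames/s/W equals frames/J, so energy per frame is the reciprocal (J),
    # scaled by 1000 to express it in mJ.
    return 1000.0 / frames_per_s_per_w


# Efficiency values as reported in the abstract (frames/s/W).
REPORTED_EFFICIENCY = {"IR": 500.0, "RT": 1418.0}

energy_mj = {
    mode: energy_per_frame_mj(eff) for mode, eff in REPORTED_EFFICIENCY.items()
}

for mode, energy in energy_mj.items():
    print(f"{mode} mode: {energy:.3f} mJ per frame")
```

Under this reading, IR mode costs 2 mJ per frame and RT mode roughly 0.7 mJ per frame at the reported efficiency points.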
Source journal
IEEE Journal of Solid-State Circuits (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 11.00
Self-citation rate: 20.40%
Articles published: 351
Review time: 3-6 weeks
About the journal: The IEEE Journal of Solid-State Circuits publishes papers each month in the broad area of solid-state circuits with particular emphasis on transistor-level design of integrated circuits. It also provides coverage of topics such as circuits modeling, technology, systems design, layout, and testing that relate directly to IC design. Integrated circuits and VLSI are of principal interest; material related to discrete circuit design is seldom published. Experimental verification is strongly encouraged.