OAFuser: Toward Omni-Aperture Fusion for Light Field Semantic Segmentation

Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen, Kailun Yang
{"title":"面向全孔径融合的光场语义分割","authors":"Fei Teng;Jiaming Zhang;Kunyu Peng;Yaonan Wang;Rainer Stiefelhagen;Kailun Yang","doi":"10.1109/TAI.2024.3457931","DOIUrl":null,"url":null,"abstract":"Light field cameras are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation. However, two significant issues arise: 1) The extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent agents. 2) A relative displacement difference exists in the data collected by different microlenses. To address these issues, we propose an \n<italic>omni-aperture fusion model (OAFuser)</i>\n that leverages dense context from the central view and extracts the angular information from subaperture images to generate semantically consistent results. To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective \n<italic>subaperture fusion module (SAFM)</i>\n. This module efficiently embeds subaperture images in angular features, allowing the network to process each subaperture image with a minimal computational demand of only (\n<inline-formula><tex-math>${\\sim}1\\rm GFlops$</tex-math></inline-formula>\n). Furthermore, to address the mismatched spatial information across viewpoints, we present a \n<italic>center angular rectification module (CARM)</i>\n to realize feature resorting and prevent feature occlusion caused by misalignment. The proposed OAFuser achieves state-of-the-art performance on four UrbanLF datasets in terms of \n<italic>all evaluation metrics</i>\n and sets a new record of \n<inline-formula><tex-math>$84.93\\%$</tex-math></inline-formula>\n in mIoU on the UrbanLF-Real Extended dataset, with a gain of \n<inline-formula><tex-math>${+}3.69\\%$</tex-math></inline-formula>\n. The source code for OAFuser is available at \n<uri>https://github.com/FeiBryantkit/OAFuser</uri>\n.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 12","pages":"6225-6239"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"OAFuser: Toward Omni-Aperture Fusion for Light Field Semantic Segmentation\",\"authors\":\"Fei Teng;Jiaming Zhang;Kunyu Peng;Yaonan Wang;Rainer Stiefelhagen;Kailun Yang\",\"doi\":\"10.1109/TAI.2024.3457931\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Light field cameras are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation. However, two significant issues arise: 1) The extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent agents. 2) A relative displacement difference exists in the data collected by different microlenses. To address these issues, we propose an \\n<italic>omni-aperture fusion model (OAFuser)</i>\\n that leverages dense context from the central view and extracts the angular information from subaperture images to generate semantically consistent results. 
To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective \\n<italic>subaperture fusion module (SAFM)</i>\\n. This module efficiently embeds subaperture images in angular features, allowing the network to process each subaperture image with a minimal computational demand of only (\\n<inline-formula><tex-math>${\\\\sim}1\\\\rm GFlops$</tex-math></inline-formula>\\n). Furthermore, to address the mismatched spatial information across viewpoints, we present a \\n<italic>center angular rectification module (CARM)</i>\\n to realize feature resorting and prevent feature occlusion caused by misalignment. The proposed OAFuser achieves state-of-the-art performance on four UrbanLF datasets in terms of \\n<italic>all evaluation metrics</i>\\n and sets a new record of \\n<inline-formula><tex-math>$84.93\\\\%$</tex-math></inline-formula>\\n in mIoU on the UrbanLF-Real Extended dataset, with a gain of \\n<inline-formula><tex-math>${+}3.69\\\\%$</tex-math></inline-formula>\\n. The source code for OAFuser is available at \\n<uri>https://github.com/FeiBryantkit/OAFuser</uri>\\n.\",\"PeriodicalId\":73305,\"journal\":{\"name\":\"IEEE transactions on artificial intelligence\",\"volume\":\"5 12\",\"pages\":\"6225-6239\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on artificial intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10677512/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10677512/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Light field cameras are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation. However, two significant issues arise: 1) the extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent agents; and 2) a relative displacement difference exists in the data collected by different microlenses. To address these issues, we propose an omni-aperture fusion model (OAFuser) that leverages dense context from the central view and extracts the angular information from subaperture images to generate semantically consistent results. To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective subaperture fusion module (SAFM). This module efficiently embeds subaperture images in angular features, allowing the network to process each subaperture image with a minimal computational demand of only ${\sim}1$ GFLOPs. Furthermore, to address the mismatched spatial information across viewpoints, we present a center angular rectification module (CARM) to realize feature re-sorting and prevent feature occlusion caused by misalignment. The proposed OAFuser achieves state-of-the-art performance on four UrbanLF datasets in terms of all evaluation metrics and sets a new record of $84.93\%$ mIoU on the UrbanLF-Real Extended dataset, a gain of ${+}3.69\%$. The source code for OAFuser is available at https://github.com/FeiBryantkit/OAFuser.
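The abstract describes a two-branch data flow: a dense central-view branch, lightweight per-view angular features (SAFM), and a rectification step that aligns the angular features to the central view before fusion (CARM). The PyTorch sketch below illustrates that data flow only; every module name, layer choice, and tensor shape is an assumption made for illustration, not the paper's actual architecture, which is available in the linked repository.

import torch
import torch.nn as nn

class SubApertureFusion(nn.Module):
    # Hypothetical stand-in for the paper's SAFM: a shallow encoder shared
    # across views keeps the per-view cost small (in the spirit of the
    # ~1 GFLOPs per subaperture image reported in the abstract), and the
    # resulting angular features are fused by averaging across views.
    def __init__(self, in_ch=3, feat_ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, kernel_size=3, padding=1),
        )

    def forward(self, views):
        # views: (B, N, C, H, W) stack of N subaperture images.
        b, n, c, h, w = views.shape
        feats = self.encoder(views.reshape(b * n, c, h, w))
        return feats.reshape(b, n, -1, h, w).mean(dim=1)  # (B, feat_ch, H, W)

class CenterAngularRectification(nn.Module):
    # Hypothetical stand-in for the paper's CARM: a per-pixel gate decides
    # how much angular information to inject at each location of the
    # central-view feature, so misaligned (occluded) content can be
    # down-weighted instead of blindly mixed in.
    def __init__(self, feat_ch=32):
        super().__init__()
        self.gate = nn.Conv2d(2 * feat_ch, feat_ch, kernel_size=1)

    def forward(self, center_feat, angular_feat):
        # Both inputs: (B, feat_ch, H, W).
        g = torch.sigmoid(self.gate(torch.cat([center_feat, angular_feat], dim=1)))
        return center_feat + g * angular_feat

# Toy usage with made-up shapes: 9 subaperture views of 64x64 RGB images.
views = torch.randn(2, 9, 3, 64, 64)
angular = SubApertureFusion()(views)                  # (2, 32, 64, 64)
center = torch.randn(2, 32, 64, 64)                   # dense central-view feature
fused = CenterAngularRectification()(center, angular)
print(fused.shape)                                    # torch.Size([2, 32, 64, 64])

Gated residual fusion of this kind is one common way to keep the central view authoritative while still benefiting from angular cues; the actual CARM performs feature re-sorting across viewpoints, which the simple gate above only approximates.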