Continuous-time Object Segmentation using High Temporal Resolution Event Camera.

Lin Zhu, Xianzhang Chen, Lizhi Wang, Xiao Wang, Yonghong Tian, Hua Huang
{"title":"Continuous-time Object Segmentation using High Temporal Resolution Event Camera.","authors":"Lin Zhu, Xianzhang Chen, Lizhi Wang, Xiao Wang, Yonghong Tian, Hua Huang","doi":"10.1109/TPAMI.2024.3477591","DOIUrl":null,"url":null,"abstract":"<p><p>Event cameras are novel bio-inspired sensors, where individual pixels operate independently and asynchronously, generating intensity changes as events. Leveraging the microsecond resolution (no motion blur) and high dynamic range (compatible with extreme light conditions) of events, there is considerable promise in directly segmenting objects from sparse and asynchronous event streams in various applications. However, different from the rich cues in video object segmentation, it is challenging to segment complete objects from the sparse event stream. In this paper, we present the first framework for continuous-time object segmentation from event stream. Given the object mask at the initial time, our task aims to segment the complete object at any subsequent time in event streams. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. The historical events and masks (reference sets) are recurrently fed into our framework along with current-time events. The temporal embedding is updated as new events are input, enabling our framework to continuously process the event stream. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on our datasets demonstrate the effectiveness of the proposed recurrent architecture. Our code and dataset are available at https://sites.google.com/view/ecos-net/.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2024.3477591","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Event cameras are novel bio-inspired sensors whose pixels operate independently and asynchronously, reporting intensity changes as events. Leveraging the microsecond temporal resolution (no motion blur) and high dynamic range (robustness to extreme lighting) of events, directly segmenting objects from sparse, asynchronous event streams holds considerable promise for various applications. However, unlike video object segmentation, which can rely on rich appearance cues, segmenting complete objects from a sparse event stream is challenging. In this paper, we present the first framework for continuous-time object segmentation from event streams. Given the object mask at the initial time, our task is to segment the complete object at any subsequent time in the event stream. Specifically, our framework consists of a Recurrent Temporal Embedding Extraction (RTEE) module based on a novel ResLSTM, a Cross-time Spatiotemporal Feature Modeling (CSFM) module, which is a transformer architecture with long-term and short-term matching modules, and a segmentation head. The historical events and masks (reference sets) are recurrently fed into our framework along with current-time events. The temporal embedding is updated as new events arrive, enabling our framework to process the event stream continuously. To train and test our model, we construct both real-world and simulated event-based object segmentation datasets, each comprising event streams, APS images, and object annotations. Extensive experiments on these datasets demonstrate the effectiveness of the proposed recurrent architecture. Our code and dataset are available at https://sites.google.com/view/ecos-net/.
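
To make the recurrent pipeline concrete, below is a minimal PyTorch sketch of the processing loop the abstract describes. Every internal detail is an illustrative assumption rather than the paper's exact design: the ResLSTMCell gating and residual path are one plausible reading of "ResLSTM", a single multi-head attention layer stands in for CSFM's long-term and short-term matching, and the names (EventSegmenter, rtee, matcher) and channel widths are invented for this sketch. The authors' actual implementation is at the project URL above.

```python
# Minimal sketch of the recurrent event-segmentation loop (assumptions noted above).
import torch
import torch.nn as nn


class ResLSTMCell(nn.Module):
    """Convolutional LSTM cell with a residual path from the input to the
    hidden state -- one plausible reading of the paper's 'ResLSTM'."""

    def __init__(self, in_ch: int, hid_ch: int):
        super().__init__()
        # One convolution produces all four LSTM gates from [input, hidden].
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, hid_ch, 1)  # residual shortcut

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c) + self.skip(x)  # residual add
        return h, (h, c)


class EventSegmenter(nn.Module):
    """Recurrent temporal embedding -> cross-time matching -> mask head."""

    def __init__(self, event_ch: int = 2, hid_ch: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(event_ch, hid_ch, 3, padding=1)  # event-voxel encoder
        self.rtee = ResLSTMCell(hid_ch, hid_ch)                   # temporal embedding
        self.matcher = nn.MultiheadAttention(hid_ch, 4, batch_first=True)
        self.head = nn.Conv2d(hid_ch, 1, 1)                       # segmentation head

    def forward(self, event_voxel, ref_feat, ref_mask, state):
        # event_voxel: (B, event_ch, H, W) events binned up to the query time.
        feat, state = self.rtee(self.encoder(event_voxel), state)
        B, C, H, W = feat.shape
        q = feat.flatten(2).transpose(1, 2)                    # queries: (B, HW, C)
        kv = (ref_feat * ref_mask).flatten(2).transpose(1, 2)  # masked reference set
        matched, _ = self.matcher(q, kv, kv)
        fused = feat + matched.transpose(1, 2).reshape(B, C, H, W)
        return torch.sigmoid(self.head(fused)), feat, state


# Continuous-time processing: the recurrent state and the (feature, mask)
# reference set are carried forward as new event voxels stream in.
B, C, H, W = 1, 64, 64, 64
model = EventSegmenter()
state = (torch.zeros(B, C, H, W), torch.zeros(B, C, H, W))
ref_feat = torch.zeros(B, C, H, W)   # features at the annotated initial time
ref_mask = torch.ones(B, 1, H, W)    # the given initial object mask
with torch.no_grad():
    for t in range(5):               # stand-in for an incoming event stream
        events = torch.randn(B, 2, H, W)
        mask, feat, state = model(events, ref_feat, ref_mask, state)
        ref_feat, ref_mask = feat, (mask > 0.5).float()  # update reference set
```

In this sketch, a mask can be queried at any timestamp by binning the events received up to that point and stepping the recurrent state, which is one way the continuous-time property described in the abstract can be realized.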
