Online Video Object Detection Using Association LSTM

2017 IEEE International Conference on Computer Vision (ICCV); Pub Date: 2017-10-01; DOI: 10.1109/ICCV.2017.257

Yongyi Lu, Cewu Lu, Chi-Keung Tang

Citations: 101

Abstract

Video object detection is a fundamental tool for many applications. Since direct application of image-based object detection cannot leverage the rich temporal information inherent in video data, we advocate detecting long-range video object patterns. While the Long Short-Term Memory (LSTM) has been the de facto choice for such detection, the standard LSTM cannot fundamentally model object association between consecutive frames. In this paper, we propose the association LSTM to address this fundamental association problem. Association LSTM not only directly regresses object locations and classifies object categories, but also produces association features that represent each output object. By minimizing the matching error between these features, we learn how to associate objects in two consecutive frames. Additionally, our method works in an online manner, which is important for most video tasks. Our approach outperforms traditional video object detection methods on standard video datasets.
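The abstract does not give the form of the matching error, so the following is only an illustrative sketch of the general idea, not the paper's loss. It assumes each detected object carries an association feature vector, scores pairs of objects in consecutive frames by cosine similarity, and penalizes the gap between that similarity and a 0/1 ground-truth match label. All names (`matching_error`, `associate`, `min_similarity`) and the greedy matching step are hypothetical choices made for this example.

```python
from math import sqrt


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def matching_error(prev_feats, curr_feats, same_object):
    """Assumed loss: mean squared gap between similarity and the match label.

    same_object[i][j] is 1 if object i in the previous frame and object j in
    the current frame are the same instance, else 0.
    """
    total, count = 0.0, 0
    for i, f_prev in enumerate(prev_feats):
        for j, f_curr in enumerate(curr_feats):
            sim = cosine_similarity(f_prev, f_curr)
            total += (same_object[i][j] - sim) ** 2
            count += 1
    return total / count if count else 0.0


def associate(prev_feats, curr_feats, min_similarity=0.5):
    """Greedy one-to-one association, most similar pairs first.

    Returns a dict mapping current-frame index -> previous-frame index.
    Only the previous frame is needed, so this can run online.
    """
    pairs = sorted(
        ((cosine_similarity(p, c), i, j)
         for i, p in enumerate(prev_feats)
         for j, c in enumerate(curr_feats)),
        reverse=True,
    )
    used_prev, used_curr, matches = set(), set(), {}
    for sim, i, j in pairs:
        if sim < min_similarity:
            break  # remaining pairs are even less similar
        if i in used_prev or j in used_curr:
            continue
        matches[j] = i
        used_prev.add(i)
        used_curr.add(j)
    return matches


# Toy usage: two objects that swap order between frames.
prev_frame = [[1.0, 0.0], [0.0, 1.0]]
curr_frame = [[0.1, 0.9], [0.9, 0.1]]
matches = associate(prev_frame, curr_frame)
```

In this toy setup, features of the same object stay close across frames, so the error under the correct labels is lower than under swapped labels. A trained network would be pushed toward features with that property by minimizing such an error.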
