
Latest publications from the 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW)

Joint Event Detection and Description in Continuous Video Streams
Pub Date: 2018-02-28 | DOI: 10.1109/WACV.2019.00048
Huijuan Xu, Boyang Albert Li, Vasili Ramanishka, L. Sigal, Kate Saenko
Dense video captioning involves first localizing events in a video and then generating captions for the identified events. We present the Joint Event Detection and Description Network (JEDDi-Net) for solving this task in an end-to-end fashion, which encodes the input video stream with three-dimensional convolutional layers, proposes variable-length temporal events based on pooled features, and then uses a two-level hierarchical LSTM module with context modeling to transcribe the event proposals into captions. We show the effectiveness of our proposed JEDDi-Net on the large-scale ActivityNet Captions dataset.
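The proposal stage described above can be illustrated with a toy sketch (this is not the authors' code; the window lengths, mean pooling, and scalar frame scores are simplified stand-ins for the pooled 3D-convolutional features in JEDDi-Net):

```python
# Toy proposal stage: pool frame-level scores over candidate windows of
# several lengths and keep the top-scoring variable-length segments.

def propose_events(scores, window_lengths=(2, 3, 4), top_k=2):
    """Return the top_k (start, end, pooled_score) windows by mean score."""
    candidates = []
    for length in window_lengths:
        for start in range(len(scores) - length + 1):
            window = scores[start:start + length]
            pooled = sum(window) / length  # mean pooling over the window
            candidates.append((start, start + length, pooled))
    candidates.sort(key=lambda c: -c[2])
    return candidates[:top_k]

frame_scores = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7]
proposals = propose_events(frame_scores)
print(proposals[0])  # the strongest proposal covers frames 1-3
```

In the full model, each surviving proposal would then be handed to the captioning module; here the sketch stops at the ranked segments.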
Citations: 48
A Scalable System Architecture for Activity Detection with Simple Heuristics
DOI: 10.1109/WACVW.2019.00012
Rico Thomanek, Christian Roschke, Benny Platte, R. Manthey, Tony Rolletschke, Manuel Heinzig, M. Vodel, Frank Zimmer, Maximilian Eibl
The analysis of video footage, whether identifying persons at defined locations or detecting complex activities, is still a challenging process. Nowadays, various (semi-)automated systems can be used to overcome different parts of these challenges. Object detection and classification reach even higher detection rates when making use of the latest cutting-edge convolutional neural network frameworks. Integrated into a scalable infrastructure-as-a-service database system, we employ a combination of such networks, using the Detectron framework within Docker containers together with case-specifically engineered tracking and motion-pattern heuristics, in order to detect several activities with comparatively low, distributed computing effort and reasonable results.
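As a hedged illustration of the kind of "simple heuristic" that can sit on top of detector and tracker output (the rule, its name, and its thresholds are hypothetical, not taken from the paper), a loitering check over a tracked object's centroids might look like:

```python
# Hypothetical heuristic: a tracked object "loiters" if it stays within
# max_radius of its starting position for at least min_frames frames.

def detect_loitering(track, max_radius=5.0, min_frames=4):
    """track: list of (x, y) centroids, one per frame."""
    if len(track) < min_frames:
        return False
    x0, y0 = track[0]
    return all(((x - x0) ** 2 + (y - y0) ** 2) ** 0.5 <= max_radius
               for x, y in track[:min_frames])

stationary = [(0, 0), (1, 1), (2, 0), (1, 2)]
walking = [(0, 0), (4, 0), (8, 0), (12, 0)]
print(detect_loitering(stationary))  # True
print(detect_loitering(walking))     # False
```

Such rules are cheap to evaluate per track, which is what makes the distributed, database-backed design feasible at scale.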
Citations: 5
Fine-grained Action Detection in Untrimmed Surveillance Videos
DOI: 10.1109/WACVW.2019.00014
Sathyanarayanan N. Aakur, Daniel Sawyer, Sudeep Sarkar
Spatiotemporal localization of activities in untrimmed surveillance videos is a hard task, especially given the occurrence of simultaneous activities across different temporal and spatial scales. We tackle this problem using a cascaded region proposal and detection (CRPAD) framework implementing frame-level simultaneous action detection, followed by tracking. We propose the use of a frame-level spatial detection model based on advances in object detection, and a temporal linking algorithm that models the temporal dynamics of the detected activities. We show results on the VIRAT dataset through the recent Activities in Extended Video (ActEV) challenge that is part of the TrecVID competition [1, 2].
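The temporal linking idea — joining per-frame spatial detections into tubes across time — can be sketched as a greedy IoU-based linker (a simplified stand-in, not the CRPAD implementation; the threshold value is an assumption):

```python
# Greedy IoU-based temporal linking of per-frame boxes into tubes.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def link_detections(frames, iou_thresh=0.5):
    """frames: list (per frame) of (x1, y1, x2, y2) boxes -> list of tubes."""
    tubes = []
    for boxes in frames:
        unmatched = list(boxes)
        for tube in tubes:
            best = max(unmatched, key=lambda b: iou(tube[-1], b), default=None)
            if best is not None and iou(tube[-1], best) >= iou_thresh:
                tube.append(best)             # extend the tube with the best match
                unmatched.remove(best)
        tubes.extend([b] for b in unmatched)  # unmatched boxes start new tubes
    return tubes

frames = [[(0, 0, 10, 10)], [(1, 0, 11, 10)], [(50, 50, 60, 60)]]
tubes = link_detections(frames)
print(len(tubes))  # 2: one two-frame tube plus one new detection
```

A real system would also model appearance and motion dynamics when scoring links; overlap alone is the minimal version of that idea.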
Citations: 5
Minding the Gaps in a Video Action Analysis Pipeline
DOI: 10.1109/WACVW.2019.00015
Jia Chen, Jiang Liu, Junwei Liang, Ting-yao Hu, Wei Ke, Wayner Barrios, Dong Huang, Alexander Hauptmann
We present an event detection system, which shares many similarities with standard object detection pipelines. It is composed of four modules: feature extraction, event proposal generation, event classification, and event localization. We developed and assessed each module separately, evaluating several candidate options given oracle input using an intermediate evaluation metric. This process results in a mismatch gap between training and testing once the modules are integrated into the complete system pipeline: each module is trained on clean oracle input, but during testing it can only receive system-generated input, which can differ significantly from the oracle data. Furthermore, we discovered that all the gaps between the different modules can contribute to a decrease in accuracy, and together they represent the major bottleneck for a system developed in this way. Fortunately, we were able to develop a set of relatively simple fixes in our final system to address and mitigate some of the gaps.
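The oracle-versus-system mismatch gap can be made concrete with a toy example (a hypothetical stub, not the authors' pipeline): a classifier that behaves well on clean oracle proposals misfires when fed proposals from an imperfect upstream generator.

```python
# Toy illustration of the oracle/system mismatch: a stub classifier that was
# "trained" on clean proposals fails when the upstream proposal generator
# truncates an event (segments and thresholds here are hypothetical).

def classify(proposal):
    start, end = proposal
    return "event" if end - start >= 3 else "background"

oracle_proposals = [(0, 4), (10, 15)]   # clean ground-truth segments
system_proposals = [(0, 2), (11, 15)]   # generator truncated the first event

print([classify(p) for p in oracle_proposals])  # ['event', 'event']
print([classify(p) for p in system_proposals])  # ['background', 'event']
```

Evaluating each module only on oracle input would report the first line; the integrated system actually produces the second, which is exactly the gap the paper describes.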
Citations: 12
Deep Representation Learning for Metadata Verification
DOI: 10.1109/WACVW.2019.00019
Bor-Chun Chen, L. Davis
Verifying the authenticity of a given image is an emerging topic in media forensics research. Many current works focus on content manipulation detection, which aims to detect possible alteration of the image content. However, tampering might occur not only in the image content itself, but also in the metadata associated with the image, such as the timestamp, geo-tag, and captions. We address metadata verification, aiming to verify the authenticity of the metadata associated with an image, using a deep representation learning approach. We propose a deep neural network called Attentive Bilinear Convolutional Neural Networks (AB-CNN) that learns appropriate representations for metadata verification. AB-CNN addresses several common challenges in verifying a specific type of metadata – events (i.e., times and places) – including lack of training data, fine-grained differences between distinct events, and diverse visual content within the same event. Experimental results on three different datasets show that the proposed model provides a substantial improvement over the baseline method.
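The bilinear pooling at the core of such models, extended with spatial attention weights, can be sketched as follows (a minimal NumPy illustration of the general attentive-bilinear idea, not the AB-CNN implementation; shapes and the uniform attention are assumptions for the demo):

```python
import numpy as np

# Minimal attentive bilinear pooling: per-location outer products of two
# feature maps, combined with attention weights instead of uniform averaging.

def attentive_bilinear_pool(feat_a, feat_b, attn):
    """feat_a: (L, Da), feat_b: (L, Db), attn: (L,) weights summing to 1."""
    pooled = np.einsum('l,ld,le->de', attn, feat_a, feat_b)
    return pooled.ravel()  # flattened (Da * Db) bilinear descriptor

rng = np.random.default_rng(0)
feat_a = rng.standard_normal((4, 3))   # L=4 locations, Da=3 channels
feat_b = rng.standard_normal((4, 2))   # L=4 locations, Db=2 channels
attn = np.full(4, 0.25)                # uniform attention = plain bilinear pooling
descriptor = attentive_bilinear_pool(feat_a, feat_b, attn)
print(descriptor.shape)  # (6,)
```

With uniform weights this reduces to the classic bilinear-CNN descriptor; a learned attention vector lets the model emphasize locations that discriminate between fine-grained events.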
Citations: 7