Contrastive Learning for Unsupervised Video Highlight Detection

Taivanbat Badamdorj, Mrigank Rochan, Yang Wang, Li-na Cheng
DOI: 10.1109/CVPR52688.2022.01365
Published in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Citations: 9

Abstract

Video highlight detection can greatly simplify video browsing, potentially paving the way for a wide range of applications. Existing efforts are mostly fully supervised, requiring humans to manually identify and label the interesting moments (called highlights) in a video. Recent weakly supervised methods forgo the use of highlight annotations, but typically require extensive effort in collecting external data, such as web-crawled videos, for model learning. This observation has inspired us to consider unsupervised highlight detection, where neither frame-level nor video-level annotations are available during training. We propose a simple contrastive learning framework for unsupervised highlight detection. Our framework encodes a video into a vector representation by learning to pick the video clips that help distinguish it from other videos, via a contrastive objective using dropout noise. This inherently allows our framework to identify the video clips corresponding to the highlights of the video. Extensive empirical evaluations on three highlight detection benchmarks demonstrate the superior performance of our approach.
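The paper's full architecture is not reproduced on this page, but the recipe the abstract describes — pool clip features into a video vector via learned clip selection, then train with a contrastive objective in which dropout noise creates two "views" of the same video (positives) while other videos in the batch serve as negatives — can be sketched in a few lines. The following is a minimal NumPy illustration under those assumptions; the function names, shapes, and the softmax-attention pooling are illustrative, not the authors' implementation.

```python
import numpy as np

def dropout(x, p, rng):
    # Inverted dropout: zero each unit with probability p, rescale survivors
    mask = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

def encode_video(clip_feats, attn_logits, p, rng):
    # Softmax over attn_logits acts as a soft clip-selection mechanism;
    # dropout noise makes each forward pass a different "view" of the video.
    w = np.exp(attn_logits - attn_logits.max())
    w = w / w.sum()
    v = (w[:, None] * clip_feats).sum(axis=0)  # attention-weighted pooling
    v = dropout(v, p, rng)
    return v / np.linalg.norm(v)               # L2-normalize for cosine sims

def info_nce(z1, z2, tau=0.1):
    # Contrastive (InfoNCE) loss over a batch of video vectors:
    # matching rows of z1/z2 are positive pairs, all other rows are negatives.
    sims = (z1 @ z2.T) / tau                       # (B, B) similarity matrix
    sims = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Toy usage: two dropout-noised encodings of one video form a positive pair.
rng = np.random.default_rng(0)
clips = rng.normal(size=(5, 8))   # 5 clips, 8-dim features (illustrative)
logits = rng.normal(size=5)
z_a = encode_video(clips, logits, p=0.1, rng=rng)
z_b = encode_video(clips, logits, p=0.1, rng=rng)
print(z_a.shape)  # (8,)
```

Because clip-selection weights are trained so the pooled vector discriminates its video from others, clips receiving high attention weight can then be read off as candidate highlights, with no frame- or video-level labels involved.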