Desynchronization resilient video fingerprinting via randomized, low-rank tensor approximations

Mu Li, V. Monga
Published in: 2011 IEEE 13th International Workshop on Multimedia Signal Processing (2011-12-01)
DOI: 10.1109/MMSP.2011.6093778 (https://doi.org/10.1109/MMSP.2011.6093778)
Citations: 14

Abstract

The problem of summarizing videos by short fingerprints or hashes has garnered significant attention recently. While traditional applications of video hashing lie in database search and content authentication, the emergence of websites such as YouTube and DailyMotion poses the challenging problem of anti-piracy video search. That is, hashes or fingerprints of an original video (provided to YouTube by the content owner) must be matched against those of videos uploaded to YouTube by users to identify instances of “illegal” or undesirable uploads. Because the uploaded videos invariably differ from the original in their digital representation (owing to incidental or malicious distortions), robust video hashes are desired. In this paper, we model videos as order-3 tensors and use multilinear subspace projections, such as a reduced-rank parallel factor analysis (PARAFAC), to construct video hashes. We observe that, unlike most standard descriptors of video content, tensor-based subspace projections can offer excellent robustness while effectively capturing the spatio-temporal essence of the video for discriminability. We further randomize the construction of the hash by dividing the video into randomly selected overlapping sub-cubes to guard against intentional guessing and forgery. The most significant gains are seen for the difficult attacks of spatial (e.g. geometric) as well as temporal (random frame dropping) desynchronization. Experimental validation is provided in the form of ROC curves, and we further perform a detection-theoretic analysis that closely matches the empirically observed probability of error.
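
The pipeline described in the abstract (key-dependent overlapping sub-cubes, followed by a low-rank PARAFAC approximation of each sub-cube) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no sub-cube sizes, rank, or quantization step, so the function names (`random_subcubes`, `crop`, `rank1_parafac`, `fingerprint`), the rank-1 choice, the integer `key`, and the use of concatenated factor vectors as the fingerprint are all assumptions made for illustration. The video is assumed to be a nested list indexed as `video[row][column][frame]`.

```python
import random

def random_subcubes(shape, cube, count, key):
    """Draw `count` sub-cube origins from a generator seeded by a secret key.
    Origins are drawn independently, so the sub-cubes may overlap (assumed scheme)."""
    rng = random.Random(key)
    return [tuple(rng.randrange(0, s - c + 1) for s, c in zip(shape, cube))
            for _ in range(count)]

def crop(video, origin, cube):
    """Extract the sub-cube of size `cube` starting at `origin`."""
    (h0, w0, t0), (ch, cw, ct) = origin, cube
    return [[row[t0:t0 + ct] for row in plane[w0:w0 + cw]]
            for plane in video[h0:h0 + ch]]

def normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are returned unchanged)."""
    n = sum(x * x for x in v) ** 0.5
    return [x / n for x in v] if n > 0 else v

def rank1_parafac(x, iters=25):
    """Rank-1 PARAFAC by alternating updates:
    x[i][j][k] is approximated by lam * a[i] * b[j] * c[k]."""
    I, J, K = len(x), len(x[0]), len(x[0][0])
    a, b, c = (normalize([1.0] * n) for n in (I, J, K))
    lam = 0.0
    for _ in range(iters):
        a = normalize([sum(x[i][j][k] * b[j] * c[k]
                           for j in range(J) for k in range(K)) for i in range(I)])
        b = normalize([sum(x[i][j][k] * a[i] * c[k]
                           for i in range(I) for k in range(K)) for j in range(J)])
        c_raw = [sum(x[i][j][k] * a[i] * b[j]
                     for i in range(I) for j in range(J)) for k in range(K)]
        lam = sum(v * v for v in c_raw) ** 0.5  # weight of the rank-1 term
        c = normalize(c_raw)
    return lam, a, b, c

def fingerprint(video, cube, count, key):
    """Concatenate the unit-norm factor vectors of every sub-cube (hypothetical hash)."""
    shape = (len(video), len(video[0]), len(video[0][0]))
    out = []
    for origin in random_subcubes(shape, cube, count, key):
        _, a, b, c = rank1_parafac(crop(video, origin, cube))
        out.extend(a + b + c)
    return out

# Tiny synthetic "video": 6 rows x 4 columns x 5 frames, exactly rank 1.
video = [[[float((i + 1) * (j + 2) * (k + 1)) for k in range(5)]
          for j in range(4)] for i in range(6)]
fp = fingerprint(video, cube=(3, 2, 2), count=4, key=7)
print(len(fp))  # 4 sub-cubes, each contributing 3 + 2 + 2 values
```

Two fingerprints computed with the same key could then be compared with, for example, a Euclidean distance and a threshold. A practical system would likely use a higher rank, handle the sign ambiguity of the factors, and quantize the result; none of that is specified in the abstract.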