Research on video summarization method based on convolutional neural network

Ke-xin Zheng, Xiang Chen
Published in: Neural Networks, Information and Communication Engineering
Publication date: 2022-06-30
DOI: 10.1117/12.2639224 (https://doi.org/10.1117/12.2639224)

Abstract

Short videos on the Internet are growing exponentially, the number of videos uploaded every day is huge, and people also encounter large amounts of video data in daily life. People can retrieve and view all kinds of videos, but this abundance brings problems. On the one hand, the accumulation of so many videos makes it hard to find the desired video quickly, and repeated scenes within videos waste viewers' time and energy; on the other hand, the volume of video data puts enormous pressure on storage. To address inaccurate key-frame selection and the question of which video frame features to use in existing video summarization models, this paper proposes a multi-feature video summarization model (DME-VSNet) that extracts three features from each video frame: importance score, image memorability, and image entropy. To address inaccurate video shot segmentation, the model uses a shot segmentation algorithm based on the TransNet network, which divides the original video into several short shots at shot boundaries. The model then feeds the three features into the proposed MLP architecture to obtain a score for each video frame, and key frames are selected by score to generate the video summary. Comparative experiments verify the effectiveness of both the TransNet-based shot segmentation method and the overall convolutional-neural-network-based model. The experimental results show that video summaries generated from the three features together obtain better evaluation results.
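The pipeline described in the abstract (per-frame features, an MLP that fuses them into a frame score, and key-frame selection within shots) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: image entropy is computed as the Shannon entropy of the grayscale histogram, the MLP has a single hidden ReLU layer with a sigmoid output, the weights passed to it are hypothetical and untrained, shot boundaries are assumed to arrive as `(start, end)` frame-index pairs from a shot detector such as TransNet, and one top-scoring frame is kept per shot.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy in bits of a flat sequence of grayscale values.

    Assumption: the abstract does not define its entropy feature; this is
    the common histogram-based definition.
    """
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mlp_score(features, w1, b1, w2, b2):
    """Fuse a feature vector into one score in (0, 1).

    Hypothetical architecture: one hidden ReLU layer, then a sigmoid.
    w1 is a list of hidden-unit weight rows, b1 the hidden biases,
    w2 the output weights, b2 the output bias.
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))


def select_key_frames(scores, shot_boundaries):
    """Return the index of the highest-scoring frame in each [start, end) shot.

    Assumption: one key frame per shot; the paper's selection rule may differ.
    """
    key_frames = []
    for start, end in shot_boundaries:
        best = max(range(start, end), key=lambda i: scores[i])
        key_frames.append(best)
    return key_frames


# Illustrative usage with made-up numbers.
# Each tuple is (importance score, memorability, entropy scaled to [0, 1]).
frame_features = [
    (0.2, 0.4, 0.5),
    (0.9, 0.8, 0.7),
    (0.3, 0.3, 0.6),
    (0.6, 0.7, 0.9),
]
W1 = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]  # hypothetical, untrained weights
B1 = [0.0, 0.0]
W2 = [1.0, 1.0]
B2 = -1.0

frame_scores = [mlp_score(f, W1, B1, W2, B2) for f in frame_features]
summary_frames = select_key_frames(frame_scores, [(0, 2), (2, 4)])
```

In a real system the weights would be learned from annotated summaries and the importance and memorability features would come from trained networks; the sketch only shows how three per-frame features could flow through a small MLP into per-shot key-frame selection.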