Robust video summarization algorithm using supervised machine learning
Sunil S Harakannanavar, Shaik Roshan Sameer, Vikash Kumar, Sunil Kumar Behera, Adithya V Amberkar, Veena I. Puranikmath
Global Transitions Proceedings, vol. 3, no. 1, pp. 131-135. Published 2022-06-01. DOI: 10.1016/j.gltp.2022.04.009
https://www.sciencedirect.com/science/article/pii/S2666285X22000450
Citations: 2
Abstract
The proposed approach uses ResNet-18, a convolutional neural network with eighteen layers, for feature extraction and generates a video summary with the help of temporal interest proposals generated for the video sequences. Existing methods do not address the temporal consistency of the summary; the proposed work aims to create a temporally consistent summary. Classification and regression modules are implemented to obtain fixed-length inputs of the combined features. Non-maximum suppression is then applied to reduce redundancy and remove video segments with poor quality and low confidence scores. Video summaries are generated using the kernel temporal segmentation (KTS) algorithm, which converts a given video segment into video shots. The model is evaluated on two standard datasets, TVSum and SumMe, on which it obtains F-scores of 56.13 and 45.06, respectively.
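The non-maximum suppression step mentioned in the abstract can be illustrated with a minimal sketch over one-dimensional temporal segments. This is not the authors' implementation: the function names `temporal_iou` and `temporal_nms`, the IoU threshold of 0.5, and the `min_score` cutoff are assumptions chosen for illustration, since the abstract does not state them.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start, end), e.g. in frames; assumed representation


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1-D intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(
    segments: Sequence[Segment],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
    min_score: float = 0.0,      # assumed low-confidence cutoff
) -> List[int]:
    """Return indices of kept segments, highest score first."""
    # Visit proposals from most to least confident.
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Drop proposals whose confidence is below the cutoff.
        if scores[i] < min_score:
            continue
        # Keep a proposal only if it does not overlap too much with
        # any higher-scoring proposal that was already kept.
        if all(temporal_iou(segments[i], segments[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    proposals = [(0, 10), (1, 11), (20, 30), (22, 28)]
    confidences = [0.9, 0.8, 0.7, 0.2]
    print(temporal_nms(proposals, confidences))  # prints [0, 2]
```

In this toy input, the second proposal overlaps the first with an IoU of about 0.82 and the fourth overlaps the third with an IoU of 0.6, so both are suppressed as redundant.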