{"title":"HAVS: Human action-based video summarization, Taxonomy, Challenges, and Future Perspectives","authors":"Ambreen Sabha, A. Selwal","doi":"10.1109/ICSES52305.2021.9633804","DOIUrl":null,"url":null,"abstract":"In computer vision, video summarization is a critical research problem as it is related to a more condensed and engaging portrayal of the video's original content. Deep learning models have lately been employed for various approaches to human action recognition. In this paper, we examine the most up-to-date methodologies for summarizing human behaviors in videos, as well as numerous deep learning and hybrid algorithms. We provide an in-depth analysis of the many forms of human activities, including gesture-based, interaction-based, human action-based, and group activity-based activities. Our study goes over the most recent benchmark datasets for recognizing human motion in video sequences. It also discusses the strengths and limitations of the existing methods, open research issues, and future directions for human action-based video summarization (HAVS). This work clearly reveals that majority of HAVS approaches rely upon key-frames selection using Convolution neural network (CNN), which direct research community to explore sequence learning such as Long short-term neural network (LSTM). Furthermore, inadequate datasets for learning HAVS models are an additional challenge. An improvement in existing deep learning models for HAVS may be oriented towards the notion of transfer learning, which results in lower training overhead and higher accuracy.","PeriodicalId":6777,"journal":{"name":"2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES)","volume":"108 1","pages":"1-9"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSES52305.2021.9633804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6
Abstract
In computer vision, video summarization is a critical research problem, as it concerns producing a more condensed and engaging portrayal of a video's original content. Deep learning models have recently been employed in various approaches to human action recognition. In this paper, we examine the most up-to-date methodologies for summarizing human behaviors in videos, along with numerous deep learning and hybrid algorithms. We provide an in-depth analysis of the many forms of human activity, including gesture-based, interaction-based, action-based, and group activity-based behaviors. Our study reviews the most recent benchmark datasets for recognizing human motion in video sequences. It also discusses the strengths and limitations of existing methods, open research issues, and future directions for human action-based video summarization (HAVS). This work clearly reveals that the majority of HAVS approaches rely on key-frame selection using convolutional neural networks (CNNs), which directs the research community to explore sequence learning, such as long short-term memory (LSTM) networks. Furthermore, the inadequacy of datasets for learning HAVS models is an additional challenge. Improvements to existing deep learning models for HAVS may be oriented towards transfer learning, which results in lower training overhead and higher accuracy.
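To make the abstract's central observation concrete, the sketch below illustrates the CNN-plus-LSTM key-frame scoring pattern it describes: a pretrained CNN backbone (transfer learning, frozen to reduce training overhead) extracts per-frame features, and an LSTM performs sequence learning over those features to assign each frame an importance score. The model names, dimensions, and thresholding rule are illustrative assumptions, not the method of any specific HAVS paper.

```python
# Hypothetical sketch of CNN feature extraction + LSTM key-frame scoring (PyTorch).
import torch
import torch.nn as nn
from torchvision import models


class KeyFrameScorer(nn.Module):
    def __init__(self, hidden_size=256):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()            # reuse ImageNet features (transfer learning)
        for p in backbone.parameters():        # freeze the CNN to lower training overhead
            p.requires_grad = False
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden_size, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, frames):
        # frames: (batch, time, 3, 224, 224)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)   # per-frame CNN features
        temporal, _ = self.lstm(feats)                           # sequence learning over time
        return torch.sigmoid(self.head(temporal)).squeeze(-1)   # importance score per frame


# Usage: frames scoring above a (hypothetical) threshold are kept as key-frames.
scores = KeyFrameScorer()(torch.randn(1, 16, 3, 224, 224))
keyframe_indices = (scores > 0.5).nonzero()
```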