{"title":"HAVS: Human action-based video summarization, Taxonomy, Challenges, and Future Perspectives","authors":"Ambreen Sabha, A. Selwal","doi":"10.1109/ICSES52305.2021.9633804","DOIUrl":null,"url":null,"abstract":"In computer vision, video summarization is a critical research problem as it is related to a more condensed and engaging portrayal of the video's original content. Deep learning models have lately been employed for various approaches to human action recognition. In this paper, we examine the most up-to-date methodologies for summarizing human behaviors in videos, as well as numerous deep learning and hybrid algorithms. We provide an in-depth analysis of the many forms of human activities, including gesture-based, interaction-based, human action-based, and group activity-based activities. Our study goes over the most recent benchmark datasets for recognizing human motion in video sequences. It also discusses the strengths and limitations of the existing methods, open research issues, and future directions for human action-based video summarization (HAVS). This work clearly reveals that majority of HAVS approaches rely upon key-frames selection using Convolution neural network (CNN), which direct research community to explore sequence learning such as Long short-term neural network (LSTM). Furthermore, inadequate datasets for learning HAVS models are an additional challenge. An improvement in existing deep learning models for HAVS may be oriented towards the notion of transfer learning, which results in lower training overhead and higher accuracy.","PeriodicalId":6777,"journal":{"name":"2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES)","volume":"108 1","pages":"1-9"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSES52305.2021.9633804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6
Abstract
In computer vision, video summarization is a critical research problem, as it concerns producing a more condensed and engaging portrayal of a video's original content. Deep learning models have recently been employed in various approaches to human action recognition. In this paper, we examine the most up-to-date methodologies for summarizing human behaviors in videos, along with numerous deep learning and hybrid algorithms. We provide an in-depth analysis of the many forms of human activity, including gesture-based, interaction-based, action-based, and group activity-based behaviors. Our study reviews the most recent benchmark datasets for recognizing human motion in video sequences. It also discusses the strengths and limitations of existing methods, open research issues, and future directions for human action-based video summarization (HAVS). This work clearly reveals that the majority of HAVS approaches rely on key-frame selection using convolutional neural networks (CNNs), which directs the research community to explore sequence learning, such as long short-term memory (LSTM) networks. Furthermore, the inadequacy of datasets for learning HAVS models is an additional challenge. Improvements to existing deep learning models for HAVS may be oriented towards transfer learning, which results in lower training overhead and higher accuracy.
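To make the abstract's central observation concrete, the sketch below illustrates the CNN-plus-LSTM key-frame scoring pattern it describes: a pretrained CNN backbone (transfer learning, frozen to reduce training overhead) extracts per-frame features, and an LSTM performs sequence learning over those features to assign each frame an importance score. The model names, dimensions, and thresholding rule are illustrative assumptions, not the method of any specific HAVS paper.

```python
# Hypothetical sketch of CNN feature extraction + LSTM key-frame scoring (PyTorch).
import torch
import torch.nn as nn
from torchvision import models


class KeyFrameScorer(nn.Module):
    def __init__(self, hidden_size=256):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()            # reuse ImageNet features (transfer learning)
        for p in backbone.parameters():        # freeze the CNN to lower training overhead
            p.requires_grad = False
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden_size, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, frames):
        # frames: (batch, time, 3, 224, 224)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)   # per-frame CNN features
        temporal, _ = self.lstm(feats)                           # sequence learning over time
        return torch.sigmoid(self.head(temporal)).squeeze(-1)   # importance score per frame


# Usage: frames scoring above a (hypothetical) threshold are kept as key-frames.
scores = KeyFrameScorer()(torch.randn(1, 16, 3, 224, 224))
keyframe_indices = (scores > 0.5).nonzero()
```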