Video quality analysis for an automated video capturing and editing system for conversation scenes

Takashi Nishizaki, R. Ogata, Yuichi Kameda, Yoshinari Ohta, Yuichi Nakamura

2005 IEEE International Conference on Multimedia and Expo (ICME 2005), July 6, 2005. DOI: 10.1109/ICME.2005.1521514
This paper presents a video quality analysis for automated video capture and editing. Previously, we proposed an automated video capture and editing system for conversation scenes. In the capture phase, the system not only produces concurrent video streams from multiple pan-tilt-zoom cameras but also recognizes "conversation states," i.e., who is speaking, when someone is nodding, and so on. Because the automated editing phase relies on these conversation states, it is important to clarify how their recognition rate affects the quality of the resulting videos. In the present study, we analyzed the relationship between the recognition rate of conversation states and the quality of the resulting videos through subjective evaluation experiments. The quality scores of the resulting videos were almost the same as in the best case, in which the states were annotated manually, so the recognition rate of our capture system proved sufficient.
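The abstract only sketches how the recognized conversation states drive the editing phase. To make the dependency being evaluated concrete, here is a minimal Python sketch of a rule-based shot selector over a timeline of recognized states; the ConversationState fields, the camera-naming scheme, and the selection rules are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: choose which pan-tilt-zoom camera's stream to cut to
# at each instant, given the recognized "conversation state" (who is
# speaking, who is nodding). The selection rules below (show the current
# speaker, fall back to a listener's reaction or a group shot) are assumed
# for illustration; the paper does not specify its editing rules.
from dataclasses import dataclass


@dataclass
class ConversationState:
    time: float           # seconds from the start of capture
    speaker: str | None   # participant judged to be speaking, None if silence
    nodding: set[str]     # participants judged to be nodding


def select_camera(state: ConversationState) -> str:
    """Pick a camera label for one instant of the conversation."""
    if state.speaker is not None:
        return f"cam_{state.speaker}"        # close-up on the speaker
    if state.nodding:
        # nobody is speaking: show a listener's reaction instead
        return f"cam_{sorted(state.nodding)[0]}"
    return "cam_group"                        # wide shot as the default


# Example timeline of recognized states.
timeline = [
    ConversationState(0.0, "A", set()),
    ConversationState(2.5, None, {"B"}),
    ConversationState(4.0, "B", set()),
]
for s in timeline:
    print(f"{s.time:4.1f}s -> {select_camera(s)}")
```

Under any rule of this kind, a misrecognized state (for example, a missed speaker change) directly alters which shot the editor selects, which is why the paper measures how the recognition rate translates into subjective quality scores for the edited videos.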