Video quality analysis for an automated video capturing and editing system for conversation scenes

Takashi Nishizaki, R. Ogata, Yuichi Kameda, Yoshinari Ohta, Yuichi Nakamura

2005 IEEE International Conference on Multimedia and Expo (ICME 2005), July 6, 2005. DOI: 10.1109/ICME.2005.1521514
This paper presents a video quality analysis for automated video capture and editing. Previously, we proposed an automated video capture and editing system for conversation scenes. In the capture phase, the system not only produces concurrent video streams from multiple pan-tilt-zoom cameras but also recognizes "conversation states," i.e., who is speaking, when someone is nodding, and so on. Because the automated editing phase relies on these conversation states, it is important to clarify how their recognition rate affects the quality of the resulting videos. In the present study, we analyzed the relationship between the recognition rate of conversation states and the quality of the resulting videos through subjective evaluation experiments. The quality scores of the resulting videos were almost the same as in the best case, in which the states were annotated manually, so the recognition rate of our capture system proved sufficient.
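The abstract only sketches how the recognized conversation states drive the editing phase. To make the dependency being evaluated concrete, here is a minimal Python sketch of a rule-based shot selector over a timeline of recognized states; the ConversationState fields, the camera-naming scheme, and the selection rules are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: choose which pan-tilt-zoom camera's stream to cut to
# at each instant, given the recognized "conversation state" (who is
# speaking, who is nodding). The selection rules below (show the current
# speaker, fall back to a listener's reaction or a group shot) are assumed
# for illustration; the paper does not specify its editing rules.
from dataclasses import dataclass


@dataclass
class ConversationState:
    time: float           # seconds from the start of capture
    speaker: str | None   # participant judged to be speaking, None if silence
    nodding: set[str]     # participants judged to be nodding


def select_camera(state: ConversationState) -> str:
    """Pick a camera label for one instant of the conversation."""
    if state.speaker is not None:
        return f"cam_{state.speaker}"        # close-up on the speaker
    if state.nodding:
        # nobody is speaking: show a listener's reaction instead
        return f"cam_{sorted(state.nodding)[0]}"
    return "cam_group"                        # wide shot as the default


# Example timeline of recognized states.
timeline = [
    ConversationState(0.0, "A", set()),
    ConversationState(2.5, None, {"B"}),
    ConversationState(4.0, "B", set()),
]
for s in timeline:
    print(f"{s.time:4.1f}s -> {select_camera(s)}")
```

Under any rule of this kind, a misrecognized state (for example, a missed speaker change) directly alters which shot the editor selects, which is why the paper measures how the recognition rate translates into subjective quality scores for the edited videos.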