Minding the Gaps in a Video Action Analysis Pipeline
Authors: Jia Chen, Jiang Liu, Junwei Liang, Ting-yao Hu, Wei Ke, Wayner Barrios, Dong Huang, Alexander Hauptmann
Venue: 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW)
DOI: 10.1109/WACVW.2019.00015
We present an event detection system that shares many similarities with standard object detection pipelines. It is composed of four modules: feature extraction, event proposal generation, event classification, and event localization. We developed and assessed each module separately, evaluating several candidate options on oracle input with intermediate evaluation metrics. This process creates a mismatch between training and testing once the modules are integrated into the complete pipeline: each module is trained on clean oracle input, but at test time it receives only system-generated input, which can differ significantly from the oracle data. Furthermore, we found that the gaps between the modules can each contribute to a decrease in accuracy, and together they represent the major bottleneck for a system developed in this way. Fortunately, we were able to develop a set of relatively simple fixes in our final system that address and mitigate some of the gaps.
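The oracle-versus-system mismatch described in the abstract can be illustrated with a toy sketch. Everything below is a hypothetical illustration, not the authors' system: the per-frame feature, the threshold classifier, the proposal intervals, and all function names are assumptions chosen only to show how a module fitted on clean oracle proposals can lose accuracy when it is fed imperfect upstream proposals.

```python
from statistics import mean

def make_video(n_frames, events):
    """Toy per-frame feature: 1.0 inside an event, 0.0 for background."""
    frames = [0.0] * n_frames
    for start, end in events:
        for t in range(start, end):
            frames[t] = 1.0
    return frames

def score(frames, proposal):
    """Average feature value over a proposed [start, end) interval."""
    start, end = proposal
    return mean(frames[start:end])

def fit_threshold(frames, proposals, labels):
    """'Train' the classifier: midpoint between mean positive and negative scores."""
    pos = [score(frames, p) for p, y in zip(proposals, labels) if y]
    neg = [score(frames, p) for p, y in zip(proposals, labels) if not y]
    return (mean(pos) + mean(neg)) / 2

def accuracy(frames, proposals, labels, threshold):
    """Fraction of proposals whose thresholded score matches the label."""
    hits = sum((score(frames, p) >= threshold) == y
               for p, y in zip(proposals, labels))
    return hits / len(proposals)

frames = make_video(60, events=[(10, 20), (40, 50)])
labels = [True, True, False, False]

# Oracle input: proposals aligned exactly with ground-truth boundaries.
oracle_proposals = [(10, 20), (40, 50), (0, 10), (25, 35)]
# System-generated input (assumed): the same proposals with boundary errors.
system_proposals = [(16, 26), (34, 44), (2, 12), (25, 35)]

# The classifier is fitted on oracle input only, as in module-wise development.
threshold = fit_threshold(frames, oracle_proposals, labels)

oracle_acc = accuracy(frames, oracle_proposals, labels, threshold)
system_acc = accuracy(frames, system_proposals, labels, threshold)
gap = oracle_acc - system_acc

print(f"oracle input accuracy: {oracle_acc:.2f}")
print(f"system input accuracy: {system_acc:.2f}")
print(f"mismatch gap:          {gap:.2f}")
```

In this sketch the intermediate metric looks perfect on oracle proposals, while the shifted proposals dilute the positive segments with background frames and push their scores below the fitted threshold. Evaluating each module on the actual output of the previous one, rather than on oracle input alone, is one way to surface such a gap before integration.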