Masaki Takahashi, M. Naemura, Mahito Fujii, J. Little
Title: Recognition of Action in Broadcast Basketball Videos on the Basis of Global and Local Pairwise Representation
Venue: 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)
Volume: 37 1, pages 147-154
Published: 2013-12-09
DOI: 10.1109/ISM.2013.32 (https://doi.org/10.1109/ISM.2013.32)
Citations: 5
Abstract
A new feature-representation method for recognizing actions in broadcast videos is proposed; it focuses on the relationship between human actions and camera motions. In this method, keypoint trajectories are extracted as motion features in spatio-temporal sub-regions called "spatio-temporal multiscale bags" (STMBs). Global representations and local representations from one sub-region in the STMBs are then combined to create a "glocal pairwise representation" (GPR), which captures the co-occurrence of camera motions and human actions. Finally, two-stage SVM classifiers are trained with STMB-based GPRs, and specified human actions in video sequences are identified. Experiments confirmed that the proposed method can robustly detect specific human actions in broadcast basketball videos.
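The idea of pairing a global and a local motion descriptor can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the descriptor, so a direction histogram of trajectory displacements is assumed here, and the names `Sample`, `Region`, `motion_histogram`, `in_region`, and `pairwise_representation` are hypothetical. Only a single sub-region at a single scale is shown, and the two-stage SVM training is omitted.

```python
import math
from typing import List, Tuple

# One trajectory sample: position (x, y), frame index t, displacement (dx, dy).
Sample = Tuple[float, float, int, float, float]
# One spatio-temporal sub-region: x0, x1, y0, y1, t0, t1 (half-open ranges).
Region = Tuple[float, float, float, float, int, int]


def motion_histogram(samples: List[Sample], bins: int = 8) -> List[float]:
    """L1-normalised histogram of displacement directions (assumed descriptor)."""
    hist = [0.0] * bins
    for _, _, _, dx, dy in samples:
        if dx == 0 and dy == 0:
            continue  # stationary samples carry no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        hist[int(angle / (2 * math.pi) * bins) % bins] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def in_region(sample: Sample, region: Region) -> bool:
    """True if the sample's position and frame index fall inside the region."""
    x, y, t, _, _ = sample
    x0, x1, y0, y1, t0, t1 = region
    return x0 <= x < x1 and y0 <= y < y1 and t0 <= t < t1


def pairwise_representation(
    samples: List[Sample], region: Region, bins: int = 8
) -> List[float]:
    """Concatenate a global histogram (all samples, dominated by camera motion)
    with a local histogram (samples inside one sub-region)."""
    global_hist = motion_histogram(samples, bins)
    local_hist = motion_histogram(
        [s for s in samples if in_region(s, region)], bins
    )
    return global_hist + local_hist


# Toy data: three background samples drifting right (a camera pan) and one
# sample inside the sub-region moving mostly along +y (a hypothetical player).
samples = [
    (10.0, 10.0, 0, 1.0, 0.2),
    (20.0, 10.0, 0, 1.0, 0.2),
    (30.0, 10.0, 1, 1.0, 0.2),
    (100.0, 100.0, 0, -0.2, 1.0),
]
region = (90.0, 110.0, 90.0, 110.0, 0, 2)
feature = pairwise_representation(samples, region)
```

In this toy case the first half of `feature` reflects the dominant pan direction, while the second half reflects only the motion inside the sub-region, so a classifier trained on the concatenation can see both at once. A fuller pipeline would repeat this over sub-regions at several scales and feed the resulting vectors to the classifiers.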