Role of Spatio-Temporal Feature Position in Recognition of Human Vehicle Interaction
Qurat ul ain Ali, M. Yousaf
TENCON 2018 - 2018 IEEE Region 10 Conference, 2018-10-01
DOI: 10.1109/TENCON.2018.8650232 (https://doi.org/10.1109/TENCON.2018.8650232)
Citation count: 1
Abstract
This paper presents a method for incorporating structural information alongside local features to improve the recognition accuracy of human-vehicle interaction activities. The proposed system extends Bag of Words (BOW) to capture structural information, namely the spatial and temporal relationships between features extracted from video data, in order to achieve better recognition accuracy for complex interaction scenes. The traditional BOW approach represents structural information, feature positions and their temporal relationships poorly, which makes it difficult for a classifier to recognise interactions and complex scenes. In the proposed scheme, the classifier uses BOW together with the spatial and temporal positions of features. Random Forest and kNN are used as classifiers to compare classification results and to find a trade-off between recognition accuracy and computational complexity. We validate the scheme on the state-of-the-art VIRAT (Video and Image Retrieval and Analysis Tool) dataset. Random Forest with modified BOW (RF+mBOW) gives better recognition accuracy at the cost of higher computation time, whereas kNN with modified BOW (kNN+mBOW) takes less computation time while still giving strong recognition results. We observed that RF+mBOW outperforms the state-of-the-art methods.
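The abstract does not spell out how the modified BOW vector is built, so the following is only a minimal sketch of one plausible reading: each visual word's normalised count is extended with the mean spatial (x, y) and temporal (t) position of the features assigned to it. The function name `mbow_descriptor`, the (word_id, x, y, t) input layout, and the use of per-word mean positions are all assumptions made for illustration, not the authors' published formulation.

```python
def mbow_descriptor(features, vocab_size):
    """Build a position-augmented Bag-of-Words vector for one video clip.

    Hypothetical helper, not taken from the paper.
    features: iterable of (word_id, x, y, t) tuples, where x, y, t are
              assumed to be normalised to [0, 1] beforehand.
    Returns a flat list of length 4 * vocab_size holding, per visual word,
    [normalised count, mean x, mean y, mean t].
    """
    counts = [0] * vocab_size
    sums = [[0.0, 0.0, 0.0] for _ in range(vocab_size)]

    # Accumulate occurrence counts and position sums per visual word.
    for word_id, x, y, t in features:
        counts[word_id] += 1
        sums[word_id][0] += x
        sums[word_id][1] += y
        sums[word_id][2] += t

    total = sum(counts)
    descriptor = []
    for k in range(vocab_size):
        freq = counts[k] / total if total else 0.0
        if counts[k]:
            means = [s / counts[k] for s in sums[k]]
        else:
            # Words that never occur get a neutral zero position.
            means = [0.0, 0.0, 0.0]
        descriptor.extend([freq] + means)
    return descriptor


# Two toy clips containing the same visual words in opposite temporal order.
clip_a = [(0, 0.2, 0.5, 0.1), (1, 0.8, 0.5, 0.9)]
clip_b = [(0, 0.2, 0.5, 0.9), (1, 0.8, 0.5, 0.1)]

desc_a = mbow_descriptor(clip_a, vocab_size=2)
desc_b = mbow_descriptor(clip_b, vocab_size=2)

# Every fourth entry is the plain BOW histogram, which is identical here;
# only the appended position entries tell the two clips apart.
plain_a = desc_a[0::4]
plain_b = desc_b[0::4]
```

In this toy case the plain histograms match while the full descriptors differ, which illustrates why a classifier such as Random Forest or kNN might separate interaction classes better when given positional information in addition to word counts.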