Understanding head and hand activities and coordination in naturalistic driving videos

Sujitha Martin, Eshed Ohn-Bar, Ashish Tawari, M. Trivedi
2014 IEEE Intelligent Vehicles Symposium Proceedings, June 2014. DOI: 10.1109/IVS.2014.6856610
In this work, we propose a vision-based analysis framework for recognizing in-vehicle activities such as interactions with the steering wheel, the instrument cluster, and the gear shift. The framework leverages two views for activity analysis: a camera looking at the driver's hands and another looking at the driver's head. The proposed techniques allow researchers to extract 'mid-level' information from video, that is, information that captures some semantic understanding of the scene but may still require an expert to distinguish difficult cases or to leverage the cues for drive analysis. In contrast, 'low-level' video is large in quantity and cannot be used unless processed entirely by an expert. This work aims to minimize such manual labor, so that researchers can better benefit from the accessibility of naturalistic driving data and perform larger-scale studies.
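To make the notion of 'mid-level' information concrete, the sketch below shows one hypothetical way cues from the two views (hand-region labels from the hand-facing camera, head pose from the head-facing camera) might be fused into an activity label. This is not the authors' method: the region names, thresholds, and rule set are illustrative assumptions, and a real system would learn such a mapping rather than hand-code it.

```python
# Hypothetical sketch: fusing two-view "mid-level" cues into an
# in-vehicle activity label. Region names, thresholds, and rules are
# illustrative assumptions, not the method described in the paper.
from dataclasses import dataclass

# Regions a hand detector might report from the hand-facing camera.
HAND_REGIONS = {"wheel", "instrument_cluster", "gear_shift", "lap", "unknown"}

@dataclass
class FrameCues:
    """Per-frame mid-level cues extracted from the two camera views."""
    left_hand_region: str   # region label from the hand view
    right_hand_region: str  # region label from the hand view
    head_yaw_deg: float     # head pose from the head view (+right, -left)

def classify_activity(cues: FrameCues) -> str:
    """Rule-based fusion of hand-region and head-pose cues.

    Illustrates how the two views complement each other: hand location
    indicates what is being touched, head pose indicates attention.
    """
    hands = {cues.left_hand_region, cues.right_hand_region}
    # A hand on the gear shift suggests a shifting interaction.
    if "gear_shift" in hands:
        return "gear_shift_interaction"
    # A hand near the instrument cluster suggests adjusting controls.
    if "instrument_cluster" in hands:
        return "instrument_cluster_interaction"
    # Both hands on the wheel with the head roughly forward: normal driving.
    if hands == {"wheel"} and abs(cues.head_yaw_deg) < 15.0:
        return "two_hands_on_wheel"
    if "wheel" in hands:
        return "one_hand_on_wheel"
    return "other"

if __name__ == "__main__":
    frame = FrameCues(left_hand_region="wheel",
                      right_hand_region="gear_shift",
                      head_yaw_deg=20.0)
    print(classify_activity(frame))  # -> gear_shift_interaction
```

Output of this sort is 'mid-level' in the paper's sense: it is semantically meaningful and far more compact than raw video, yet an expert may still need to review ambiguous frames before using it in a larger study.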