Weighted Hough voting for multi-view car detection
T. Xiang, Zuomei Lai, Wensheng Qiao, Tao Li
2017 20th International Conference on Information Fusion (Fusion), 2017-07-10
DOI: https://doi.org/10.23919/ICIF.2017.8009658
Hough voting methods for object detection let local image patches vote for the object center according to trained visual words. They are effective for objects with small local variations, but they cannot handle multi-view detection. The traditional remedy is to train separate visual words for each subcategory of similar views; however, limited training data prevents this from being effective. In this paper, we propose an extension to Hough voting that shares visual words among multiple subcategories and accumulates votes with discriminative combination weights for the different subcategories. The shared visual words are learned from dense image patches. With these visual words, we collect descriptors of the samples in all subcategories and in the negative set to train the discriminative combination weights. The final score of a hypothesis is the maximum score over all discretized views. By fusing the geometric structure, image appearance, and view information of the object, the method addresses the multi-view object detection problem effectively. We focus mainly on multi-view car detection, although the approach is not limited to cars. The proposed method is evaluated on two well-known datasets: the MIT StreetScene Cars dataset and the PASCAL VOC2007 car dataset. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.
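The scoring rule described in the abstract (shared visual words, per-view combination weights, maximum over discretized views) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the array layout, and the simplification that each visual word stores a single offset to the object center are all assumptions made for illustration, and the weights are taken as given rather than trained.

```python
import numpy as np


def weighted_hough_maps(patch_xy, patch_word, word_offsets, view_weights, map_shape):
    """Accumulate weighted center votes, one Hough map per discretized view.

    patch_xy:     (P, 2) integer (x, y) patch locations
    patch_word:   (P,)   index of the shared visual word matched by each patch
    word_offsets: (K, 2) integer offset from a patch to the object center
                  (assumption: one offset per word, for simplicity)
    view_weights: (V, K) combination weight of each word for each view
                  (assumption: already trained elsewhere)
    map_shape:    (H, W) size of the voting space
    """
    height, width = map_shape
    num_views = view_weights.shape[0]

    # Every patch votes for the center implied by its visual word.
    centers = patch_xy + word_offsets[patch_word]
    inside = (
        (centers[:, 0] >= 0) & (centers[:, 0] < width)
        & (centers[:, 1] >= 0) & (centers[:, 1] < height)
    )
    centers, words = centers[inside], patch_word[inside]

    maps = np.zeros((num_views, height, width))
    for v in range(num_views):
        # np.add.at accumulates correctly when several votes hit the same cell.
        np.add.at(maps[v], (centers[:, 1], centers[:, 0]), view_weights[v, words])
    return maps


def best_hypothesis(maps):
    """Return ((x, y), view, score), scoring each location by its best view."""
    score = maps.max(axis=0)      # final score: maximum over all views
    view = maps.argmax(axis=0)    # view that achieved that maximum
    y, x = np.unravel_index(score.argmax(), score.shape)
    return (int(x), int(y)), int(view[y, x]), float(score[y, x])
```

Because the visual words are shared, the same patch-to-word assignments feed every view's map; only the weight row changes per view, which is what lets views with few training samples reuse the common codebook.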