Rafael Martin Nieto, Jesus Molina Merchan, Álvaro García-Martín, J. Sanchez
2017 International Carnahan Conference on Security Technology (ICCST), October 2017
DOI: 10.1109/CCST.2017.8167818 (https://doi.org/10.1109/CCST.2017.8167818)
Generation and evaluation of synthetic models for training people detectors
Demand for video surveillance, and for people detection in particular, is high and has driven a large increase in research and resources in this field. Because training images and annotations are not always available, the cost of creating detector models must be considered. For example, a detector for elderly people must take into account different postures, such as standing, sitting, or sitting in a wheelchair. The main objective of this work is therefore to reduce the resources needed to generate a detection model, saving the cost of recording new sequences and generating the associated annotations for detector training. To this end, three synthetic image datasets were created and used to train three different models. The models were evaluated to determine which performs best, and the feasibility of the approach was analyzed by comparing the best model with a people detector for wheelchair users trained on real images. The technique could also be applied to other people detection scenarios, such as people riding horses or motorbikes, or people with supermarket carts. The synthetic datasets were generated by combining images of standing people with wheelchair images, by combining image patches, and by segmenting sections of people (trunk, legs, etc.) and adding them to the wheelchair image. As expected, the results show a reduction in efficiency (between 21% and 25%) in exchange for a very large saving in human annotation effort and in the resources needed to record real images.
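The abstract does not describe the compositing procedure in detail, so the following is only a minimal sketch of how a segmented person section might be added to a wheelchair image while producing the annotation automatically. It assumes a binary mask is already available for the person segment and that the patch fits entirely inside the wheelchair image. The function name `composite` and all of its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def composite(background, patch, mask, top, left):
    """Paste `patch` onto a copy of `background` wherever `mask` is True.

    Hypothetical helper: `background` stands in for a wheelchair image,
    `patch` for a segmented person section (e.g. trunk), and `mask` for the
    boolean segmentation of that section. The patch is assumed to fit
    entirely inside the background at offset (top, left).

    Returns the composited image and the (top, left, bottom, right) box of
    the pasted pixels, usable as an automatically generated annotation.
    """
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]  # view into `out`, not a copy
    region[mask] = patch[mask]                # overwrite only masked pixels

    rows, cols = np.nonzero(mask)
    box = (
        int(top + rows.min()),
        int(left + cols.min()),
        int(top + rows.max() + 1),   # bottom and right are exclusive
        int(left + cols.max() + 1),
    )
    return out, box


# Toy data standing in for real images.
wheelchair = np.zeros((6, 6), dtype=np.uint8)
person_part = np.full((3, 2), 255, dtype=np.uint8)
person_mask = np.array([[False, True],
                        [True, True],
                        [True, False]])
image, annotation = composite(wheelchair, person_part, person_mask, top=2, left=3)
```

Because the box is derived from the mask, no manual annotation step is needed for the synthetic sample, which is the kind of saving the abstract refers to. A realistic pipeline would also need scaling, alignment, and blending at the seams, none of which is sketched here.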