Generation and evaluation of synthetic models for training people detectors

2017 International Carnahan Conference on Security Technology (ICCST). Pub Date: 2017-10-01. DOI: 10.1109/CCST.2017.8167818

Rafael Martin Nieto, Jesus Molina Merchan, Álvaro García-Martín, J. Sanchez

Citations: 0

Abstract

Demand for video surveillance, and for people detection in particular, is high, and it has driven a large increase in research and resources in this field. Because training images and annotations are not always available, the cost of creating detector models must be considered. For example, a detector for elderly people must take into account different postures, such as standing, sitting, or sitting in a wheelchair. The main objective of this work is therefore to reduce the resources needed to generate a detection model, saving the cost of recording new sequences and producing the associated annotations for detector training. To this end, three synthetic image datasets were created and used to train three different models. The models were evaluated to determine which performs best, and the feasibility of the approach was then analyzed by comparing that model with a people detector for wheelchair users trained on real images. The technique could also be applied to other people detection scenarios, such as people riding horses or motorbikes, or people pushing supermarket carts. The synthetic datasets were generated by combining images of standing people with wheelchair images, by combining image patches, and by segmenting parts of people (trunk, legs, etc.) and adding them to the wheelchair image. As expected, the results show a reduction in efficiency (between 21 and 25%) in exchange for an enormous saving in human annotation and in the resources needed to record real images.
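
The idea of adding a segmented body part to a wheelchair image can be illustrated with a minimal sketch. It assumes that a binary mask for the segmented part is already available and simply copies the masked pixels onto the wheelchair image using NumPy. The function name `composite_patch`, the toy arrays, and the placement offsets are hypothetical and are not taken from the paper, whose actual generation pipeline is not described in this abstract.

```python
import numpy as np


def composite_patch(background, patch, mask, top, left):
    """Return a copy of `background` with `patch` pasted where `mask` is True.

    background: H x W x 3 image (e.g. a wheelchair image).
    patch:      h x w x 3 image (e.g. a segmented trunk or legs).
    mask:       h x w boolean array marking the pixels of the body part.
    top, left:  hypothetical placement offsets inside the background.
    """
    h, w = mask.shape
    bg_h, bg_w = background.shape[:2]
    if top < 0 or left < 0 or top + h > bg_h or left + w > bg_w:
        raise ValueError("patch does not fit inside the background")

    out = background.copy()
    # `region` is a view into `out`, so the assignment edits `out` in place.
    region = out[top:top + h, left:left + w]
    region[mask] = patch[mask]
    return out


# Toy data standing in for real images (assumption: 8-bit RGB arrays).
wheelchair = np.zeros((6, 6, 3), dtype=np.uint8)
torso = np.full((2, 3, 3), 255, dtype=np.uint8)
torso_mask = np.array([[True, True, False],
                       [False, True, True]])

synthetic = composite_patch(wheelchair, torso, torso_mask, top=1, left=2)
```

Because only the masked pixels are copied, the wheelchair background stays visible around the body part. A practical pipeline would also need to scale and align the parts and write the resulting bounding box as the annotation, which this sketch leaves out.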