Training Convolutional Neural Networks to Detect Waste in Train Carriages
Authors: Nathan Western, X. Kong, Mustafa Erden
Venue: 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)
Published: 2021-09-13
DOI: https://doi.org/10.1109/ICSIPA52582.2021.9576771
This research systematically investigates the effect of image view on Convolutional Neural Networks (CNNs) trained to detect waste in train carriages. It also identifies a neural network architecture and training conditions suitable for an automated train cleaning robot. Specifically, we investigate the relationship between the size of the CNN training dataset, whether the training images are taken from a view sympathetic to the CNN application, and the effectiveness of the trained networks. Three datasets were constructed specifically for this research: a large dataset of 58,300 studio images of waste in a variety of conditions, a smaller dataset of 4,515 images of actual waste items on trains, and a dataset of 7,290 images of actual waste on trains used to test the CNNs. The images taken on trains were captured from the perspective of a hypothetical cleaning robot that would use these networks. We also compare the MobileNetV2, ShuffleNet, and SqueezeNet CNNs on their suitability for implementation in an automated train cleaning system, and on the optimum conditions for doing so. Training with the smaller dataset of images taken from a “robot-eye view” increased classification accuracy by 10.5% on average, and by as much as 26%, compared with training on the larger dataset of images of waste items in various poses. ShuffleNet was identified as the best-performing CNN for waste detection, achieving an accuracy of 88.61% when trained with the small dataset of images sympathetic to the end use. MobileNetV2 was found to perform best with a larger dataset of training images, even if these are less specific to the application of the network.
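The comparison reported in the abstract rests on two simple calculations: classification accuracy on the test set, and the per-network change in accuracy between the two training conditions. The Python sketch below illustrates that arithmetic only. The function names, the network labels `net_a` to `net_c`, and every number in it are hypothetical placeholders, not the authors' code or results, and it assumes the reported increases are differences in percentage points, which the abstract does not state explicitly.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Fraction of test images whose predicted class matches the true class."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one test image")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def accuracy_gains(acc_studio, acc_robot_view):
    """Per-network gain in percentage points, plus the mean and largest gain.

    Assumption: a gain is the difference between the two accuracies,
    expressed in percentage points.
    """
    gains = {
        name: 100.0 * (acc_robot_view[name] - acc_studio[name])
        for name in acc_studio
    }
    return gains, mean(gains.values()), max(gains.values())


# Placeholder accuracies for illustration only; NOT the paper's results.
acc_studio = {"net_a": 0.60, "net_b": 0.70, "net_c": 0.80}
acc_robot_view = {"net_a": 0.75, "net_b": 0.80, "net_c": 0.85}

gains, mean_gain, max_gain = accuracy_gains(acc_studio, acc_robot_view)
print(f"mean gain: {mean_gain:.1f} points, largest gain: {max_gain:.1f} points")
```

In practice, `accuracy` would be evaluated once per trained network on the held-out test images, and the resulting values would replace the placeholder dictionaries.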