{"title":"RPA and L-System Based Synthetic Data Generator for Cost-efficient Deep Learning Model Training","authors":"E. S., O. E. Ramos, Sixto Prado G.","doi":"10.1109/ECICE52819.2021.9645719","journal":{"name":"2021 IEEE 3rd Eurasia Conference on IOT, Communication and Engineering (ECICE)"},"publicationDate":"2021-10-29","citationCount":"2","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ECICE52819.2021.9645719"}
RPA and L-System Based Synthetic Data Generator for Cost-efficient Deep Learning Model Training
Deep learning (DL) models applied to computer vision have made great progress in image-based plant phenotyping in recent years, mostly for automating quality control processes in the agroindustry. On the one hand, these models can detect objects in complex and noisy images as fast as human observers; on the other hand, they must be trained with large amounts of labeled data for parameter tuning. This makes training an expensive, repetitive, and time-consuming task. In this work, a synthetic data generator named RPASD, based on robotic process automation (RPA) and Lindenmayer systems (L-Systems), is designed and implemented to train a DL model that detects artichoke seedlings in images captured by a robot. First, the growth of the artichoke seedling is modeled in the L+C language using the LStudio software. Second, RPASD is developed in Python to produce labeled images of grouped synthetic artichoke seedlings that, together with manually labeled images of real artichoke seedlings taken by a robot, form the PlantiNet database. Third, a YOLOv3 model is trained on three datasets built from this database: 1) real and synthetic seedlings, 2) only synthetic seedlings, and 3) only real seedlings. The results show a mean Intersection over Union (mIoU) of 55% when training only on the second dataset and testing on the third, which allows us to conclude that the proposed method could adequately support DL model training while reducing costs and time.
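The abstract does not reproduce the L+C seedling model, so the following is only a minimal Python sketch of the parallel string rewriting that underlies any L-System. It uses Lindenmayer's classic algae system (A -> AB, B -> A) as a stand-in, not the authors' artichoke model; the names `expand_lsystem` and `ALGAE_RULES` are hypothetical.

```python
def expand_lsystem(axiom, rules, iterations):
    """Rewrite every symbol of the string in parallel, `iterations` times.

    Symbols without a production rule are copied unchanged.
    """
    current = axiom
    for _ in range(iterations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Stand-in production rules (Lindenmayer's algae example),
# NOT the artichoke seedling model from the paper.
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for step in range(5):
        print(step, expand_lsystem("A", ALGAE_RULES, step))
```

In a plant model the symbols would be interpreted as drawing commands (stem segments, branches, leaves) with growth parameters attached, which is the role the L+C language plays in LStudio.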
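The reported metric can be illustrated with a short sketch of Intersection over Union for axis-aligned bounding boxes, averaged over a set of predictions. The abstract does not say how predictions are paired with ground-truth boxes, so this example assumes the pairs are already matched; the box format `(x_min, y_min, x_max, y_max)` and the function names are likewise assumptions.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def mean_iou(matched_pairs):
    """Average IoU over (predicted_box, ground_truth_box) pairs.

    Assumes the pairs were matched beforehand (hypothetical step).
    """
    if not matched_pairs:
        return 0.0
    return sum(box_iou(pred, truth) for pred, truth in matched_pairs) / len(matched_pairs)


if __name__ == "__main__":
    pairs = [((0, 0, 2, 2), (1, 1, 3, 3)), ((0, 0, 4, 4), (0, 0, 4, 4))]
    print(f"mIoU = {mean_iou(pairs):.3f}")
```

Under this definition, an mIoU of 55% means that a predicted box and its ground-truth box share, on average, a little more than half of their combined area.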