RPA and L-System Based Synthetic Data Generator for Cost-efficient Deep Learning Model Training

2021 IEEE 3rd Eurasia Conference on IOT, Communication and Engineering (ECICE). Pub Date: 2021-10-29. DOI: 10.1109/ECICE52819.2021.9645719

E. S., O. E. Ramos, Sixto Prado G.

Citations: 2

Abstract

Deep learning (DL) models applied to computer vision have made great progress in image-based plant phenotyping in recent years, mostly for automating quality control processes in the agroindustry. On the one hand, these models can detect objects in complex and noisy images as fast as human observers; on the other hand, they must be trained on a large amount of labeled data to tune their parameters. Labeling that data makes training expensive, repetitive, and time-consuming. In this work, a synthetic data generator named RPASD, based on robotic process automation (RPA) and Lindenmayer systems (L-Systems), is designed and implemented to train a DL model that detects artichoke seedlings in images captured by a robot. First, the growth of the artichoke seedling is modeled in the L+C language using the LStudio software. Second, the RPASD is developed in Python to produce labeled images of grouped synthetic artichoke seedlings; together with manually labeled images of real artichoke seedlings taken by a robot, these form the PlantiNet database. Third, a YOLOv3 model is trained on three datasets built from this database: 1) real and synthetic seedlings, 2) only synthetic seedlings, and 3) only real seedlings. The results show a mean Intersection over Union (mIoU) of 55% when training only on the second dataset and testing on the third, which allows us to conclude that the proposed method could adequately boost DL model training while reducing cost and time.
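The mIoU figure reported in the abstract can be made concrete with a small sketch. The abstract does not say how predicted and ground-truth boxes are matched, so the rule used here (each ground-truth box is scored by its best-overlapping prediction, and a missed box scores 0) is an assumption. The names `Box`, `iou`, and `mean_iou` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(truth: List[Box], predicted: List[Box]) -> float:
    """Mean, over ground-truth boxes, of the best IoU with any prediction.

    Assumed matching rule (not from the paper): a ground-truth box with
    no overlapping prediction contributes 0 to the mean.
    """
    if not truth:
        return 0.0
    best = [max((iou(t, p) for p in predicted), default=0.0) for t in truth]
    return sum(best) / len(best)
```

Under this rule, an image with two labeled seedlings where only one is detected exactly gives `mean_iou([(0, 0, 2, 2), (10, 10, 12, 12)], [(0, 0, 2, 2)]) == 0.5`.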
