Yield Estimation using Deep Learning for Precision Agriculture

Youssef Osman, Reed Dennis, Khalid Elgazzar
2021 IEEE 7th World Forum on Internet of Things (WF-IoT), June 14, 2021
DOI: 10.1109/WF-IoT51360.2021.9595143
We perform fruit counting on video footage using a two-stage pipeline: first detecting the fruits, then tracking them frame by frame. Detection uses the You Only Look Once (YOLO) model. Bounding boxes are extracted from the detections, and Non-Maximum Suppression (NMS) is applied to obtain the final set of boxes, which are then passed to the tracking pipeline. For tracking, we apply a custom-developed DeepSORT algorithm adapted to fruits. Using the box coordinates, each detected object is cropped out of the original image, and a convolutional neural network (CNN), ResNet, extracts a feature map from that crop. New detections are associated with existing tracks by comparing their features under a distance metric, where the two objects with minimal distance are associated together. Detections with no association are treated as new objects to be tracked. By tracking fruits across video frames, we ensure each fruit is counted exactly once, when it is first detected. We demonstrate the approach on videos from an apple orchard to test the performance of the proposed pipeline in natural light. Experimental results show high fruit-counting accuracy on real-time video feeds. The approach can be applied to any type of fruit or vegetable without changes to the algorithms.
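The abstract mentions Non-Maximum Suppression without detail. A minimal sketch of the standard greedy IoU-based NMS (not necessarily the authors' exact implementation) looks like this: boxes are sorted by confidence, and any box overlapping an already-kept box beyond a threshold is suppressed.

```python
import numpy as np

def iou(box, boxes):
    # Intersection-over-Union of one [x1, y1, x2, y2] box against many.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS: keep the highest-scoring box, drop boxes that
    # overlap it too much, repeat on the remainder.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        overlaps = iou(boxes[i], boxes[rest])
        order = rest[overlaps < iou_thresh]
    return keep
```

For example, two heavily overlapping detections of the same apple collapse to one box, while a distinct apple elsewhere in the frame is kept.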
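The association-and-count step could be sketched as follows. This is an illustrative greedy matcher, not the paper's DeepSORT implementation: the `FruitCounter` class, its `max_dist` threshold, and the toy 2-D feature vectors in the example stand in for ResNet embeddings and tuned parameters. Each incoming detection is matched to the nearest existing track by cosine distance; unmatched detections open new tracks and increment the count once, on first detection.

```python
import numpy as np

class FruitCounter:
    def __init__(self, max_dist=0.3):
        self.max_dist = max_dist  # cosine-distance gate for association
        self.tracks = {}          # track_id -> latest feature vector
        self.next_id = 0
        self.count = 0

    def update(self, features):
        """Associate one frame's detection features; return track IDs."""
        ids_assigned = []
        for f in np.asarray(features, dtype=float):
            best_id, best_d = None, self.max_dist
            for tid, tf in self.tracks.items():
                if tid in ids_assigned:
                    continue  # each track matches at most one detection
                d = 1.0 - np.dot(f, tf) / (np.linalg.norm(f) * np.linalg.norm(tf))
                if d < best_d:
                    best_id, best_d = tid, d
            if best_id is None:
                # No track is close enough: a new fruit, counted once.
                best_id = self.next_id
                self.next_id += 1
                self.count += 1
            self.tracks[best_id] = f  # refresh the track's appearance
            ids_assigned.append(best_id)
        return ids_assigned
```

Re-observing the same two fruits across frames leaves the count unchanged; only a feature vector far from every stored track opens a new ID. The real DeepSORT additionally gates matches with Kalman-filtered motion and solves the assignment optimally rather than greedily.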