Pub Date: 2024-07-01 | DOI: 10.1109/TAFE.2024.3416221
Divyansh Thakur;Vikram Kumar
In this work, we developed and enhanced an artificial intelligence (AI)-centered hardware framework. This framework integrates the Nvidia Jetson Nano processing unit with a Depth AI camera. Our primary goal was to create an improved version of the YOLOv7 algorithm to quantify apple fruits using edge computing resources. We curated a dataset of 9,000 images of apple fruits to support this effort. Within the enhanced YOLOv7 architecture, we introduced a novel dual attention mechanism called the Global-SE Unified Attention Mechanism (GSEAM). This mechanism was designed to improve the accuracy of object detection by combining spatial and channel-oriented attention mechanisms, significantly enhancing the model's contextual understanding and object recognition in various settings. The incorporation of GSEAM, along with the Gaussian Error Linear Unit activation function, was a deliberate effort to boost the YOLOv7 architecture's ability to capture intricate contextual details and hierarchical features. Our system's performance was rigorously evaluated across six key performance metrics and compared with other pretrained models. We achieved a precision of 99.54%, recall of 98.94%, F1-score of 99.71%, and average precision of 99.13%. This system has proven to be a valuable tool for real-time apple fruit counting, with practical applications for farmers.
FruitVision: Dual-Attention Embedded AI System for Precise Apple Counting Using Edge Computing. IEEE Transactions on AgriFood Electronics, vol. 2, no. 2, pp. 445-459, 2024.
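The abstract above pairs channel-oriented attention with the Gaussian Error Linear Unit (GELU) but does not spell out the internals of GSEAM. As a rough illustration only, the sketch below shows a generic squeeze-and-excitation (SE) style channel gate with GELU in the bottleneck layer. It is not the authors' GSEAM: the spatial branch is omitted, and the function names, the nested-list tensor layout, and the weight matrices `w_reduce` and `w_expand` are all hypothetical choices made for this example.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def se_channel_gate(feature_map, w_reduce, w_expand):
    """Generic SE-style channel attention (illustrative, not the paper's GSEAM).

    feature_map: nested lists shaped [C][H][W]
    w_reduce:    hypothetical bottleneck weights shaped [R][C]
    w_expand:    hypothetical expansion weights shaped [C][R]
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excite: two small linear layers. GELU is used in the hidden layer here
    # as an assumption; the original SE design uses ReLU at this point.
    hidden = [gelu(sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Toy input: two 2x2 channels and arbitrary example weights.
    fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
    scaled, gates = se_channel_gate(fmap, [[0.5, -0.5]], [[1.0], [-1.0]])
    print(gates)
```

In a trained network the two weight matrices would be learned, so channels that carry useful context receive gates near 1 and less informative channels are suppressed.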
Pub Date: 2024-06-28 | DOI: 10.1109/TAFE.2024.3414953
Stephany Valarezo-Plaza;Julio Torres-Tello;Keshav D. Singh;Steve J. Shirtliffe;S. Deivalakshmi;Seok-Bum Ko
The escalating global demand for food, coupled with challenges in sustaining crop production, deteriorating ocean health, and depleting natural resources, underscores the critical role of agricultural technology. This article addresses the imperative of developing an optimal deep-learning model for predicting canola crop yield using hyperspectral images captured by drone flights. Our primary objective is to identify the most efficient model in terms of performance and size, considering the storage limitations on edge devices like Raspberry Pi 4 (RPi4). We start with the baseline 1D _