Application of Principal Component Analysis to advancing digital phenotyping of plant disease in the context of limited memory for training data storage

Journal of Scientific Agriculture. Pub Date: 2023-05-12. DOI: 10.25081/jsa.2023.v7.8327

Enow Albert, N. Bille, N. Leonard



Abstract

Although Principal Component Analysis (PCA) is widely used as an efficient dimensionality reduction technique, little research has examined how PCA-based compression/reconstruction of image data affects machine learning-based image classification performance and storage space use. To address this gap, we compared two Convolutional Neural Network-Random Forest (CNN-RF) guava leaf image classification models: one trained on the original guava leaf images that fit in a predefined amount of storage space, and one trained on the PCA compressed/reconstructed guava leaf images that fit in the same amount of storage space. The models were compared on four criteria: Accuracy, F1-Score, Phi Coefficient and the Fowlkes–Mallows index. Our approach achieved a 1:100 image compression ratio (99.00% image compression), which compares favorably with previously reported results for arithmetic coding (1:1.50), wavelet transform (90.00% image compression), and a combination of three transform-based techniques, namely the Discrete Fourier (DFT), Discrete Wavelet (DWT) and Discrete Cosine (DCT) transforms (1:22.50). From a subjective visual quality perspective, the PCA compressed/reconstructed guava leaf images showed almost no loss of image detail. Finally, the CNN-RF model trained on PCA compressed/reconstructed guava leaf images outperformed the CNN-RF model trained on original guava leaf images, with increases of 0.10% in accuracy, 0.10 in F1-Score, 0.18 in Phi Coefficient and 0.09 in the Fowlkes–Mallows index.
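As a rough illustration of what PCA-based compression/reconstruction of an image involves, the sketch below keeps the top-k principal components of a single grayscale image and rebuilds the image from them. The abstract does not describe the authors' actual pipeline, so everything here is an assumption: the function names (`pca_compress`, `pca_reconstruct`, `compression_ratio`), the choice to treat image rows as samples, the use of NumPy's SVD, and the way the ratio is measured (counting stored values rather than bytes on disk). It is not intended to reproduce the reported 1:100 ratio.

```python
import numpy as np


def pca_compress(image, k):
    """Keep the top-k principal components of a 2-D grayscale image.

    Rows are treated as samples and columns as features (an assumption).
    Returns the projected scores, the k principal axes, and the column means.
    """
    x = np.asarray(image, dtype=np.float64)
    mean = x.mean(axis=0)                      # per-column mean, shape (w,)
    centered = x - mean
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                        # shape (k, w)
    scores = centered @ components.T           # shape (h, k)
    return scores, components, mean


def pca_reconstruct(scores, components, mean):
    """Rebuild an approximation of the image from its compressed form."""
    return scores @ components + mean


def compression_ratio(image, scores, components, mean):
    """Original value count divided by the number of values kept."""
    stored = scores.size + components.size + mean.size
    return np.asarray(image).size / stored


if __name__ == "__main__":
    # Synthetic stand-in for a leaf image (hypothetical data, not from the paper).
    rng = np.random.default_rng(0)
    demo = rng.random((128, 128))
    s, c, m = pca_compress(demo, k=8)
    approx = pca_reconstruct(s, c, m)
    print("ratio:", compression_ratio(demo, s, c, m))
    print("mean abs error:", np.abs(demo - approx).mean())
```

Smaller values of `k` give a higher ratio at the cost of reconstruction error. Natural images tend to concentrate variance in a few components, which is why a reconstruction can look nearly unchanged even at aggressive settings.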


