Leaf-based species classification of hybrid cherry tomato plants by using hyperspectral imaging
Authors: Songhao Li, Huilin Wu, Jing Zhao, Yu Liu, Yun Li, Houcheng Liu, Yiting Zhang, Yubin Lan, Xinglong Zhang, Yutao Liu, Yongbing Long
Journal: Journal of Near Infrared Spectroscopy, 31(1), pp. 41-51
Published: 2023-01-26
DOI: 10.1177/09670335221148593 (https://doi.org/10.1177/09670335221148593)
Impact factor: 1.6; JCR: Q3 (Chemistry, Applied)
Citation count: 0
Abstract
Approaches based on near-infrared hyperspectral imaging (NIR-HSI) combined with machine learning were developed to classify the leaves of hybrid cherry tomatoes and thereby identify the species of the plants. NIR hyperspectral images of 400 cherry tomato leaves (100 per species) were collected over the wavelength range 900–1700 nm. Linear discriminant analysis (LDA), random forest (RF), and support vector machine (SVM) models were built on hyperspectral data preprocessed with Savitzky-Golay (SG) smoothing, the first derivative (first Der), and standard normal variate (SNV). Principal component analysis (PCA) was also used to reduce the data dimension and extract spectral features. The LDA model reached the highest classification accuracy of the three algorithms, and SNV improved model accuracy more than SG smoothing or the first derivative. PCA-based spectral feature extraction indicates that the internal material content of the leaves differs between species, which is what allows the models to distinguish them. The study also examined how the mesophyll region (MR) and vein region (VR) affect the accuracy of the leaf classification model. Classification accuracy improved by 0.033 or 0.042 when the mesophyll replaced the vein or the whole leaf as the region of interest (ROI) from which reflectance spectra were extracted for modeling. As a result, the LDA classification model combined with SNV preprocessing reached accuracies of 0.998 on the training set and 0.973 on the test set. These results suggest that using the mesophyll region as the ROI can improve the performance of the leaf classification model, providing a new strategy for efficient, non-destructive classification of different hybrid cherry tomato plants.
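The abstract credits SNV preprocessing with the largest accuracy gain but does not spell out the transform. As a minimal sketch, assuming the usual definition (each spectrum is centred on its own mean and scaled by its own sample standard deviation), SNV could look like the following. The function name `snv` and the reflectance values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate for a single spectrum.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the abstract does not say which variant the authors applied.
    """
    mu = mean(spectrum)
    sigma = stdev(spectrum)
    if sigma == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(value - mu) / sigma for value in spectrum]


# Hypothetical reflectance values at five wavelengths (not data from the paper).
leaf_a = [0.42, 0.45, 0.51, 0.48, 0.40]
# Same spectral shape, but with a different gain and baseline offset,
# as might arise from scatter or illumination differences between leaves.
leaf_b = [2 * r + 0.1 for r in leaf_a]

snv_a = snv(leaf_a)
snv_b = snv(leaf_b)
```

After the transform, `snv_a` and `snv_b` agree up to floating-point error, which illustrates why SNV is commonly used to suppress per-sample gain and offset effects before a classifier such as LDA is fitted.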
About the journal:
JNIRS (Journal of Near Infrared Spectroscopy) is a peer-reviewed journal that publishes original research papers, short communications, review articles and letters on near infrared spectroscopy and technology, its applications, new instrumentation, and the use of chemometric and data-handling techniques within NIR.