Peng Huang, Pan Yang, Lijia Xu, Yuchao Wang, Jinfu Yuan, Zhiliang Kang
Title: Moisture content detection of Tibetan tea based on hyperspectral technology, machine vision and machine learning
Journal: Journal of Food Measurement and Characterization, 19(2), pp. 1167-1185 (impact factor 2.9; JCR Q2, Food Science & Technology)
Published: 2024-12-10
DOI: 10.1007/s11694-024-03032-5
URL: https://link.springer.com/article/10.1007/s11694-024-03032-5
Abstract
The moisture content of tea leaves plays a dominant role in their processing and storage, and directly affects their color, flavor and value. This study combines hyperspectral imaging with machine learning to detect tea moisture content nondestructively. Hyperspectral images of tea samples were collected over the wavelength range of 387 to 1035 nm; the region of interest (ROI) was selected in ENVI, the spectral information was extracted with Python, and the texture information was extracted using the gray-level co-occurrence matrix (GLCM). Models based on spectral, texture and fused spectral-texture features were then built to detect the moisture content of Tibetan tea. The original Tibetan tea spectral data (RAW) and the fused spectral-texture features were preprocessed using six algorithms: standard normal variate transformation (SNVT), multiplicative scatter correction (MSC), first-order derivative (FD), second-order derivative (SD), Savitzky-Golay (SG) filtering and Z-score standardization (ZSS). Features were extracted from the spectral, texture and fused data using the gradient boosting (GB), AdaBoost, random forest (RF), XGBoost, LightGBM and CatBoost algorithms; in each case the features were ranked by importance and the top 30 were used as inputs to the RFR, CatBoostR, LightGBMR and XGBoostR models. The XGBoost + CatBoostR model performed best, with \(R_{c}^{2}\) = 0.9814, \(R_{p}^{2}\) = 0.9788, RMSEC = 0.2064 and RMSEP = 0.2506. Based on these modeling results, the features extracted by the GB algorithm were selected as inputs, and a Stacking model was built with XGBoostR and CatBoostR as base learners and CatBoostR as the meta-learner. This model gave better predictions, with \(R_{c}^{2}\) = 0.9947, \(R_{p}^{2}\) = 0.9817, RMSEC = 0.1101 and RMSEP = 0.2326.
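As an illustration only, two of the preprocessing steps named in the abstract (SNVT and MSC) and the two reported metrics (\(R^{2}\) and RMSE) can be sketched in plain Python. This is not the authors' code: the function names `snv`, `msc`, `r2_score` and `rmse` are hypothetical, the spectra are toy values, and the use of the population standard deviation in SNV is an assumption (some implementations use the sample standard deviation). In practice a numerical library would normally replace these loops.

```python
import math
from statistics import mean, pstdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own
    standard deviation (population std here, which is an assumption)."""
    mu = mean(spectrum)
    sd = pstdev(spectrum)
    if sd == 0:
        raise ValueError("a flat spectrum has zero standard deviation")
    return [(x - mu) / sd for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction: regress each spectrum on the mean
    spectrum (s = intercept + slope * ref) and remove intercept and slope."""
    n_bands = len(spectra[0])
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_bands)]
    ref_mu = mean(ref)
    ref_ss = sum((r - ref_mu) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mu = mean(s)
        # Ordinary least-squares slope and intercept of s against ref.
        slope = sum((r - ref_mu) * (x - s_mu) for r, x in zip(ref, s)) / ref_ss
        intercept = s_mu - slope * ref_mu
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mu = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mu) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


if __name__ == "__main__":
    # Toy spectra (three bands each); real data would have hundreds of bands.
    toy_spectra = [[1.0, 2.0, 4.0], [3.0, 5.0, 9.0]]
    print(snv(toy_spectra[0]))
    print(msc(toy_spectra))
    # Toy moisture values, not taken from the paper.
    print(r2_score([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
    print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Computed on the calibration set, these two metrics correspond to \(R_{c}^{2}\) and RMSEC; computed on the prediction set, they correspond to \(R_{p}^{2}\) and RMSEP.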
About the journal:
This interdisciplinary journal publishes new measurement results, characteristic properties, differentiating patterns, measurement methods and procedures for such purposes as food process innovation, product development, quality control, and safety assurance.
The journal encompasses all topics related to food property measurement and characterization, including all types of measured properties of food and food materials, features and patterns, measurement principles and techniques, development and evaluation of technologies, novel uses and applications, and industrial implementation of systems and procedures.