Geographical Origin Identification of Red Chili Powder Using NIR Spectroscopy Combined with SIMCA and Machine Learning Algorithms

IF 2.6 | Zone 3, Agricultural and Forestry Sciences | JCR Q2, FOOD SCIENCE & TECHNOLOGY | Food Analytical Methods | Pub Date: 2024-04-26 | DOI: 10.1007/s12161-024-02625-6
Deepoo Meena, Somsubhra Chakraborty, Jayeeta Mitra
Food Analytical Methods, volume 17, issue 7, pages 1005-1023. Journal article. https://link.springer.com/article/10.1007/s12161-024-02625-6

Abstract

Knowing the geographical origin of chili peppers produced in specific areas is crucial because the origin of a chili powder variety has a significant impact on its quality and price. In this research, for the first time, NIR (near-infrared) spectroscopy was used to identify and classify the geographical origin of chili powder of 6 different varieties. PCA (principal component analysis) was used to extract relevant spectral features from the spectral data and to reveal visible cluster trends. For supervised classification, SIMCA (soft independent modeling of class analogy), a statistically based classification model, was applied alongside four machine learning (ML) classifiers: K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). The SVM classifier, with a C value of 4013.0 and γ of 0.04125, delivered the highest cross-validation accuracy (98.41%) and a prediction accuracy of 97.22%. The optimization process, guided by a detailed 3D contour plot, led to a model that not only generalized well but also offered remarkable precision, as confirmed by confusion matrices. The classification accuracy of the SIMCA model was 94.04% for the calibration set and 84.74% for the prediction set. The nonlinear SVM classifier outperformed the linear SIMCA model and the other ML models. Overall, the results indicate that chili powder from different geographic origins can be discriminated quickly, nondestructively, and reliably using NIR spectroscopy combined with the SVM model.
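To make the reported quantities concrete, the following is a minimal sketch, not the authors' implementation. It assumes that γ is the width parameter of an RBF (Gaussian) kernel, which the abstract does not state explicitly, and it uses an invented 6-class confusion matrix to show how an overall accuracy figure is read off such a matrix. The function names and the matrix counts are hypothetical; the abstract does not give the paper's test-set size or its confusion matrices.

```python
import math

# gamma value quoted in the abstract; treating it as an RBF kernel
# parameter is an assumption made for this illustration.
GAMMA = 0.04125


def rbf_kernel(x, z, gamma):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def accuracy(confusion):
    """Overall accuracy: diagonal (correct) counts divided by all counts."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion):
    """Recall per true class: correct count in a row divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Hypothetical confusion matrix (rows = true origin, columns = predicted origin).
# These counts are invented for illustration and are NOT taken from the paper.
confusion = [
    [6, 0, 0, 0, 0, 0],
    [0, 6, 0, 0, 0, 0],
    [0, 0, 5, 1, 0, 0],
    [0, 0, 0, 6, 0, 0],
    [0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 6],
]

if __name__ == "__main__":
    # Identical spectra give a kernel value of exactly 1; the value decays with distance.
    print(rbf_kernel([0.0, 0.0], [0.0, 0.0], GAMMA))
    print(rbf_kernel([0.0, 0.0], [1.0, 0.0], GAMMA))
    print(f"overall accuracy: {accuracy(confusion):.2%}")
    print(per_class_recall(confusion))
```

In a soft-margin SVM, C sets the penalty for misclassified training samples, so a large value such as 4013.0 favors fitting the training data closely, while a small γ keeps the kernel smooth. Cross-validation accuracy is the usual check that this combination still generalizes.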

Source journal: Food Analytical Methods (Agricultural and Forestry Sciences, Food Science & Technology)
CiteScore: 6.00
Self-citation rate: 3.40%
Articles published: 244
Review time: 3.1 months
Journal description: Food Analytical Methods publishes original articles, review articles, and notes on novel and/or state-of-the-art analytical methods or issues to be solved, as well as significant improvements or interesting applications to existing methods. These include analytical technology and methodology for food microbial contaminants, food chemistry and toxicology, food quality, food authenticity and food traceability. The journal covers fundamental and specific aspects of the development, optimization, and practical implementation in routine laboratories, and validation of food analytical methods for the monitoring of food safety and quality.