ATR-FTIR Spectroscopy Preprocessing Technique Selection for Identification of Geographical Origins of Gastrodia elata Blume

Journal of Chemometrics (IF 2.3; Q1; Zone 4, Chemistry), Volume 38, Issue 10. Pub Date: 2024-07-03. DOI: 10.1002/cem.3579
Hong Liu, Honggao Liu, Jieqing Li, Yuanzhong Wang

Abstract

Gastrodia elata Blume grown in different regions experiences different growth conditions, soil types, and climates, which directly affect the content and quality of its medicinal components. Accurately identifying its origin helps ensure the medicinal value of G. elata Bl., prevents the circulation of counterfeit products, and thus protects the interests and health of consumers. Attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy is a rapid and effective method for verifying the authenticity of traditional Chinese medicines. However, scattering effects in the spectra make it difficult to establish reliable discrimination models, so appropriate scatter correction is crucial for improving both the quality of the spectral data and the accuracy of the models. This study uses two ensemble preprocessing approaches: the first is series fusion of scatter correction technologies (SCSF), and the second is sequential preprocessing through orthogonalization (SPORT). Four discriminant models were established, using single scatter correction techniques as well as the two ensemble preprocessing approaches. The results show that the data-driven version of soft independent modeling of class analogy (DD-SIMCA), built on spectra preprocessed with multiplicative scatter correction (MSC), has a sensitivity of 0.98 and a specificity of 0.91 and can effectively determine whether a sample of G. elata Bl. originates from Zhaotong. In addition, three types of discriminant model built using the ensemble preprocessing approaches, namely support vector machine (SVM), partial least squares discriminant analysis (PLS-DA), and three gradient boosting machine (GBM) algorithms, show good classification and generalization capabilities. Among them, the SCSF-PLS-DA model performs best, with accuracies of 99.68% and 98.08% on the training and test sets, respectively, and an F1 score of 0.97; the SPORT-SVM model has the second-best classification ability. These results indicate that the ensemble preprocessing approaches can improve the success rate of geographical origin classification for G. elata Bl.
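
As an illustration of the single scatter correction step mentioned in the abstract, the following is a minimal sketch of multiplicative scatter correction (MSC) in plain Python. It is not the authors' implementation: the function name `msc`, the use of the mean spectrum as the reference, and the toy data are all assumptions made for this example. Each spectrum is regressed on the reference, and the fitted offset and gain are then removed.

```python
from statistics import fmean


def msc(spectra, reference=None):
    """Multiplicative scatter correction for a list of equal-length spectra.

    Each spectrum x is modeled as x ~ a + b * r for a reference spectrum r,
    and the corrected spectrum is (x - a) / b.
    """
    if reference is None:
        # Assumption for this sketch: use the mean spectrum as the reference.
        reference = [fmean(column) for column in zip(*spectra)]

    r_mean = fmean(reference)
    # Sum of squared deviations of the reference (must be non-zero).
    r_ss = sum((r - r_mean) ** 2 for r in reference)

    corrected = []
    for x in spectra:
        x_mean = fmean(x)
        # Ordinary least-squares slope (b) and intercept (a) of x against r.
        b = sum((r - r_mean) * (xi - x_mean) for r, xi in zip(reference, x)) / r_ss
        a = x_mean - b * r_mean
        # Remove the additive offset and the multiplicative gain.
        corrected.append([(xi - a) / b for xi in x])
    return corrected, reference


# Hypothetical toy data: one band shape recorded with different offset and gain.
base = [0.10, 0.40, 0.90, 0.40, 0.10]
raw = [
    [0.05 + 1.2 * v for v in base],
    [-0.02 + 0.8 * v for v in base],
]
corrected, ref = msc(raw)
```

In this toy case the two raw spectra differ only by an offset and a gain, so after correction both collapse onto the reference spectrum. Real ATR-FTIR spectra also differ chemically, and those differences remain after correction for the discriminant model to use.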

Source journal
Journal of Chemometrics (Chemistry, Analytical Chemistry)
CiteScore: 5.20
Self-citation rate: 8.30%
Articles published: 78
Review time: 2 months
Journal description: The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.