Rapid Raman spectroscopy analysis assisted with machine learning: a case study on Radix Bupleuri.

IF 3.3; Zone 2, Agricultural and Forestry Sciences; Q1 AGRICULTURE, MULTIDISCIPLINARY; Journal of the Science of Food and Agriculture; Pub Date: 2024-11-08; DOI: 10.1002/jsfa.14012
Fangjie Guo, Xudong Yang, Zhengyong Zhang, Shuren Liu, Yinsheng Zhang, Haiyan Wang

Abstract

Background: Radix Bupleuri is widely used for its many pharmacological effects. However, its safety and efficacy are hard to evaluate because the concentrations of its components are strongly affected by the surrounding environment. Radix Bupleuri samples of different varieties were therefore collected from different regions. Based on experimental and computational Raman spectra, machine learning methods were used to bring out otherwise obscured spectral characteristics: linear discriminant analysis (LDA), support vector machine (SVM), eXtreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM).

Results: After dimension reduction by LDA, SVM, XGBoost and LightGBM models were trained for classification and regression prediction of Bupleurum production regions. Support vector classifiers achieved the best accuracy, 98%, with an F1 score above 0.96 on the test set. Support vector regression fitted well, with an R² score above 0.90 and a relatively low mean squared error. However, the more complex models were prone to overfitting, resulting in poor generalization.
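
The four scores quoted in the Results (accuracy, F1, R² and mean squared error) have standard definitions, sketched below in plain Python. This is an illustration only, not the authors' code: the abstract does not say how F1 was averaged across regions, so macro averaging is an assumption here, and the function names, region labels and numeric values are all hypothetical.

```python
from math import fsum


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return fsum(scores) / len(scores)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return fsum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical labels for three made-up regions "A", "B", "C".
labels_true = ["A", "A", "B", "B", "C", "C"]
labels_pred = ["A", "A", "B", "C", "C", "C"]
print(accuracy(labels_true, labels_pred), macro_f1(labels_true, labels_pred))

# Hypothetical regression targets and predictions.
values_true = [1.0, 2.0, 3.0, 4.0]
values_pred = [1.5, 2.0, 2.5, 4.0]
print(mean_squared_error(values_true, values_pred), r2_score(values_true, values_pred))
```

An R² close to 1 means the predictions explain most of the variance in the targets, which is why it is usually reported together with the mean squared error rather than in place of it.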

Conclusion: Among these machine learning models, the typical LDA-SVM models, consistent with the high-performance liquid chromatography results, demonstrate great performance and stability. We envision that this rapid classification and regression technique can be extended to predictions for other herbs. © 2024 Society of Chemical Industry.
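
An LDA-SVM model of the kind named in the Conclusion could be assembled roughly as follows with scikit-learn. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for Raman spectra, and the sample count, number of spectral bins, number of regions, standardization step, kernel and hyperparameters are all made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for spectra (assumption): 300 samples, 60 spectral bins,
# 3 hypothetical production regions encoded as class labels 0, 1, 2.
X, y = make_classification(
    n_samples=300,
    n_features=60,
    n_informative=15,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize, project onto at most (n_classes - 1) = 2 discriminant axes
# with LDA, then classify the reduced features with a support vector classifier.
model = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(n_components=2),
    SVC(kernel="rbf", C=1.0),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, y_pred)
test_f1 = f1_score(y_test, y_pred, average="macro")

# The features that reach the SVM after the scaling and LDA steps.
reduced = model[:-1].transform(X_test)

print(f"reduced shape: {reduced.shape}")
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
print(f"test macro F1: {test_f1:.3f}")
```

Comparing the training and test accuracy is one simple way to spot the overfitting mentioned in the Results: a large gap between the two suggests that a model does not generalize.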

Source journal
CiteScore: 8.10
Self-citation rate: 4.90%
Articles published: 634
Review time: 3.1 months
About the journal: The Journal of the Science of Food and Agriculture publishes peer-reviewed original research, reviews, mini-reviews, perspectives and spotlights in these areas, with particular emphasis on interdisciplinary studies at the agriculture/food interface. Published for SCI by John Wiley & Sons Ltd. SCI (Society of Chemical Industry) is a unique international forum where science meets business on independent, impartial ground. Anyone can join and current Members include consumers, business people, environmentalists, industrialists, farmers, and researchers. The Society offers a chance to share information between sectors as diverse as food and agriculture, pharmaceuticals, biotechnology, materials, chemicals, environmental science and safety. As well as organising educational events, SCI awards a number of prestigious honours and scholarships each year, publishes peer-reviewed journals, and provides Members with news from their sectors in the respected magazine, Chemistry & Industry. Originally established in London in 1881 and in New York in 1894, SCI is a registered charity with Members in over 70 countries.