Machine learning methods in near infrared spectroscopy for predicting sensory traits in sweetpotatoes.

Judith Ssali Nantongo, Edwin Serunkuma, Gabriela Burgos, Mariam Nakitto, Fabrice Davrieux, Reuben Ssali
Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy, vol. 318, 124406. Published 2024-10-05 (Epub 2024-05-04). DOI: https://doi.org/10.1016/j.saa.2024.124406

Abstract

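Near infrared (NIR) spectroscopy has established potential for estimating sensory traits, because these properties produce direct spectral responses in the NIR region. In sweetpotato, sensory and texture traits are key to improving the acceptability of the crop for food security and nutrition. Previous studies have modelled the levels of sensory characteristics from NIR spectra using partial least squares (PLS) regression. Many more advanced techniques are available that could improve prediction accuracy by better modelling fresh (wet and unprocessed) samples or nonlinear relationships. This study compared the performance of quantitative prediction models for sensory traits developed with different machine learning methods. Overall, the linear methods, namely linear support vector machine (L-SVM), principal component regression (PCR) and PLS, exhibited higher mean R² values than the other methods. Across all 27 sensory traits, calibration models using L-SVM and PCR had a slightly higher overall R² (mean = 0.33) than PLS (mean = 0.32) and radial-based SVM (NL-SVM; mean = 0.30). Orange color intensity was the best-predicted trait in all calibration models (R² = 0.87 to 0.89). Elastic net linear regression (ENR) and the tree-based methods, extreme gradient boosting (XGBoost) and random forest (RF), performed worse than expected, but could possibly improve with a larger sample size. Their calibration models had lower average R² values: ENR (mean = 0.26), XGBoost (mean = 0.26) and RF (mean = 0.22). The overall RMSE of the calibration models was lower for PCR (mean = 0.82) than for L-SVM (mean = 0.86) and PLS (mean = 0.90). ENR, XGBoost and RF also had higher RMSE (mean = 0.90 to 0.92). Selecting effective wavelengths with interval partial least squares regression (iPLS) improved model performance, but iPLS did not perform as well as PLS. Standard normal variate (SNV) pre-treatment was useful in improving model performance.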
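The abstract names SNV pre-treatment and reports R² and RMSE without giving formulas. The sketch below shows the textbook definitions of all three in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the sample standard deviation (n - 1 denominator) is assumed for SNV, R² is assumed to be 1 - SS_res / SS_tot, and every number is made up.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own standard deviation (row-wise, per sample)."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # assumption: sample SD with n - 1 denominator
    if sd == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(x - mu) / sd for x in spectrum]


def rmse(observed, predicted):
    """Root mean squared error between reference and predicted scores."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mu = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mu) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only; none of these come from the paper.
raw_spectrum = [0.42, 0.45, 0.51, 0.48]   # hypothetical absorbances
treated = snv(raw_spectrum)

observed = [3.0, 5.0, 7.0, 9.0]           # hypothetical sensory panel scores
predicted = [3.5, 4.5, 7.5, 8.5]          # hypothetical model predictions

print("SNV spectrum:", [round(v, 3) for v in treated])
print("RMSE:", rmse(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

After SNV each spectrum has zero mean and unit standard deviation. Since the correction is computed per sample, it is commonly used to reduce scatter-related baseline and scaling differences between samples before a regression model is fitted.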
