A comparative analysis of deep learning and chemometric approaches for spectral data modeling

Analytica Chimica Acta, vol. 1347, Article 343766 · IF 6.0 · JCR Q1 (Chemistry, Analytical) · Zone 2 (Chemistry) · Pub Date: 2025-02-10 · DOI: 10.1016/j.aca.2025.343766
Rúben Gariso, João P.L. Coutinho, Tiago J. Rato, Marco S. Reis
https://www.sciencedirect.com/science/article/pii/S0003267025001606

Abstract

Background:

This study presents a comprehensive comparison of five different modeling approaches for spectroscopic data analysis. The first approach uses PLS combined with classical chemometric pre-processing (9 models). The second and third approaches use interval PLS (iPLS) with either classical pre-processing or wavelet transforms (28 models). The fourth uses LASSO with wavelet transforms (5 models). Finally, the fifth approach combines CNN with spectral pre-processing (9 models).
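
As a rough illustration of the interval PLS idea named above, the sketch below splits a spectrum into equal-width intervals, fits a single-response PLS model on each interval, and keeps the interval with the lowest cross-validated RMSE. Everything in it is an assumption made for illustration: the helper names (`pls1_fit`, `cv_rmse`, `ipls_select`), the synthetic spectra, the choice of 10 intervals, 2 latent variables, and 5 folds. The abstract does not say which iPLS variant or settings the authors used.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (x_mean, y_mean, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def cv_rmse(X, y, n_components, n_folds=5, seed=0):
    """K-fold cross-validated RMSE of a PLS model."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        model = pls1_fit(X[train], y[train], n_components)
        sq_err += np.sum((pls1_predict(model, X[test]) - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def ipls_select(X, y, n_intervals=10, n_components=2):
    """Score each equal-width interval separately and return the best one."""
    edges = np.linspace(0, X.shape[1], n_intervals + 1).astype(int)
    scores = np.array([cv_rmse(X[:, a:b], y, n_components)
                       for a, b in zip(edges[:-1], edges[1:])])
    return int(np.argmin(scores)), scores, edges

# Synthetic example (hypothetical data): 40 spectra, 200 wavelengths.
# The response only affects a band centred on column 130; a second band
# at column 40 varies independently and acts as an interferent.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 40, 200
cols = np.arange(n_wavelengths)
y = rng.uniform(0.0, 1.0, n_samples)
analyte_band = np.exp(-0.5 * ((cols - 130) / 4.0) ** 2)
interferent_band = np.exp(-0.5 * ((cols - 40) / 4.0) ** 2)
X = (np.outer(y, analyte_band)
     + np.outer(rng.uniform(0.0, 1.0, n_samples), interferent_band)
     + 0.01 * rng.standard_normal((n_samples, n_wavelengths)))

best, scores, edges = ipls_select(X, y)
print(f"best interval: {best} (columns {edges[best]}..{edges[best + 1] - 1})")
```

Because the selected interval maps directly to a wavelength range, the result stays easy to inspect, which is one reason interval methods are attractive when only a few dozen training samples are available.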

Results:

We consider two low-dimensional case studies: a regression problem on a beer dataset (40 training samples) and a classification problem on a waste lubricant oil dataset (273 training samples). After exhaustive pre-processing selection, iPLS variants perform better in the first case study and remain competitive in the second. Wavelet transforms proved to be a viable alternative to classical pre-processing, improving performance for both linear and CNN models while maintaining interpretability. In the second case study, which has more data, CNNs perform well when applied to raw spectra and could potentially be used to avoid exhaustive pre-processing selection. However, CNNs can also benefit from some types of pre-processing, which led to improved performance in the first case study and overall better performance in the second.
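
To make the wavelet-plus-LASSO combination more concrete, here is a minimal sketch under stated assumptions: spectra are mapped to an orthonormal Haar wavelet basis, and a LASSO model is fitted on the coefficients by proximal gradient descent (ISTA). The Haar family, the three decomposition levels, the ISTA solver, the penalty value, and the synthetic data are all illustrative choices; the abstract does not specify the wavelet family or the solver. The function names are hypothetical.

```python
import numpy as np

def haar_dwt(x, levels):
    """Orthonormal multi-level Haar transform along the last axis.

    Output layout: [approx_L, detail_L, ..., detail_1].
    The signal length must be divisible by 2**levels.
    """
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        even, odd = approx[..., 0::2], approx[..., 1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return np.concatenate([approx] + details[::-1], axis=-1)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, y, w, lam):
    return 0.5 * np.mean((y - A @ w) ** 2) + lam * np.sum(np.abs(w))

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimise (1/2n)||y - A w||^2 + lam * ||w||_1 by proximal gradient."""
    n = len(y)
    step = n / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic example (hypothetical data): 40 spectra, 64 wavelengths,
# with the response driving a single band centred on column 30.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 40, 64
cols = np.arange(n_wavelengths)
y = rng.uniform(0.0, 1.0, n_samples)
band = np.exp(-0.5 * ((cols - 30) / 3.0) ** 2)
X = np.outer(y, band) + 0.01 * rng.standard_normal((n_samples, n_wavelengths))

Xc, yc = X - X.mean(axis=0), y - y.mean()      # centre spectra and response
A = haar_dwt(Xc, levels=3)                     # wavelet-domain features
lam_max = np.max(np.abs(A.T @ yc)) / n_samples # above this, all weights are zero
w = lasso_ista(A, yc, lam=0.1 * lam_max)
print("non-zero wavelet coefficients:", np.count_nonzero(w), "of", w.size)
```

Each retained coefficient corresponds to a known scale and position in the spectrum, which is one way to read the abstract's remark that wavelet features keep the model interpretable.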

Significance and Novelty:

This study provides a critical and exhaustive comparison of combinations of pre-processing methods and models for spectroscopic data analysis. It was found that no single combination of pre-processing and model can be identified as optimal beforehand in low-data settings.
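
If no combination can be chosen in advance, the practical consequence is an empirical search over pre-processing and model settings. The sketch below shows one hedged way to organise such a search with cross-validation. The candidate pre-processing steps (raw, SNV, first derivative), the ridge regression used as a stand-in model, the penalty grid, and the synthetic data are assumptions for illustration only; they are not the 51 combinations evaluated in the study.

```python
import numpy as np

def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def first_derivative(X):
    """First difference along the wavelength axis."""
    return np.diff(X, axis=1)

# Row-wise transforms only, so applying them before splitting leaks nothing.
PREPROCESSORS = {"raw": lambda X: X, "snv": snv, "first_derivative": first_derivative}

def ridge_cv_rmse(X, y, alpha, n_folds=5, seed=0):
    """K-fold cross-validated RMSE of ridge regression (dual form)."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        mu, y_mu = X[train].mean(axis=0), y[train].mean()
        Xc = X[train] - mu
        # Dual solution: cheap when there are far fewer samples than wavelengths
        dual = np.linalg.solve(Xc @ Xc.T + alpha * np.eye(len(train)), y[train] - y_mu)
        pred = (X[test] - mu) @ (Xc.T @ dual) + y_mu
        sq_err += np.sum((pred - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def select_combination(X, y, alphas=(1e-3, 1e-1, 1e1)):
    """Evaluate every (pre-processing, alpha) pair and return the best by CV RMSE."""
    results = {(name, a): ridge_cv_rmse(f(X), y, a)
               for name, f in PREPROCESSORS.items() for a in alphas}
    return min(results, key=results.get), results

# Synthetic example (hypothetical data) with multiplicative scatter and a baseline offset.
rng = np.random.default_rng(1)
n, p = 40, 100
cols = np.arange(p)
y = rng.uniform(0.0, 1.0, n)
band = np.exp(-0.5 * ((cols - 60) / 5.0) ** 2)
scatter = rng.uniform(0.8, 1.2, (n, 1))
offset = rng.uniform(-0.5, 0.5, (n, 1))
X = scatter * (1.0 + np.outer(y, band)) + offset + 0.01 * rng.standard_normal((n, p))

best, results = select_combination(X, y)
for key in sorted(results, key=results.get):
    print(key, round(float(results[key]), 4))
```

With so few samples, the ranking produced by such a search can shift with the fold assignment, so repeating it with several seeds is a sensible precaution before trusting any single winner.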
