Combinatorial Order Pre-processing Search (COPS): A new pre-processing strategy for large-scale interpretable data analysis in process analytical technologies

Computers & Chemical Engineering (IF 3.9; Zone 2, Engineering & Technology; JCR Q2, Computer Science, Interdisciplinary Applications) Pub Date: 2024-10-11 DOI: 10.1016/j.compchemeng.2024.108892
Wilson Cardoso, Jussara V. Roque, Jeroen J. Jansen, Sin Yong Teng, Reinaldo F. Teófilo
Computers & Chemical Engineering, Volume 192, Article 108892. Full text: https://www.sciencedirect.com/science/article/pii/S0098135424003107

Abstract

Combinatorial Order Pre-processing Search (COPS), a novel approach for optimizing data pre-processing, is proposed in this work. Unlike simultaneous hyperparameter optimization, COPS employs a priori optimization to reduce computational time while refining the search space for pre-processing sequences and combinations. It allows a maximum number of pre-processing methods to be set while efficiently searching through combinations of methods using chemically relevant knowledge. In this work, 67 calibration datasets spanning various analytical techniques were evaluated, including fluorescence spectroscopy, gas chromatography (GC), near-infrared spectroscopy (NIR), mid-infrared spectroscopy (MIR), visible-near-infrared spectroscopy (Vis-NIR), Raman spectroscopy, nuclear magnetic resonance (NMR) spectroscopy, and voltammetry. COPS yielded significant improvements over existing methodologies based on design of experiments and compounded pre-processing approaches. COPS outperformed the other methods, with an average reduction in root mean square error of prediction (RMSEP) of 31.7%, while also reducing model complexity (the number of latent variables), which allows for easier interpretation. This underscores the importance of combinatorial order set theory in searching for pre-processing method combinations (without fixing the sequence of pre-processing methods) to enhance model performance and interpretation. The novel COPS approach can be employed in process analytical technology (such as inline, online, or at-line chemical sensing analytics) to enhance predictive accuracy and operational efficiency, fundamentally transforming the quality and reliability of chemical process monitoring and control.
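
The abstract describes the search space only in words: ordered combinations of pre-processing methods, capped at a maximum number of methods, with the sequence not fixed in advance. The sketch below illustrates that idea and is not the authors' implementation. The three methods (`mean_center`, `snv`, `first_difference`), the `METHODS` registry, and the rule that a method is used at most once per sequence are all assumptions made for illustration; the knowledge-based pruning and the model evaluation mentioned in the abstract are not shown.

```python
from itertools import permutations
from math import sqrt


def mean_center(x):
    """Subtract the signal's own mean from every point."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def snv(x):
    """Standard normal variate: center and scale a signal by its own std."""
    m = sum(x) / len(x)
    sd = sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))
    return [(v - m) / sd for v in x]


def first_difference(x):
    """Crude first derivative: difference between neighbouring points."""
    return [b - a for a, b in zip(x, x[1:])]


# Hypothetical method registry; the names are illustrative only.
METHODS = {"center": mean_center, "snv": snv, "diff": first_difference}


def ordered_sequences(names, max_methods):
    """Yield every ordered selection of 0..max_methods distinct methods."""
    for k in range(max_methods + 1):
        yield from permutations(names, k)


def apply_sequence(seq, x):
    """Apply the named methods to signal x, left to right."""
    for name in seq:
        x = METHODS[name](x)
    return x


if __name__ == "__main__":
    candidates = list(ordered_sequences(list(METHODS), max_methods=2))
    print(len(candidates), "candidate sequences")
    signal = [1.0, 2.0, 4.0, 7.0]
    # The same two methods in a different order give different signals.
    print(apply_sequence(("diff", "snv"), signal))
    print(apply_sequence(("snv", "diff"), signal))
```

With three methods and a cap of two, the generator yields 1 + 3 + 6 = 10 candidates, including the empty sequence (no pre-processing). The two orderings of `snv` and `diff` produce different signals, which is why a search that does not fix the sequence has more candidates to consider than one that only picks which methods to include.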

Source journal
Computers & Chemical Engineering (Engineering & Technology: Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Articles published: 374
Review time: 70 days
About the journal: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.