
Chemometrics and Intelligent Laboratory Systems: Latest Articles

Distributed learning of deep residual principal component analysis for large-scale industrial process monitoring
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-05 DOI: 10.1016/j.chemolab.2026.105629
Ouguan Xu, Zeyu Yang, Zhiqiang Ge
Distributed principal component analysis (PCA) has been widely used for monitoring large-scale industrial processes in recent years, with many improved forms and extensions. This paper introduces a deep residual form of PCA into the distributed modeling framework to improve monitoring performance for large-scale industrial processes. A deep residual PCA model is developed for feature engineering in each separate block of the process, and the augmented features extracted in the different blocks are then combined at a second level to construct an additional deep residual PCA model. By further augmenting the features extracted from different layers of the deep residual model, the final process monitoring scheme is formulated for large-scale industrial processes. In two industrial case studies, the proposed distributed deep learning model improved monitoring performance by more than 20%, while the computational burden of the new method remained low.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105629
Citations: 0
Ensemble robust SIMPLS with block-penalized smoothing for scalar-on-function regression
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-05 DOI: 10.1016/j.chemolab.2025.105626
Aylin Alin
We propose a robust penalized smooth partial least squares approach that (i) smooths high-dimensional discretized functional predictors via blockwise B-spline bases, (ii) applies robust SIMPLS to obtain latent scores, (iii) fits a penalized regression in the latent space whose penalty is exactly a block-diagonal roughness penalty on the coefficient function(s), and (iv) aggregates models through bootstrap ensembles (classical/sufficient resampling; mean/median aggregation). The method supports multiple functional predictors through a block-diagonal construction and yields interpretable smooth coefficient functions. Our method demonstrates competitive or superior performance under collinearity and contamination.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105626
Citations: 0
An efficient automated deep spatio-temporal feature learning framework for industrial soft sensing
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-02 DOI: 10.1016/j.chemolab.2025.105623
Xiaogang Deng, Ziheng Wang, Lumeng Huang, Ping Wang
Deep learning neural networks have been widely adopted for developing quality prediction models in industrial processes. Despite their strong capability for extracting nonlinear intrinsic features, existing models have notable drawbacks: insufficient capture of local spatio-temporal features, high computational complexity of model training, difficulty in determining the deep model structure, and a lack of model interpretability. To address these issues, this paper presents an efficient automated deep spatio-temporal feature learning framework for dynamic industrial process soft sensing, named Deep Convolutional Partial Least Squares (DeCPLS). The proposed approach introduces the convolutional Partial Least Squares (CPLS) model as a basic feature extraction unit and stacks multiple CPLS layers to construct an efficient deep dynamic feature learning model. A layerwise training mechanism is presented to facilitate the automated determination of model structures and hyperparameters, thereby reducing the computational complexity. Furthermore, a model prediction error explanation mechanism is introduced to analyze prediction outcomes effectively. Compared with classical deep neural networks, the proposed method efficiently captures local spatio-temporal features while maintaining acceptable computational complexity. Finally, the superiority of the proposed method is validated through a simulated industrial case study and a real-world industrial application.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105623
Citations: 0
A novel statistical framework to address FTIR spectral challenges: Hybrid MARS–PCA/KPCA models for pollutants analysis in honey samples
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2026-01-02 DOI: 10.1016/j.chemolab.2025.105625
Sughra Sarwar, Tahir Mehmood, Mudassir Iqbal
Fourier Transform Infrared (FTIR) spectroscopy enables the rapid and non-destructive examination of complex materials; however, its high-dimensional data bring challenges of dimensionality, noise, and outliers. Conventional regression methods, such as Multivariate Adaptive Regression Splines (MARS), often struggle with these problems, which frequently involve small sample sizes and numerous variables, particularly in chemometric studies. This study proposed a hybrid framework that combines MARS with Principal Component Analysis (PCA), Kernel PCA (KPCA), and linear regression to overcome these drawbacks. FTIR was used to investigate 30 honey samples from various geographical areas in Pakistan. Before modeling, outlier detection was conducted using the Mahalanobis distance, computed in both PCA- and KPCA-transformed spaces, with Minimum Covariance Determinant (MCD) estimators applied to identify and remove statistical outliers. This preprocessing step ensured that anomalous samples did not influence the model's accuracy. The model consistently identified chemically relevant wavenumbers, especially in the high-energy C–H, O–H, and N–H stretching zones (e.g., 2966, 3010, and 3584 cm⁻¹). These wavenumbers correspond to important functional groups involved in the absorption and bioaccumulation of pollutants in honey. To assess model performance, we used a 70:30 train-test split. The MARS-PCA-LR model, with RMSE = 1.2905, MAE = 1.2725 and MSE = 1.2887, outperformed the normal MARS baseline (RMSE = 4.5860, MAE = 3.4267 and MSE = 21.0319) and the MARS-KPCA-LR model (RMSE = 1.5017, MAE = 1.3300 and MSE = 1.5013) in terms of prediction accuracy. These results suggest that the proposed MARS-PCA-LR and MARS-KPCA-LR models offer improved interpretability and robustness, making them strong and reliable techniques for analyzing high-dimensional spectral data.
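The three error measures quoted in this abstract have standard textbook definitions. The following is a minimal sketch of those definitions for held-out predictions; the function name `regression_errors` and the toy values are illustrative assumptions, and the abstract does not state how the authors computed their figures.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (RMSE, MAE, MSE) for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not code from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)   # mean squared error
    mae = sum(abs(r) for r in residuals) / len(residuals)  # mean absolute error
    rmse = math.sqrt(mse)                                  # root of the MSE
    return rmse, mae, mse


if __name__ == "__main__":
    # Toy test-set values, invented for this example only.
    print(regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Under these definitions RMSE is always the square root of MSE when both are computed on the same samples.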
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105625
Citations: 0
Deep learning-driven components analysis of Raman spectral mixtures: An integrated masked autoencoder with convolutional neural network approach
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-30 DOI: 10.1016/j.chemolab.2025.105627
Zichuan Bu, Jihong Liu, Jiageng Zhang, Chi Liu, Yihua Liu, Kaili Ren, Xuewen Yan, Wei Gao, Jun Dong
Raman spectroscopy is a pivotal tool in analytical and physical chemistry, yet its application in complex systems is hindered by spectral superposition and analysis challenges. The development of deep learning technology has provided new ideas for the component analysis of complex mixtures. This study proposes a mixture component identification method named MCI, which is based on a masked autoencoder (MAE) and a convolutional neural network (CNN). The aim is to effectively solve the problems of qualitative recognition and quantitative analysis in the Raman spectra of mixtures. The MCI method adopts a multi-stage framework: First, the Voigt function is used to accurately extract the characteristic peaks of the mixture. Second, the MAE model is employed to reconstruct the corresponding pure-substance spectra. Then, the CNN model is used to conduct qualitative and quantitative analyses on the reconstructed spectra. Finally, the spectrum of the remaining components is obtained by subtracting the reconstructed spectrum from the mixture spectrum. By iterating the above process, the step-by-step unmixing of complex mixtures is achieved. On the generated mixed-sample test data, MCI outperforms the other three comparative models in terms of complete recognition accuracy in qualitative analysis and the evaluation indicators of each substance, while maintaining a lower average concentration error in quantitative analysis. Moreover, for complex mixtures containing interfering substances, MCI shows strong anti-interference ability and maintains a high identification accuracy. In the identification of measured Raman spectra of mixed samples, the MCI model achieved an average accuracy and F1_Score of 97 % across all test samples, further verifying its reliability and practicality in detecting the main components of real and complex mixtures.
In summary, this study provides a new technical method for Raman spectral analysis of complex mixtures, which holds certain theoretical significance and practical value.
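The identify, reconstruct, subtract, repeat loop described in this abstract can be illustrated with a much simpler stand-in. The sketch below assumes a small library of known pure spectra and replaces the paper's masked autoencoder and CNN with a plain least-squares match against that library (a greedy matching-pursuit step). The function name `peel_components`, the library, and the toy spectra are all hypothetical.

```python
def peel_components(mixture, library, max_components=5, min_gain=1e-9):
    """Greedy step-by-step unmixing by repeated subtraction.

    Stand-in for illustration only: each round picks the library spectrum
    whose non-negative least-squares scaling removes the most residual
    energy, records it, and subtracts it from the residual.
    """
    residual = list(mixture)
    found = []
    for _ in range(max_components):
        best = None
        for name, ref in library.items():
            energy = sum(v * v for v in ref)
            if energy == 0:
                continue
            # Least-squares scale of ref onto the residual, clipped at zero.
            coeff = max(0.0, sum(r * v for r, v in zip(residual, ref)) / energy)
            gain = coeff * coeff * energy  # residual energy explained by this match
            if best is None or gain > best[0]:
                best = (gain, name, coeff)
        if best is None or best[0] <= min_gain:
            break  # nothing left worth explaining
        _, name, coeff = best
        found.append((name, coeff))
        residual = [r - coeff * v for r, v in zip(residual, library[name])]
    return found, residual


if __name__ == "__main__":
    # Toy four-channel "spectra", invented for this example.
    library = {"A": [1.0, 0.0, 0.0, 0.0],
               "B": [0.0, 1.0, 0.0, 0.0],
               "C": [0.0, 0.0, 1.0, 1.0]}
    mixture = [2.0, 0.0, 0.5, 0.5]  # 2.0 * A + 0.5 * C
    print(peel_components(mixture, library))
```

With overlapping reference spectra the greedy choice is only approximate and the same component can be selected more than once, which is one reason a learned reconstruction step may be preferable in practice.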
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105627
Citations: 0
Smoothed Power-Weakness Ratio (sPWR): a new informative system for multi-criteria decision making
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-29 DOI: 10.1016/j.chemolab.2025.105624
Viviana Consonni, Davide Ballabio, Enmanuel Cruz Muñoz, Veronica Termopoli, Roberto Todeschini
Nowadays, the large number of measurable variables has considerably increased the complexity of data. In the decision-making process, this creates a need for adequate tools to set priorities and rank the available options. Ordering is one of the possible ways to analyse multivariate data, providing an overview of the relationships among the elements of a system. Multi-Criteria Decision Making (MCDM) encompasses a broad set of methods that support decision problems by building priority-based lists of alternatives from multiple criteria. Among the most widely adopted techniques, TOPSIS, dominance-based approaches, the Analytic Hierarchy Process (AHP), and Copeland scores represent some of the classical methodologies in both theoretical research and applied decision analysis.
Among the dominance-based approaches, an effective MCDM method is the Power-Weakness Ratio (PWR), which generates a tournament table (i.e., the pairwise comparison matrix) from a data matrix with a varying number of samples (i.e., alternatives to be compared) and variables (i.e., the criteria for pairwise comparisons), weighted according to their relative importance in determining the final ranking. In this study, a variant of the classical Power-Weakness Ratio is presented that significantly modifies the way the tournament table is obtained. The method, called smoothed Power-Weakness Ratio (sPWR), takes into account the degree of dominance of the alternatives in each pairwise comparison by exploiting the differences between the criterion values. The rationale behind the method is described with the aid of an illustrative example on a simple benchmark dataset with a known reference ranking of the samples. The main advantage of the new method over PWR is that its tournament table is much more informative and sensitive to the original data values than the classical pairwise comparison matrix. A multivariate comparison with other classical MCDM methods, performed on several diverse datasets, demonstrated that the results obtained by sPWR were quite similar to those obtained by Copeland Score and TOPSIS with range scaling. However, sPWR showed a higher tendency toward generating full rankings, with an enhanced ability to remove ties in the pairwise comparisons.
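To make the tournament-table idea concrete, here is a minimal sketch. The win/tie/loss entry (weights summed with shares of 1, 0.5 and 0) follows the usual pairwise-comparison construction. The `smoothed=True` branch is only an assumed illustration of a difference-sensitive entry, using a range-scaled difference; it is not the sPWR formula from the paper, which the abstract does not give. All criteria are assumed to be "higher is better", and the function name `tournament_table` is hypothetical.

```python
def tournament_table(data, weights, smoothed=False):
    """Build a weighted pairwise comparison (tournament) table.

    data[i][k] is the value of alternative i on criterion k (higher is better,
    by assumption). Entry [i][j] is the weighted share of criteria on which
    alternative i dominates alternative j.
    """
    n = len(data)
    n_crit = len(weights)
    # Per-criterion range, used only by the assumed smoothed variant.
    ranges = [max(row[k] for row in data) - min(row[k] for row in data)
              for k in range(n_crit)]
    table = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for k, w in enumerate(weights):
                diff = data[i][k] - data[j][k]
                if smoothed:
                    # Assumption: share grows linearly with the scaled difference.
                    share = 0.5 if ranges[k] == 0 else 0.5 + 0.5 * diff / ranges[k]
                else:
                    # Classical win / tie / loss scoring.
                    share = 1.0 if diff > 0 else (0.5 if diff == 0 else 0.0)
                table[i][j] += w * share
    return table


if __name__ == "__main__":
    # Three alternatives, two equally weighted criteria (toy values).
    data = [[10.0, 1.0], [8.0, 1.0], [0.0, 0.0]]
    for row in tournament_table(data, [0.5, 0.5], smoothed=True):
        print(row)
```

In the classical table a narrow win counts the same as a wide one, whereas the difference-sensitive entry changes with the size of the gap; this is the kind of extra information the abstract attributes to the smoothed table.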
Chemometrics and Intelligent Laboratory Systems, Volume 270, Article 105624
Citations: 0
Comparison of parametric and permutation t spectral representations for determining individual metabolite abundances from factorial design spectra
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-22 DOI: 10.1016/j.chemolab.2025.105622
Leonardo J. Duarte, Gustavo G. Marcheafave, Elis D. Pauli, Ieda S. Scarminio, Roy E. Bruns
Two-level factorial design spectra can be transformed into t spectral representations to analyze changes in metabolic abundances owing to environmental impacts. This transformation involves performing a statistical paired t-test for each spectroscopic variable. These tests are sensitive to deviations from normality of the spectral data as well as to heterogeneous variances of data at different factorial design levels. Although existing spectral information for metabolites can help guide interpretive efforts, permutation calculations can be performed to obtain statistical significance and t values of metabolic peaks that are expected to be less sensitive to these assumptions. The results of these calculations are reported here and compared with results from parametric statistical values for 13,501 NMR spectral variables for two-level factorial design data of ethanol, dichloromethane and ethanol-dichloromethane (1:1) mixture extracts of yerba mate leaf samples. All t-representation peaks found to be statistically significant by parametric calculations are confirmed by the permutation calculations. Permutation results do not indicate any new significant peaks that were not predicted by the parametric results. As such, permutation calculations are recommended to validate results obtained from parametric determinations of statistical significance.
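As a rough illustration of the two routes compared in this abstract, the sketch below computes a paired t statistic for one spectral variable and then a permutation p-value by randomly flipping the signs of the paired differences. Sign flipping is one common permutation scheme for paired data and is assumed here; the abstract does not say which scheme the authors used. The function names and toy differences are hypothetical.

```python
import math
import random


def paired_t(diffs):
    """Paired t statistic for a list of high-minus-low level differences."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var == 0:
        # Degenerate case: all differences identical.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def sign_flip_p_value(diffs, n_perm=2000, seed=0):
    """Two-sided permutation p-value using random sign flips (assumed scheme)."""
    rng = random.Random(seed)
    observed = abs(paired_t(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(paired_t(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy differences at a single chemical shift, invented for this example.
    diffs = [1.0, 2.0, 3.0, 4.0]
    print(paired_t(diffs), sign_flip_p_value(diffs))
```

Applying both functions to every spectral variable in turn would give a parametric t representation and a permutation-based check on which peaks remain significant.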
Citations: 0
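The permutation calculation summarized in the abstract above can be illustrated for a single spectral variable. The sketch below is a minimal example, assuming a sign-flip permutation scheme applied to paired differences; the abstract does not state the exact scheme the authors used, so this is an assumption. The function names `paired_t` and `sign_flip_p_value` and the sample data are hypothetical.

```python
import itertools
import math
import statistics


def paired_t(diffs):
    """Paired t statistic: mean difference divided by its standard error."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        # All differences identical: the statistic is unbounded unless the mean is 0.
        return math.inf if mean != 0 else 0.0
    return mean / (sd / math.sqrt(n))


def sign_flip_p_value(diffs):
    """Exact two-sided p-value from all 2**n sign flips of the paired differences.

    Assumption: under the null hypothesis each difference is equally likely
    to be positive or negative, so every sign pattern is equally probable.
    """
    t_obs = abs(paired_t(diffs))
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        t_perm = abs(paired_t([s * d for s, d in zip(signs, diffs)]))
        # Small tolerance so that ties with the observed statistic are counted.
        if t_perm >= t_obs - 1e-12:
            count += 1
        total += 1
    return count / total


# Hypothetical intensity differences (high level minus low level) at one variable.
example_diffs = [1.0, 2.0, 3.0, 4.0]
p_value = sign_flip_p_value(example_diffs)
```

Exact enumeration is only practical for small designs; with many pairs, a random sample of sign patterns would normally replace the full loop. In a spectral setting the same calculation would be repeated for every variable.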
A multi-source data integration for soybean differentiation through multiblock data analysis using a novel adaptation of ComDim
IF 3.8 Zone 2 Chemistry Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-18 DOI: 10.1016/j.chemolab.2025.105621
Rodrigo Canarin de Oliveira, Hector Hernan Hernandez Zarta, Wargner Alonso Moreno Losada, Sebastián Javier Caruso, Hágata Cremasco, Evandro Bona, Douglas N. Rutledge, Diego Galvan
Soybean is a major global commodity. Given its importance, ensuring traceability becomes essential. Genetic, climatic, and soil-related factors influence its chemical composition. Integrating multi-source data using a multiblock analysis represents a powerful approach to differentiating soybeans and monitoring their traceability. This study employed an extension of the ComDim method (also known as Common Components and Specific Weights Analysis, CCSWA) to simultaneously differentiate 20 Brazilian soybean varieties, conventional and transgenic, based on cultivation region and cultivation type. The extension replaced the PCA (Principal Components Analysis) used in classical ComDim with CCA (Common Components Analysis). Forty samples cultivated in Londrina and Ponta Grossa (Paraná, Brazil) were analyzed for their fatty acid, amino acid, isoflavone, and mineral profiles using GC-FID, IEC, HPLC-DAD, and ICP-OES. The CCA-based ComDim results revealed that Common Component 2 (CC2) was primarily responsible for distinguishing the geographical regions of Londrina and Ponta Grossa. The global loadings of CC2 indicated that zinc (Zn), manganese (Mn), oleic acid, arginine, and malonyl genistin were the most influential variables in this component. In contrast, CC3 was associated with differentiating conventional and transgenic cultivars. The global loadings highlighted linoleic acid, oleic acid, α-linolenic acid, malonyl glycitin, malonyl genistin, Fe, Zn, and Mn as the most relevant contributors. The combined CC2 and CC3 plots indicated tendencies toward differentiation of soybean samples by cultivation region and cultivation type. This result highlights the potential of CCA-based ComDim as an effective tool for soybean traceability.
Citations: 0
Detecting and classifying palladium nanoparticles in microscopic images using neutrosophic deep learning
IF 3.8 Zone 2 Chemistry Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-18 DOI: 10.1016/j.chemolab.2025.105619
Mohamed El-dosuky, Aboul Ella Hassanien, Heba Alshater, Rania Ahmed, Sameh H. Basha, Heba AboulElla, Ashraf Darwish, Sara Abdelghafar
This paper introduces a neutrosophic deep learning model for automated detection and classification of palladium nanoparticles in scanning electron microscopy (SEM) images, distinguishing between ordered and disordered structures for accurate nanoparticle characterization. The model follows a five-phase pipeline for enhanced accuracy and efficiency. It begins with data augmentation, applying transformations such as rotation and flipping to improve dataset diversity. The second phase uses neutrosophic image segmentation to manage uncertainty and noise in SEM images, allowing for the precise isolation of nanoparticle regions. In the third phase, the VGG-19 deep neural network extracts high-level features, initially identifying 25,088 features. In the fourth phase, a hybrid approach combining Gini importance and Genetic Optimized Rough Sets (GORS) reduces the number of features to 2454. The refined feature set is then classified using a Random Forest classifier, which effectively distinguishes between ordered and disordered palladium nanoparticles. To validate its performance, the proposed model was evaluated on a dataset of 1000 SEM images of carbon-based materials with deposited palladium nanoparticles, which was then expanded to 1500 images to address class imbalance and minimize overfitting. The experimental results highlight the model's strong potential as a high-performance classification tool for nanoparticle analysis in SEM images, achieving an overall accuracy of 99.67%. To evaluate the impact of the introduced phases on the proposed model's performance, four ablation experiments were conducted, demonstrating the significance of each phase. Dropping data augmentation and feature reduction reduced accuracy to approximately 97.5%, while dropping the feature extraction phase reduced it further to 94.17%, highlighting the critical impact of these processes on performance and robustness.
Citations: 0
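The fourth phase described in the abstract above ranks features by Gini importance. As a rough illustration of the quantity involved, the sketch below computes the Gini impurity decrease for one feature and one threshold split; Gini importance in a tree ensemble accumulates this decrease over all splits that use a given feature. The function names, the threshold, and the toy values are hypothetical, and the GORS step is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a label list: 1 minus the sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting one feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Hypothetical toy data: four images, two candidate features.
labels = ["ordered", "ordered", "disordered", "disordered"]
informative = [0.1, 0.2, 0.8, 0.9]    # separates the two classes at 0.5
uninformative = [0.1, 0.8, 0.2, 0.9]  # mixes the two classes at 0.5

gain_informative = impurity_decrease(informative, labels, 0.5)
gain_uninformative = impurity_decrease(uninformative, labels, 0.5)
```

A feature whose splits rarely reduce impurity, like the second one here, would score low and be a candidate for removal in a reduction step of this kind.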
A hybrid SVM-CPSO-KELM model for the simultaneous detection of methane, ethane, and ethylene via photoacoustic spectroscopy
IF 3.8 Zone 2 Chemistry Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-17 DOI: 10.1016/j.chemolab.2025.105620
Meixuan Zhao, Pengcheng Gu, Yuwang Han
Photoacoustic spectroscopy (PAS) is a powerful technique for detecting trace gas mixtures, with applications spanning industrial safety, environmental monitoring, and energy systems. However, when it is applied to three crucial indicator gases, methane (CH4), ethane (C2H6), and ethylene (C2H4), strong spectral overlaps introduce cross-interference that complicates accurate concentration retrieval. To address limitations in conventional chemometric and machine learning approaches, such as poor generalization across concentration ranges and vulnerability to interference, this study proposes a hybrid model integrating Support Vector Machine (SVM) classification with Chaotic Particle Swarm Optimization (CPSO)-enhanced Kernel Extreme Learning Machine (KELM). The workflow includes wavelet-based denoising, feature selection via Competitive Adaptive Reweighted Sampling (CARS), dynamic thresholding by SVM to partition samples into high- and low-concentration regimes, and a final regression analysis using KELM. The proposed approach significantly improves detection accuracy across a wide concentration range (0.5–500 ppm). Experimental results show that the SVM-CPSO-KELM model achieves an average prediction error of 5.44%, with a maximum error below 14.37%.
Citations: 0
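The final regression stage named in the abstract above, KELM, has a well-known closed-form solution, sketched below in plain Python. This is a minimal illustration only: it assumes an RBF kernel and the standard ridge-style solution beta = (K + I/C)^-1 y, and it omits the wavelet denoising, CARS feature selection, SVM regime gating, and CPSO tuning described in the abstract. The function names, the values of `C` and `gamma`, and the toy data are hypothetical.

```python
import math


def rbf(a, b, gamma):
    """RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def kelm_fit(X, y, C=1e6, gamma=1.0):
    """Output weights beta = (K + I/C)^-1 y for the training kernel matrix K."""
    n = len(X)
    K = [
        [rbf(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve(K, y)


def kelm_predict(X_train, beta, x, gamma=1.0):
    """Prediction is the kernel vector k(x, X_train) dotted with beta."""
    return sum(b * rbf(xt, x, gamma) for xt, b in zip(X_train, beta))


# Hypothetical toy data: one spectral feature per sample, concentration in ppm.
X_train = [[0.0], [1.0], [2.0]]
y_train = [1.0, 3.0, 2.0]
beta = kelm_fit(X_train, y_train)
prediction = kelm_predict(X_train, beta, [1.0])
```

In a gated workflow of the kind the abstract describes, one such regressor would presumably be fitted per concentration regime, with a classifier choosing which regressor to apply; the kernel parameter and `C` are the quantities an optimizer such as CPSO would tune.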