
Journal of Chemometrics: Latest Articles

Stacked Target-Related Autoencoder-Extreme Learning Machine: A Novel Soft Measurement Modeling Approach for Near-Infrared Spectroscopy
IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-22 | Vol. 40, No. 1 | DOI: 10.1002/cem.70095
Shun Li, Fangkun Zhang, Shuobo Chen, Baoming Shan, Qilei Xu

This paper proposes a novel quantitative modeling and prediction approach for near-infrared (NIR) spectroscopy, combining a stacked target-related autoencoder with an extreme learning machine (STAE-ELM). The STAE performs hierarchical pre-training using multiple improved target-related autoencoders (TAEs) to extract deep spectral features highly correlated with target values. Crucially, the top-level structure of the STAE is replaced by the ELM, which serves as the final prediction model. This integration streamlines training by reducing parameters and steps while simultaneously enhancing performance through optimized initialization of the ELM's weights and biases. Compared to conventional feature selection methods and stacked autoencoders, the STAE-ELM extracts more comprehensive and target-relevant deep features from spectral data, mitigating overfitting risks. The method's efficacy was validated on five open NIR datasets, benchmarking against three approaches: feature selection modeling, SAE-based feature extraction modeling, and backpropagation-based deep network modeling. Results demonstrate that calibration models built with STAE-ELM achieved average reductions in RMSEP of 18.48%, 5.74%, and 12.14%, respectively compared to these benchmarks. Furthermore, modeling efficiency was significantly improved over the backpropagation-based deep network approach.
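
As a rough illustration of the ELM stage that replaces the top layer of the STAE, the sketch below fits a classical single-hidden-layer ELM with NumPy. It is not the authors' implementation: the hidden weights are drawn at random here, whereas the abstract states that they are initialized from the pre-trained STAE, and the names `elm_fit`, `elm_predict`, `rmse`, `n_hidden`, and `ridge` are hypothetical.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, ridge=1e-3, seed=0):
    """Fit a basic ELM: fixed hidden layer, output weights by ridge least squares."""
    rng = np.random.default_rng(seed)
    # Assumption: random hidden weights stand in for the STAE-derived initialization.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Only the output weights are trained, in closed form (no backpropagation).
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(model, X):
    """Apply the fixed hidden layer, then the learned output weights."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def rmse(y_true, y_pred):
    """Root-mean-square error, the quantity behind RMSEP on a prediction set."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

Because the output weights come from a single linear solve, training cost is dominated by one matrix factorization, which is consistent with the efficiency gain over backpropagation that the abstract reports.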

Citations: 0
Application of an ECAM-ConvNeXt Model With Multichannel Spectrogram Based on Vis–NIR for Soil Property Prediction
IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-10 | Vol. 39, No. 12 | DOI: 10.1002/cem.70092
Qinghao Shuai, Zhengguang Chen, Shuo Liu, Quan Wang

Vis–NIR spectroscopy is increasingly used for soil property analysis because it is rapid, cost-effective, and nondestructive. In particular, deep learning models perform very well when working with large sample data. In this study, we propose a deep learning model based on three-channel ECAM-ConvNeXt. First, the method applies three window functions (Bartlett, Gaussian, and Blackman) in the short-time Fourier transform to convert a one-dimensional spectral sequence signal into three different two-dimensional spectrograms. Next, we perform multichannel feature fusion and use the resulting three-channel spectrograms as model inputs. This method preserves the temporal information and spectral characteristics of the spectral sequence, thereby improving the performance of the model. Second, this study introduces the Efficient Channel Attention Module into the ConvNeXt model. This module combines the advantages of the Convolutional Block Attention Module and the Efficient Channel Attention Network, further enhancing the expressive ability of the network by highlighting useful information and suppressing irrelevant information. Finally, we validate the effectiveness of multichannel inputs with several deep learning models (AlexNet18, ResNet50, MobileNet-V3, EfficientNet, VIT) and compare the results with existing techniques reported in the literature. The results indicate that the root-mean-square error (RMSE) of the TriCH-ECAM-ConvNeXt model in predicting soil nitrogen content (N (g/kg)), organic carbon content (OC (g/kg)), cation exchange capacity (CEC (cmol(+)/kg)), pH, clay content (%), and sand content (%) was reduced to 0.9847, 19.7347, 6.3380, 0.3812, 5.1537, and 12.9706, respectively, and the coefficient of determination (R²) increased to 0.9307, 0.9544, 0.7999, 0.9206, 0.8493, and 0.7526, respectively.
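
The conversion of one spectrum into a three-channel spectrogram can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' pipeline: the segment length `nperseg`, the step `hop`, and the Gaussian width `gauss_std` are hypothetical values, and the wavelength axis plays the role that time plays in a conventional short-time Fourier transform.

```python
import numpy as np

def stft_magnitude(x, window, hop):
    """Magnitude STFT: slide the window along x and take the FFT of each frame."""
    n = len(window)
    frames = [x[s:s + n] * window for s in range(0, len(x) - n + 1, hop)]
    # Result has shape (frequency bins, frames).
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T

def three_channel_spectrogram(x, nperseg=64, hop=16, gauss_std=8.0):
    """Stack Bartlett, Gaussian and Blackman spectrograms as image channels."""
    k = np.arange(nperseg) - (nperseg - 1) / 2.0
    windows = [
        np.bartlett(nperseg),
        np.exp(-0.5 * (k / gauss_std) ** 2),  # Gaussian window, assumed width
        np.blackman(nperseg),
    ]
    return np.stack([stft_magnitude(x, w, hop) for w in windows], axis=0)
```

Each window trades main-lobe width against side-lobe leakage differently, so the three channels give the network complementary views of the same spectrum.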

Citations: 0
Rapid Multi-Indicator Quality Evaluation of Ginger Using Genetic Algorithm and Near-Infrared Spectroscopy
IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-10 | Vol. 39, No. 12 | DOI: 10.1002/cem.70090
Tianshu Wang, Chengwu Chen, Hui Yan, Kongfa Hu, Xichen Yang, Xia Zhang, Guisheng Zhou, Jinao Duan

To achieve rapid and comprehensive evaluation of ginger quality, a multi-indicator quality evaluation method based on a genetic algorithm and near-infrared spectroscopy is proposed to determine the content of multiple compounds (6-gingerol, 8-gingerol, 10-gingerol, 6-shogaol, and zingerone). First, the near-infrared spectra of ginger samples are collected. Then, the spectra are preprocessed to reduce noise. Next, spectral features are extracted by a genetic algorithm with purpose-designed population initialization and fitness function methods. Finally, the prediction model is built by regression. Experimental results show that the proposed method achieves higher R values (0.9052, 0.9107, 0.9269, 0.9843, 0.9030) than the traditional PLSR model (0.6666, 0.51, 0.4358, 0.9248, 0.4846) for 6-gingerol, 8-gingerol, 10-gingerol, 6-shogaol, and zingerone, respectively. The proposed method can therefore reduce prediction errors and improve the performance of near-infrared quantitative analysis models for ginger.
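
The feature extraction step can be pictured with a generic genetic algorithm for wavelength selection. The sketch below is only an assumption-laden stand-in: the abstract says the population initialization and fitness function were specially designed but does not give them, so a uniform random initialization and a validation-RMSE fitness are used here, and `fitness`, `ga_select`, and all parameters are hypothetical names.

```python
import numpy as np

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Negative validation RMSE of a least-squares model on the selected wavelengths."""
    if not mask.any():
        return -np.inf  # an empty selection cannot be evaluated
    A_tr = np.column_stack([X_tr[:, mask], np.ones(len(X_tr))])
    A_va = np.column_stack([X_va[:, mask], np.ones(len(X_va))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return -float(np.sqrt(np.mean((A_va @ coef - y_va) ** 2)))

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=30, n_gen=40, p_mut=0.02, seed=0):
    """Evolve boolean wavelength masks; returns the best mask and the fitness history."""
    rng = np.random.default_rng(seed)
    n_feat = X_tr.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5       # random binary masks
    history = []
    for _ in range(n_gen):
        scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va) for m in pop])
        pop = pop[np.argsort(scores)[::-1]]           # best individual first
        history.append(float(scores.max()))
        children = [pop[0].copy()]                    # elitism: keep the best mask
        while len(children) < pop_size:
            # Tournament selection: in the sorted population, lower index is fitter.
            a = pop[rng.integers(pop_size, size=2).min()]
            b = pop[rng.integers(pop_size, size=2).min()]
            cut = rng.integers(1, n_feat)             # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_feat) < p_mut       # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    scores = np.array([fitness(m, X_tr, y_tr, X_va, y_va) for m in pop])
    history.append(float(scores.max()))
    return pop[int(np.argmax(scores))], history
```

Elitism guarantees that the best fitness never decreases from one generation to the next, which makes the history easy to inspect for convergence.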

Citations: 0
Estimating Maize Canopy Nitrogen and Chlorophyll Content Using CNN-GRU-CBAM and Hyperspectral Imagery
IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-04 | Vol. 39, No. 12 | DOI: 10.1002/cem.70093
Haoquan Kong, Li Tian, Shujuan Yi, Yuhui Jia, Weiwei Guo, Hanlin Xu, Yongzhi Liu

Rapid, noninvasive quantification of canopy nitrogen (N) and chlorophyll (Chl) content is critical for precision nitrogen management in maize cultivation. Although near-infrared spectroscopy (NIRS) offers a viable approach for biochemical component analysis, conventional machine learning models often fail to capture the complex nonlinear relationships inherent in spectral data and lack interpretability, limiting their robustness for real-time inversion tasks. To address these limitations, this study introduces a hybrid deep learning architecture combining convolutional neural networks (CNNs) and gated recurrent units (GRUs), augmented by a convolutional block attention module (CBAM) and integrated with explainable artificial intelligence for accurate biochemical content inversion. Preprocessing of hyperspectral images from 200 maize canopy samples via sequential Savitzky–Golay smoothing (SG), standard normal variate (SG-SNV), and SG transformations enhanced mean test set R² by 0.016 units. Subsequent dimensionality reduction via the successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) reduced the spectral features from 176 bands to 10 and 22 bands, respectively. The core predictive model combines CNNs and GRUs, augmented by a CBAM, to enhance feature extraction and temporal dependency modeling. Comparative evaluation demonstrates the superior performance of CNN-GRU-CBAM over traditional machine learning and alternative deep learning models. For the test set, it achieved R² values of 0.934 (N) and 0.788 (Chl), with corresponding root mean square error (RMSE) values of 1.940 and 0.216. Model interpretability was validated using Shapley Additive Explanations (SHAP), which identified the key spectral regions driving predictions. This work bridges high-performance deep learning with explainable artificial intelligence, enabling precise, nondestructive estimation of maize foliar biochemical constituents. The framework provides a transferable approach for biochemical content inversion in diverse crops.
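
Of the preprocessing steps named above, standard normal variate (SNV) is simple enough to show in full. The function below is a minimal sketch of the usual textbook definition, assuming one spectrum per row; the name `snv` is hypothetical and nothing here is taken from the authors' code.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std
```

Because each row is standardized independently, a spectrum rescaled by a positive gain and shifted by a constant offset maps to the same SNV output, which is why the transform is commonly used to suppress scatter effects.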

Citations: 0
Quantitative Detection of Trace Characteristic VOCs of Liver Metabolic Diseases Based on the MEMS Sensor Array
IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-03 | Vol. 39, No. 12 | DOI: 10.1002/cem.70089
Cheng Zhang, Yao Tian, Ze Zhang, Lingmin Yu, Hairong Wang

Human exhaled breath contains a variety of volatile organic compound (VOC) gases, among which isoprene, ethanol, and formaldehyde can be used as biomarkers for liver metabolic diseases. To detect these trace-concentration VOC gases accurately, a sensor array was built from 4 MEMS gas sensors, one of which is a self-developed sensor with a very high response to isoprene. To improve the prediction accuracy of gas concentration, we investigated a convolutional neural network with a Multi-Expert Temporal Fusion Network (METF-Net) model based on multitask learning. With the MEMS sensor array, isoprene, ethanol, and formaldehyde at the sub-ppm level can be correctly identified; the RMSEs for isoprene, ethanol, and formaldehyde are 33.48, 64.01, and 18.84 ppb, and the error rates of the predicted concentrations are 6.70%, 6.40%, and 9.42%, respectively. This method has the potential to be applied in the early-stage screening of liver metabolic diseases.

Citations: 0
Construction and Improvement of a Model for Quantifying Blood Glucose Concentration Using Mid-Infrared Spectroscopy
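IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-02 | Vol. 39, No. 12 | DOI: 10.1002/cem.70091 | PDF: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70091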
Yuta Takami, Keita Miyagawa, Yuki Tsuda, Koichi Akiyama, Yuji Matsuura, Hiromasa Kaneko

Measurements of blood glucose concentration use invasive methods such as venous blood sampling, and finger-prick blood testing using self-monitoring blood glucose meters with subcutaneous sensors. For daily use, the development of noninvasive blood glucose measurement methods is required. In this study, we constructed a model to estimate blood glucose concentrations noninvasively from mid-infrared absorption spectra measured using photothermal deflectometry enhanced by total internal reflection. We improved the estimation accuracy of the model using Savitzky–Golay preprocessing and the Boruta variable selection method. In addition, the model was corrected using subject data from the first day of measurements to improve estimation accuracy.
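
Savitzky–Golay preprocessing amounts to a local polynomial least-squares fit evaluated at the centre of a sliding window. The sketch below derives the filter coefficients directly with NumPy; the window length and polynomial order are illustrative assumptions (the abstract does not state the settings used), and `savgol_coeffs` and `savgol_smooth` are hypothetical names.

```python
import numpy as np

def savgol_coeffs(window, polyorder):
    """Smoothing coefficients for the centre point of an odd-length window."""
    half = window // 2
    k = np.arange(-half, half + 1)
    A = np.vander(k, polyorder + 1, increasing=True)  # columns 1, k, k^2, ...
    # Row 0 of the pseudo-inverse gives the fitted polynomial's value at k = 0.
    return np.linalg.pinv(A)[0]

def savgol_smooth(y, window=11, polyorder=2):
    """Smooth a 1-D spectrum; the ends are padded by repeating the edge values."""
    c = savgol_coeffs(window, polyorder)
    padded = np.pad(np.asarray(y, dtype=float), window // 2, mode="edge")
    # Reversing the kernel turns np.convolve into a sliding dot product with c.
    return np.convolve(padded, c[::-1], mode="valid")
```

Away from the padded edges, any polynomial of degree up to `polyorder` passes through the filter unchanged, so broad absorption features are preserved while high-frequency noise is attenuated.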

Citations: 0
Partial Least Squares
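IF 2.1 | Zone 4 (Chemistry) | Q1 SOCIAL WORK | Pub Date: 2025-12-01 | Vol. 39, No. 12 | DOI: 10.1002/cem.70069 | PDF: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70069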
Richard G. Brereton

In the previous article, we discussed the enormous increase in the impact of chemometrics methods over the last four decades [1] and the important role PLS (partial least squares or projection to latent structures) has had in this revolution. However, we are yet to describe this technique, which will be the subject of this and subsequent articles.

There have been many thousands, or perhaps tens of thousands, of theoretical, methodological and tutorial articles about PLS over the last 50 years, and possibly many hundreds of thousands of articles involving the use of this approach. In the early decades of the development of chemometrics as a coherent discipline in the 1980s and 1990s, there was a significant focus on PLS, and even after so many decades it still spawns new insights. There are conferences dedicated to PLS. This article is therefore only one of very many such articles, but PLS can be approached in endless ways, and no general introduction to chemometrics is complete without describing this method.

PLS was first proposed in the 1960s by Herman Wold [2, 3]. The method was slowly introduced to chemometrics, with a significant expansion in interest in the 1980s. Svante Wold first publicised its applicability in the 1970s and 1980s [4, 5]. Early pioneers of the 1980s include Paul Geladi [6], Harald Martens and Tormod Naes [7], who wrote classical articles and books that to this day are still viewed as essential reading. During the 1980s, there were numerous conferences, software developments and courses on PLS. This development was important not only in chemistry but also in economics and the social sciences.

The original PLS algorithm, called PLS1, was enhanced during this period, most notably by PLS2 but also by many other developments, which continue to this day. New theoretical articles on the properties of PLS continue to be topical areas for research.

As originally described, PLS was used for quantitative regression or calibration, sometimes distinguished by the terminology PLSR (PLS regression), where the c block was a continuous variable, such as a concentration, reaction rate or activity. Most of the early applications in chemistry were, for example, in NIR spectroscopy, where the aim was to calibrate the spectra to the concentration of an analyte or a class of compounds on a continuous scale.

However, over the past few years, PLSDA (PLS discriminant analysis) [12] has become an important technique used for multivariate classification. In this case, the c block is discrete, representing a numerical label or a classifier. Typically, if there are two groups, c = +1 for group A and c = −1 for group B. For multiple groups, there are several modifications [13] available.

In subsequent articles, we will look at the properties of the matrices obtained using the PLS1 algorithm and how they fundamentally differ from those of PCA, even though some of them share the same names.
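
For readers who want something concrete before the later articles, here is a minimal sketch of PLS1 for a single c block, written in the NIPALS style commonly found in textbooks. This excerpt does not give the algorithm, so the steps, the function names `pls1_fit` and `pls1_predict`, and the choice to mean-centre both blocks are assumptions rather than the author's presentation.

```python
import numpy as np

def pls1_fit(X, c, n_components):
    """Fit PLS1 by successive deflation; returns centring terms and regression vector."""
    x_mean, c_mean = X.mean(axis=0), c.mean()
    E, f = X - x_mean, c - c_mean            # centred X block and c block
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)            # weight vector of unit length
        t = E @ w                            # scores
        tt = t @ t
        p = E.T @ t / tt                     # X loadings
        qa = (f @ t) / tt                    # c loading (a scalar in PLS1)
        E = E - np.outer(t, p)               # deflate the X block
        f = f - qa * t                       # deflate the c block
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)      # coefficients in the original variables
    return x_mean, c_mean, b

def pls1_predict(model, X):
    """Predict c for new rows of X using the fitted regression vector."""
    x_mean, c_mean, b = model
    return (X - x_mean) @ b + c_mean
```

The same code covers both uses described above: with a continuous c it acts as a calibration model, and with c coded as +1 and −1 the sign of the prediction can serve as a simple two-group PLSDA decision rule.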

The author declares no conflicts of interest. Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.
Citations: 0
Machine Learning–Driven Near- and Mid-Infrared Chemometrics for Rapid, Cost-Effective COVID-19 Screening in Dried Plasma
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-24 DOI: 10.1002/cem.70087
Fernanda F. S. Oliveira, Wilson J. Cardoso, Luiz F. P. Ramos, Túlio R. Freitas, Priscilla S. Filgueiras, Rafaella F. Q. Grenfell, Reinaldo F. Teófilo, Adriano de Paula Sabino

The COVID-19 pandemic has highlighted the urgent need for rapid, accurate, and cost-effective diagnostic alternatives to conventional methods such as RT-qPCR and immunoassays. In this study, we explored the potential of vibrational spectroscopy, specifically near-infrared (NIR) and mid-infrared (MID) spectroscopy, for detecting SARS-CoV-2 infection in dried plasma samples. Spectral data were obtained from 83 patients (45 COVID-19 positive and 38 negative) and analyzed using partial least squares discriminant analysis (PLS-DA), with and without variable selection by the ordered predictors selection for discriminant analysis (OPSDA) method. While initial models using full spectra showed moderate classification accuracy, the application of OPSDA significantly enhanced model performance. For the NIR dataset, OPSDA-based models achieved 100% sensitivity and specificity in both training and test sets (n = 25). For the MID dataset, the test set (n = 25) sensitivity reached 86%, with 100% specificity. These results demonstrate that NIR and MID spectroscopy, when combined with advanced chemometric approaches, can provide reliable, rapid, and low-cost screening for COVID-19. This platform holds promise for broader applications in clinical diagnostics beyond the current pandemic.
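The sensitivity and specificity figures quoted in the abstract follow the usual definitions: the fraction of true positives detected and the fraction of true negatives correctly rejected. The Python sketch below shows how they are computed; the function name and the label vectors are hypothetical and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a two-class problem.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test set: +1 = COVID-19 positive, -1 = negative.
y_true = [1, 1, 1, 1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, -1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In this toy case one positive sample is missed, so sensitivity falls below 1 while specificity stays at 1, the same pattern the abstract reports for the MID test set.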

Citations: 0
Developing Digital Twin Visualizations: A Methodology and Case Study on Chemical Separation Processing
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-18 DOI: 10.1002/cem.70085
Adam Pluth, Kolton Heaps, Samantha Thueson, Jack C. Dunker, Ashley Shields

As advances in digital engineering continue to push the technological boundaries, digital twin (DT) visualizations for diagnostics and safeguards advancement become much more feasible and practical. DTs generate large and complex data streams that require effective user interfaces to provide monitoring and diagnostic capabilities. Unfortunately, while these frameworks exist, there is not much research on the systematic documentation of human–computer interaction (HCI) for DT visualization. This work presents a dual-mode visualization methodology (two dimensional [2D] graphical user interface dashboard and 3D mixed reality) designed to support diagnostic tasks in DT systems and building on a validated framework and applying established HCI principles. The methodology is demonstrated through a case study of aqueous processing at Idaho National Laboratory, using experimental data from the chemical solvent extraction runs. Our interfaces display real-time alerts and monitoring to inform users of safeguards anomalies. The interfaces use immersive 3D mixed-reality visualization for further system and experiment investigation. This work demonstrates how the systematic application of HCI principles can inform DT visualization design for diagnostic and safeguards applications. While formal user evaluation studies remain as future work, this paper documents the systematic design methodology and demonstrates a proof-of-concept implementation.

Citations: 0
Enhancing Metabolomics Analysis: Performance Evaluation of OPLS-DA and OPLS-EP Models
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-17 DOI: 10.1002/cem.70086
Oleksandr Ilchenko, Antti Henrik

In the analysis of metabolomics data, selecting the appropriate statistical approach is crucial for maximizing model interpretation, predictivity and reliability. This study evaluates the effectiveness of Orthogonal Partial Least Squares (OPLS) models, specifically comparing OPLS-DA (assuming sample independence) and OPLS-EP (assuming sample dependency) in datasets of bacterial samples under different experimental conditions. OPLS-EP consistently demonstrates superior predictive performance, evidenced by higher predictive ability by means of cross-validation (Q2) compared to OPLS-DA, indicating greater model significance. Our findings prove the advantages of the paired statistical approach. This approach ensures that treatment effects are accurately measured by minimizing inter-sample variation and enhancing signal detection. Previous research in metabolomics has demonstrated the benefits of this method for biomarker sensitivity, particularly in matched case–control studies. The present study extends this understanding by applying paired statistical approaches to bacterial isolate treatments, offering novel insights into their utility. Overall, the findings emphasize the importance of OPLS-EP in enhancing biomarker sensitivity and model reliability in metabolomics research.
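One way to see why a paired approach reduces inter-sample variation is to work with within-pair differences rather than raw samples. The Python sketch below builds such a difference ("effect") matrix and a target vector of ones. It reflects a common description of effect-projection modelling and is an assumption here: the abstract does not spell out the computation, and the function name and the data are hypothetical.

```python
def effect_matrix(control, treated):
    """Row-wise differences (treated - control) for paired samples.

    Row i of both inputs must come from the same subject or isolate.
    """
    if len(control) != len(treated):
        raise ValueError("a paired design needs the same number of samples in both conditions")
    return [
        [t - c for c, t in zip(row_c, row_t)]
        for row_c, row_t in zip(control, treated)
    ]


# Hypothetical metabolite intensities: 3 paired isolates, 2 metabolites.
control = [[10.0, 5.0], [20.0, 7.0], [15.0, 6.0]]
treated = [[12.0, 5.5], [22.5, 7.0], [17.0, 6.5]]

effects = effect_matrix(control, treated)
y = [1.0] * len(effects)  # assumed target: every row represents one treatment effect
```

The baseline levels differ widely between isolates (10, 20, 15 for the first metabolite), yet the differences are close to each other (2.0, 2.5, 2.0). That between-subject spread is what the pairing removes before any model is fitted.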

Citations: 0