
Chemometrics and Intelligent Laboratory Systems: Latest Articles

Time series analysis of nucleic acid reactions via a generalized transformer model
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-09-06. DOI: 10.1016/j.chemolab.2025.105522
Canfeng Liu, Binhui Wang, Hui Dong, Yihan Pan, Jiawen Lin, Jintian Yang, Yihui Tao, Hao Sun
The contemporary landscape of medical diagnostics and therapeutic interventions has witnessed a remarkable surge in the production of time series data. Artificial intelligence (AI), particularly deep learning, has shown promise in uncovering the high-dimensional, meaningful information hidden in these diagnostic data. In this work, we propose a novel analytical approach for intelligent nucleic acid amplification tests (NAAT) based on deep learning and paper microfluidics. On-chip amplification data were fed directly to a deep learning model derived from the Transformer neural network. To facilitate the development and deployment of the approach, we made the Transformer model lightweight. Then, the capacity of the model to accurately predict the reaction trend and end-point value was validated. We also used ablation experiments to evaluate the effects of various parameters on prediction performance and then optimized the model. Next, three clinical datasets comprising 706 positive and 205 negative samples obtained from Fujian Provincial Hospital were used to verify the generalization of the approach. Without any modification of the model structure or hyperparameters, the accuracy, sensitivity, and specificity of the presented approach were 98.28 %, 97.52 %, and 99.02 %, respectively. Further comparison studies with nine different AI algorithms, including recurrent neural networks and long short-term memory, were performed. The presented study holds potential for facilitating routine diagnostic tasks for pandemic prevention and propelling the development of smart portable instruments.
Volume 267, Article 105522. Publication type: Journal Article.
Citations: 0
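The accuracy, sensitivity, and specificity quoted in this abstract are standard confusion-matrix ratios. A minimal sketch of how they are computed follows; the function name `diagnostic_metrics` and the counts in the example are hypothetical and chosen for illustration only, they are not the paper's per-dataset results.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total        # fraction of all samples classified correctly
    sensitivity = tp / (tp + fn)        # true-positive rate among positive samples
    specificity = tn / (tn + fp)        # true-negative rate among negative samples
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the paper).
acc, sens, spec = diagnostic_metrics(tp=95, fn=5, tn=48, fp=2)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

With imbalanced classes, as in a dataset with many more positive than negative samples, accuracy alone can hide a weak specificity, which is why the three values are usually reported together.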
An integrated framework combining CenFormer and PLS regression for rapid distillate oil classification and property prediction
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-09-05. DOI: 10.1016/j.chemolab.2025.105530
Yifan Wang, Xisong Chen, Lei Jiang, Yunyun Hu
Rapid and accurate classification and property prediction of distillate oil are essential for intelligent quality control and process optimization in modern refineries. Traditional methods, such as spectral analysis with chemometrics, are widely applied but depend heavily on manual feature engineering and offer limited representation capacity. Recent advances in deep learning have shown promise for oil analysis, yet existing models often struggle to jointly capture fine-grained local patterns and long-range spectral dependencies, and rarely optimize feature space geometry. To address these challenges, an integrated framework is proposed that combines spectral preprocessing, a dual-branch CenFormer model, a joint loss function, and dynamic property prediction. Spectral preprocessing sharpens spectral features by applying baseline correction, spectral truncation, and vector normalization. The CenFormer model leverages a CNN-Transformer dual-branch architecture, enabling the simultaneous capture of fine-grained local patterns and long-range spectral dependencies. A joint loss function, combining softmax and center loss, enforces intra-class compactness and inter-class separability, thereby improving feature discriminability. For property prediction, a similarity-based sample selection strategy is performed, followed by PLS regression, to enable adaptive modeling of physicochemical attributes. Experimental results demonstrate the effectiveness of the framework, achieving a classification accuracy of 99.51 %, low RMSEs and rRMSEs, and high R² in property prediction, highlighting its potential for rapid and reliable spectral analysis in industrial applications.
Volume 267, Article 105530. Publication type: Journal Article.
Citations: 0
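The joint loss described in this abstract (softmax plus center loss) can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the authors' implementation: the weight `lam`, the use of a batch mean for the center term, and all variable names are hypothetical, and the class centers are passed in as fixed arrays rather than learned.

```python
import numpy as np


def joint_loss(features, logits, labels, centers, lam=0.01):
    """Softmax cross-entropy plus lam times a center loss (illustrative sketch)."""
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -log_probs[np.arange(len(labels)), labels].mean()
    # Center term: half the mean squared distance of each feature to its class center.
    center_term = 0.5 * ((features - centers[labels]) ** 2).sum(axis=1).mean()
    return cross_entropy + lam * center_term


# Toy inputs: two samples, two classes, features sitting exactly on their centers.
features = np.array([[0.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
logits = np.zeros((2, 2))  # uniform class scores

loss_on_centers = joint_loss(features, logits, labels, centers, lam=0.5)
loss_off_centers = joint_loss(features + 1.0, logits, labels, centers, lam=0.5)
print(loss_on_centers, loss_off_centers)
```

The cross-entropy term separates classes, while the center term pulls features of the same class together; moving the features away from their centers raises only the second term.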
Self-attention embedded StyleGAN for virtual sample generation in sensing applications
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-09-04. DOI: 10.1016/j.chemolab.2025.105519
Xue-Yu Zhang, Qun-Xiong Zhu, Ming-Jia Liu, Feng Ma, Yi Luo, Wei Ke, Yan-Lin He, Ming-Qing Zhang, Yuan Xu
Low variability in industrial processes intensifies data scarcity and produces anomalous distributions that compromise the accuracy of data-driven models. Moreover, existing sample generation methods often overlook key factors such as sparsity and correlation among data. To address these challenges, this paper proposes a StyleGAN-based virtual sample generation method with an embedded self-attention mechanism (SASG-VSG). First, StyleGAN is used to map the original data space to a disentangled latent space. The output variables then act as control conditions, guiding the model to interpolate along the output dimension to ensure a more uniform distribution of generated samples. In addition, a self-attention module is incorporated into the discriminator to enhance its ability to capture the similarity between the virtual samples and the original data distribution. Finally, validation experiments on a purified terephthalic acid (PTA) solvent system and a sulfur recovery unit (SRU) confirm the capability of the proposed SASG-VSG in generating high-quality virtual samples for soft-sensing applications.
Volume 267, Article 105519. Publication type: Journal Article.
Citations: 0
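For readers unfamiliar with the self-attention module mentioned in this abstract, the sketch below shows the generic scaled dot-product form in NumPy. It is an assumption that this generic form resembles the module embedded in the SASG-VSG discriminator; the abstract does not specify it, and the projection matrices `w_q`, `w_k`, `w_v` and the random toy data are hypothetical.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over the rows of x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise similarity between every row (sample or position) and every other row.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))       # 4 toy rows with 3 variables each
w_q = rng.normal(size=(3, 2))
w_k = rng.normal(size=(3, 2))
w_v = rng.normal(size=(3, 2))

attended, attn_weights = self_attention(x, w_q, w_k, w_v)
print(attended.shape, attn_weights.sum(axis=1))
```

Each output row is a weighted average of all value rows, so the module can relate variables or samples that are far apart in the input, which is the property the abstract relies on for comparing virtual and original samples.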
An expectation–maximization algorithm for spectral reconstruction under the spectral hard model
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-09-02. DOI: 10.1016/j.chemolab.2025.105518
Marvin Kasterke, Lea Kaufmann, Maria Kateri, Thorsten Brands
Indirect Hard Modeling (IHM) is a physics-based evaluation method for the quantitative analysis of fluid compositions using spectroscopic techniques such as Raman spectroscopy. In this approach, mixture spectra are represented as a superposition of pure substance models, with each component described by a sum of parameterized peak functions. Nevertheless, the accuracy of the composition prediction depends critically on user decisions regarding both the number of peak functions and the specific parameter adjustments employed. In this work, we apply an expectation–maximization (EM) based algorithm for generating spectral reconstructions of pure substance models that does not require the pre-specification of the number of peaks or any initial values. The efficient and fast performance of the EM algorithm enables the fit of a given spectrum for an unknown number of peaks, based on a model selection criterion. In simulation studies, we demonstrate that this approach can recognize the true underlying function in settings of high noise, peak overlap, and background signals, yielding reliable results. In a validation study, the algorithm was tested using experimental data. It was integrated into an Indirect Hard Modeling framework and applied to three chemical test systems. The quality of the obtained results was in the range of other automated IHM model generating approaches, while both time and computational effort were significantly reduced.
Volume 267, Article 105518. Publication type: Journal Article.
Citations: 0
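The superposition idea behind IHM in this abstract can be sketched in a few lines. This is a simplified illustration, not the paper's EM algorithm: it assumes Gaussian peak shapes, fixed peak parameters, and a plain linear least-squares fit for the component weights, and every peak value and function name below is hypothetical.

```python
import numpy as np


def gaussian_peak(x, height, center, width):
    """One parameterized peak function (Gaussian shape assumed for illustration)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


def pure_model(x, peaks):
    """Pure substance model: a sum of parameterized peaks (height, center, width)."""
    return sum(gaussian_peak(x, *p) for p in peaks)


x = np.linspace(0.0, 100.0, 501)                       # hypothetical spectral axis
comp_a = pure_model(x, [(1.0, 30.0, 3.0), (0.5, 60.0, 5.0)])
comp_b = pure_model(x, [(0.8, 45.0, 4.0)])

# Mixture spectrum as a weighted superposition of the two pure substance models.
mixture = 0.7 * comp_a + 0.3 * comp_b

# Recover the component weights by linear least squares.
design = np.column_stack([comp_a, comp_b])
weights, *_ = np.linalg.lstsq(design, mixture, rcond=None)
print(weights)
```

In a full IHM evaluation the peak parameters are also allowed to shift during the mixture fit, and the number of peaks per pure model has to be chosen first, which is the step the abstract proposes to automate.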
Implementation of artificial intelligence and multivariate analysis to analyze electrical and physicochemical properties of seawater-affected agriculture soil
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-08-30. DOI: 10.1016/j.chemolab.2025.105520
Ajay L. Vishwakarma, Shruti O. Varma, M.R. Sonawane, Ajay Chaudhari
The impact of salinity on soil has become a major environmental challenge due to global warming and urbanization. The electrical properties of soil are intricately influenced by physicochemical properties, salinity levels, moisture content, and geological features of the land. This work aimed to evaluate the electrical and chemical properties of agricultural, riparian zone, and near-seafront salt marsh soils using a PC-based automated microwave X-band bench method at a frequency of 9.55 GHz with the ‘infinite sample’ technique. Chemical properties such as pH, sodium absorption ratio (SAR), exchangeable sodium percentage (ESP), organic carbon (OC), phosphorous (P), potassium (K), and micronutrients (Fe, Mn, Cu, and Zn), and physical properties such as porosity (PO) and particle and bulk density (PD and BD), were measured in triplicate using laboratory methods. Furthermore, Hierarchical Cluster Analysis (HCA) and Principal Component Analysis (PCA) were employed to classify and differentiate samples based on their properties, providing insights into underlying patterns and groupings. To accurately estimate the dielectric constant and dielectric loss, we implemented Multiple Linear Regression (MLR) and an Artificial Neural Network (ANN) model using feed-forward back propagation. To evaluate the performance and predictive accuracy of the developed models, statistical metrics such as Root Mean Square Error (RMSE) and the coefficient of determination (R²) were used. The R² and RMSE values of the dielectric constant obtained by the ANN model with PO, BD, PD, P, OC, K, and ESP as input variables were 0.99 and 9.23 × 10⁻⁴, and for dielectric loss were 0.98 and 2.93 × 10⁻², respectively. For MLR, the R² values of the dielectric constant and dielectric loss were 0.88 and 0.80. SHAP (SHapley Additive exPlanations) analysis, combined with an ANN model, revealed that the dielectric constant (DC) is influenced by the exchangeable sodium percentage (ESP), while the dielectric loss (DL) is only minutely affected.
Thus, ANN and SHAP accurately predicted dielectric properties of soil, offering a nondestructive and efficient approach for monitoring salinity effects on soil health.
Volume 267, Article 105520. Publication type: Journal Article.
Citations: 0
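The two evaluation metrics named in this abstract, RMSE and R², together with an ordinary least-squares MLR fit, can be sketched as follows. The data are synthetic and noise-free, generated only to exercise the functions; they have no relation to the soil measurements, and the helper names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares MLR; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic, noise-free data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 2.0 + X @ np.array([1.0, -0.5, 0.25])

coef = fit_mlr(X, y)
y_hat = predict_mlr(coef, X)
print(coef, rmse(y, y_hat), r2(y, y_hat))
```

RMSE is expressed in the units of the predicted quantity, whereas R² is dimensionless, so the two values reported for the dielectric constant and the dielectric loss are not directly comparable across targets by RMSE alone.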
Sharpness-aware minimization with physics-informed regularizations for predicting semiconductor material properties in molecular dynamics
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-08-30. DOI: 10.1016/j.chemolab.2025.105511
Dong-Hee Shin, Young-Han Son, Tae-Eui Kam
In recent years, the growing adoption of artificial intelligence across diverse scientific fields has significantly increased demand for advanced semiconductor chips, necessitating innovations in semiconductor material design. Accurate prediction of semiconductor material properties is essential for improving chip performance, as these properties directly affect electrical, thermal, and mechanical characteristics. Traditionally, density functional theory has been the gold standard for atomic-scale simulations in material property prediction; however, its high computational cost limits scalability. Molecular dynamics simulations provide a scalable alternative by leveraging the power of machine learning force fields (MLFFs); however, semiconductor systems present unique challenges due to non-equilibrium dynamics, surface defects, and impurities. These factors often result in out-of-distribution (OOD) atomic configurations, which can significantly degrade model performance. To address this challenge, we propose Physics-Informed Sharpness-Aware Minimization (PI-SAM), a novel framework designed to enhance the prediction of semiconductor material properties across diverse datasets and challenging OOD scenarios. Specifically, PI-SAM leverages sharpness-aware minimization to achieve flatter loss minima, improving the model’s generalization. Additionally, it incorporates physics-informed regularizations to enforce energy-force consistency and account for potential energy surface curvature, ensuring alignment with the underlying physical principles governing semiconductor behavior. Experimental results demonstrate that our PI-SAM outperforms competing methods, especially on OOD datasets, underscoring its effectiveness in improving generalization.
Volume 267, Article 105511. Publication type: Journal Article.
Citations: 0
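The sharpness-aware minimization step that PI-SAM builds on can be illustrated on a toy problem. The sketch below shows only the generic two-gradient SAM update, without the physics-informed regularizers described in the abstract; the quadratic loss, the learning rate `lr`, and the neighborhood radius `rho` are hypothetical choices for illustration.

```python
import numpy as np


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One generic SAM update: ascend to a nearby worst-case point, then descend."""
    g = grad_fn(w)
    # Perturbation of norm rho in the gradient direction (small constant avoids 0/0).
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Gradient evaluated at the perturbed weights drives the actual update.
    g_sharp = grad_fn(w + eps)
    return w - lr * g_sharp


# Toy quadratic loss L(w) = 0.5 * w^T A w with one steep and one flat direction.
A = np.diag([1.0, 10.0])


def loss(w):
    return 0.5 * w @ A @ w


def grad(w):
    return A @ w


w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(200):
    w = sam_step(w, grad)

print(loss(w0), loss(w))
```

Because the perturbation radius is fixed, the iterate settles in a small neighborhood of the minimum rather than exactly on it; in a neural network the same mechanism biases training toward flatter minima, which is the generalization argument made in the abstract.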
Beer's linguistics and chemistry: an investigation opening new research perspectives
IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS. Pub Date: 2025-08-30. DOI: 10.1016/j.chemolab.2025.105521
Nicola Cavallini, Francesco Savorani, Rasmus Bro, Marina Cocchi
In the last two decades, interest in food production and consumption has progressively grown, alongside the booming popularity of craft beer, fueled by micro-breweries and home brewing. Beer is a complex mixture of compounds — from carbohydrates to proteins and ethanol — shaped by the recipe, ingredients, and production process. Less obvious is that the human tongue, in synergy with the oral cavity and nose, acts as a powerful sensor array. Tasting experiences can be viewed as “analytical sessions”, where sensory signals processed by the brain determine not only if the beer is appreciated but also which tastes and flavours are perceived.
In our study, we investigated the connection between the “objective” chemical profile of beer and the “subjective” sensory descriptions from user reviews. We analysed 88 beers using near-infrared (NIR), visible, and nuclear magnetic resonance (NMR) spectroscopy, pairing them with text reviews processed through natural language processing (NLP) tools and converted into numerical data via a bag-of-words approach. Principal Component Analysis-Generalized Canonical Analysis (PCA-GCA) revealed correlations between chemical signals and topics like “hops,” “brown colour,” and “booze”. NMR data showed the strongest correlations, especially for hops-related terms, while visible spectra linked to colour descriptors. Automated topic extraction often performed comparably to manual term selection, suggesting potential for scalable studies. Despite limitations like dataset size and beer variety, this approach shows promise for aligning chemical composition with sensory perception, with applications for product development and broader food analysis.
A novel approach integrates text corpora with analytical data through chemometrics, linking language complexity to instrumental responses. Results showed strong correlations, like NMR signals with hops-related terms and visible spectra with beer colour. This previously unexplored connection opens the door to designing food products tailored to consumer preferences. The approach is broadly applicable, from food science to medical diagnosis or aligning expert opinions with factual data.
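The bag-of-words step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer (lowercase, alphabetic tokens only), the function name `bag_of_words`, and the two sample reviews are assumptions made purely for illustration. Only the idea of turning review text into a numeric term-count matrix comes from the abstract.

```python
from collections import Counter
import re


def bag_of_words(reviews, vocabulary=None):
    """Turn free-text reviews into a document-term count matrix."""
    # Simplifying assumption: lowercase the text and keep alphabetic tokens only.
    tokenized = [re.findall(r"[a-z]+", text.lower()) for text in reviews]
    if vocabulary is None:
        # Default vocabulary: every distinct token, in alphabetical order.
        vocabulary = sorted({tok for doc in tokenized for tok in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical reviews, invented for illustration only.
reviews = [
    "Strong hops aroma, hops all the way",
    "Brown colour and a boozy finish",
]
vocab, X = bag_of_words(reviews, vocabulary=["hops", "brown", "colour", "booze"])
for row in X:
    print(row)
```

Each row of `X` is one review and each column one term, so the matrix can be paired with spectral data blocks in a multiblock analysis. Exact matching misses "boozy" for the term "booze", which is one reason real NLP pipelines usually add stemming or lemmatization.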
Citations: 0
Not from scratch: Explainable deep transfer learning fine-tunning with domain adaptation enables trustworthy COVID-19 prediction
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-08-28 DOI: 10.1016/j.chemolab.2025.105517
Bingqiang Zhao, Honglin Zhai, Tianhua Wang, Haiping Shao, Ling Zhu
Medical image analysis can help diagnose Coronavirus Disease 2019 (COVID-19) early and save patients' lives before the disease worsens. However, manual inspection of these images has several limitations, such as dependence on physician experience and subjective assessment. To enable fast and precise diagnosis, we propose XDTLMI-Net, a framework built on four CNNs (GoogLeNet, ResNet18, ResNet50, ResNet101) that are well suited to image data processing. The framework uses existing medical domain knowledge to guide transfer learning for COVID-19 computed tomography (CT) scans and chest X-ray (CXR) images. XDTLMI-Net performed three COVID-19 medical image classification tasks on three public datasets: COVID-19 CT, SARS-COV-2 CT and COVID-19 CXR. It achieved average classification accuracies of 0.9897, 0.9752 and 0.9397 and average F1-scores of 0.9898, 0.9741 and 0.9394, respectively. Moreover, we employed SHapley Additive exPlanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM) to interpret the COVID-19 predictions and to help explain the models' decision-making process. Overall, XDTLMI-Net is a general end-to-end framework based on CNNs and transfer learning that works on small medical image datasets and requires no segmentation or image preprocessing. It performed strongly on all three datasets during fine-tuning and assigned reasonable importance to each input COVID-19 image, showing its potential for application in different clinical scenarios.
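The abstract reports accuracy and F1-score for each classification task. The sketch below shows how the two metrics are computed from true and predicted labels. The macro-averaging over classes, the function name `accuracy_and_macro_f1`, and the label vectors are all assumptions for illustration; the abstract does not state how its averages were formed.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical labels (1 = COVID-19, 0 = non-COVID), invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(acc, macro_f1)
```

Reporting F1 alongside accuracy matters when classes are imbalanced, because accuracy alone can look high even when the minority class is poorly recognized.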
Citations: 0
FT-NIR combined with multiple intelligent algorithms for rapid identification and quantitative analysis of Iron Mineral Decoction Pieces
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-08-26 DOI: 10.1016/j.chemolab.2025.105512
Yangqian Wu, Yi Wan, Jin Li, Xiangyi Wen, Xiaolan Zhang, Can Zhang, Xiaoli Zhao
Calcined and Vinegar-quenched Magnetite (CVQM), Calcined and Vinegar-quenched Hematite (CVQH), Calcined and Vinegar-quenched Pyrite (CVQP), and Calcined and Vinegar-quenched Limonite (CVQL) are all iron-containing mineral decoction pieces that are easily confused because of their similar primary compositions and appearances. However, their medicinal values differ significantly, and misuse in clinical settings could pose substantial safety risks to patients. In this study, E-eye and Fourier transform near-infrared (FT-NIR) spectroscopy combined with multivariate algorithms were employed for the qualitative identification of these four kinds of mineral decoction pieces and the quantitative prediction of their iron content. The results indicated that the PCA model and the machine learning classification models based on E-eye data were ineffective at distinguishing among the four types of decoction pieces, achieving an accuracy below 80 %. By contrast, applying SNV + ICO optimization to the raw FT-NIR spectra gave machine learning classification accuracies of around 90 %, an improvement of 28 %–36 % over analyses based solely on raw spectra. Additionally, the partial least squares regression (PLSR) model for predicting iron content achieved $R^2_C$ = 0.9627 and $R^2_P$ = 0.9451, indicating strong linearity and predictive accuracy. Overall, this study demonstrated that FT-NIR combined with multivariate algorithms provides an effective approach for identifying and evaluating the quality of mineral medicines with similar appearances and compositions.
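The SNV (standard normal variate) preprocessing named in the abstract can be sketched as follows. SNV centres and scales each spectrum by its own mean and standard deviation. The use of the sample standard deviation (n - 1), the function name `snv`, and the absorbance values are assumptions for illustration, and the ICO variable-selection step is not shown.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1); an assumption
    if sigma == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mu) / sigma for x in spectrum]


# Hypothetical absorbance values, invented for illustration only.
raw = [0.52, 0.55, 0.61, 0.58, 0.54]
# The same spectrum with a multiplicative and an additive scatter-like distortion.
distorted = [2 * x + 0.1 for x in raw]

print(snv(raw))
print(snv(distorted))
```

Because the two printed spectra agree to within floating-point error, the sketch shows why SNV is commonly used to suppress baseline offsets and multiplicative scatter effects before classification or regression.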
Citations: 0
Non-destructive aging evaluation of transformer insulation oil via Raman spectroscopy and ensemble learning with KPCA feature extraction
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-08-23 DOI: 10.1016/j.chemolab.2025.105514
Feng Hu, Ziyue Pu, Rongying Dai, Wendou Gan, Junchao Liang, Yulong Zhang, Mengxiao Ni, Yan Ge, Hang Wu, Penghui Chen
Transformer insulating oil aging critically impacts power system reliability. This study develops a non-destructive aging evaluation method using Raman spectroscopy with kernel principal component analysis (KPCA) and ensemble learning. Raman spectral data were obtained through accelerated thermal aging experiments and a spectral detection platform, and were then preprocessed using moving-average smoothing, Savitzky-Golay filtering, and Gaussian filtering. Next, Raman features were extracted using KPCA with four kernel functions (linear, polynomial, Gaussian, and sigmoid), and evaluation performance was compared using a decision tree. Finally, four weak classifiers (DT, LDA, SVM, and BPNN) were integrated to construct the final ensemble learning evaluation model. Results showed that Gaussian filtering achieved the highest signal-to-noise ratio (35.23 dB), that Gaussian-kernel KPCA yielded the best feature extraction with 96.88 % average accuracy, and that the BPNN ensemble learning evaluation model delivered the highest accuracy of 99.6 %. To verify the benefits of KPCA for feature extraction and the robustness of the model, the study also ran a comparative test against traditional principal component analysis (PCA) and introduced noise of various types and intensities into the test set. The study found that the model can effectively evaluate the aging state of transformer insulating oil and is highly robust to noise, providing a new method for improving transformer operating status monitoring.
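The first steps of Gaussian-kernel KPCA can be sketched as follows: build the kernel (Gram) matrix between spectra, then centre it in feature space. The principal components are the leading eigenvectors of the centred matrix, and that eigendecomposition is omitted here. The `gamma` value, the function names, and the toy spectra are assumptions for illustration, not the authors' settings.

```python
import math


def gaussian_kernel_matrix(samples, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2) for every pair of samples."""
    n = len(samples)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def center_kernel_matrix(K):
    """Centre K in feature space: Kc = K - 1K - K1 + 1K1 (1 = matrix of 1/n)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    # K is symmetric, so column means equal row means.
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical three-point "spectra", invented for illustration only.
spectra = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [3.0, 1.0, 0.5]]
K = gaussian_kernel_matrix(spectra, gamma=0.5)
Kc = center_kernel_matrix(K)
for row in Kc:
    print([round(v, 4) for v in row])
```

In practice a library routine such as scikit-learn's `KernelPCA` handles the centring and eigendecomposition together. The sketch only makes explicit what the kernel choice changes, namely the entries of `K`.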
Citations: 0