Chemometrics and Intelligent Laboratory Systems: Latest Articles

Dual-stage variable selection: Integrating static filtering and dynamic refinement for high-dimensional NIR analysis
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-11 DOI: 10.1016/j.chemolab.2025.105533
Shiyu Liu, Xuan Liu, Shutao Wang, Chunhai Hu, Lide Fang, Xiaoli Yan
Near-infrared (NIR) spectra contain a large number of overlapping absorption feature variables, typically far more than the number of available samples. Variable selection is widely recognized as an effective strategy for mitigating the curse of dimensionality in high-dimensional spectral datasets. This study presents a dual-stage variable selection scheme, termed JMIM-RFE, for high-dimensional spectral data analysis. It integrates recursive feature elimination (RFE) with joint mutual information based on the maximum of the minimum (JMIM) and is implemented through support vector machine (SVM) classification. JMIM is first used for fast static filtering of redundant and irrelevant variables; RFE-based dynamic iterative refinement then shrinks the variable space while retaining critical spectral features. To assess its efficacy, validation experiments were carried out on three distinct high-dimensional NIR datasets, with particular attention directed towa
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105533
Citations: 0
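The static filtering stage named in the abstract above can be illustrated with a minimal sketch of greedy JMIM selection: pick the most relevant variable first, then repeatedly add the candidate whose worst-case joint mutual information with the already selected variables is largest. The sketch assumes the spectra have already been discretized into integer bins and uses a plug-in estimate of mutual information. The function names, the toy data, and the stopping rule `k` are illustrative assumptions, not details taken from the paper, and the RFE/SVM refinement stage is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        c / n * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def jmim_select(columns, labels, k):
    """Greedy JMIM: return the indices of k selected variables.

    columns: list of discretized variables, one list of values per variable.
    labels:  class label per sample.
    """
    remaining = list(range(len(columns)))
    # Start with the single most relevant variable.
    first = max(remaining, key=lambda j: mutual_information(columns[j], labels))
    selected = [first]
    remaining.remove(first)

    while remaining and len(selected) < k:
        def score(j):
            # Worst-case joint information of candidate j with any selected variable.
            return min(
                mutual_information(list(zip(columns[j], columns[s])), labels)
                for s in selected
            )

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): the class is determined by variables a and b together.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [2 * ai + bi for ai, bi in zip(a, b)]
columns = [a, list(a), b, noise]  # index 1 is a redundant copy of index 0

selected = jmim_select(columns, labels, k=2)
print(selected)  # [0, 2]: the redundant copy and the noise variable are skipped
```

In a dual-stage scheme of the kind the abstract describes, the indices returned here would be handed to a wrapper such as RFE, which refits a classifier and discards the weakest variables at each iteration.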
Implementation of artificial intelligence and multivariate analysis to analyze electrical and physicochemical properties of seawater-affected agriculture soil
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-08-30 DOI: 10.1016/j.chemolab.2025.105520
Ajay L. Vishwakarma, Shruti O. Varma, M.R. Sonawane, Ajay Chaudhari
The impact of salinity on soil has become a major environmental challenge due to global warming and urbanization. The electrical properties of soil are intricately influenced by physicochemical properties, salinity levels, moisture content, and geological features of the land. This work aimed to evaluate the electrical and chemical properties of agricultural, riparian-zone, and near-seafront salt marsh soils using a PC-based automated microwave X-band bench method at a frequency of 9.55 GHz with the ‘infinite sample’ technique. Chemical properties such as pH, sodium adsorption ratio (SAR), exchangeable sodium percentage (ESP), organic carbon (OC), phosphorus (P), potassium (K), and micronutrients (Fe, Mn, Cu, and Zn), and physical properties such as porosity (PO) and particle and bulk density (PD and BD), were measured in triplicate using laboratory methods. Furthermore, Hierarchical Cluster Analysis (HCA) and Principal Component Analysis (PCA) were employed to classify and differentiate samples based on their properties, providing insights into underlying patterns and groupings. To estimate the dielectric constant (DC) and dielectric loss (DL), we implemented Multiple Linear Regression (MLR) and a feed-forward Artificial Neural Network (ANN) model trained with backpropagation. Model performance and predictive accuracy were evaluated with the Root Mean Square Error (RMSE) and the coefficient of determination (R²). With PO, BD, PD, P, OC, K, and ESP as input variables, the ANN model gave R² and RMSE values of 0.99 and 9.23 × 10⁻⁴ for the dielectric constant and 0.98 and 2.93 × 10⁻² for the dielectric loss. For MLR, the R² values for the dielectric constant and dielectric loss were 0.88 and 0.80, respectively. SHAP (SHapley Additive exPlanations) analysis, combined with the ANN model, revealed that the DC is influenced by the ESP, while the DL is only minimally affected. Thus, the ANN, interpreted with SHAP, accurately predicted the dielectric properties of soil, offering a nondestructive and efficient approach for monitoring salinity effects on soil health.
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105520
Citations: 0
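The two metrics used to compare the MLR and ANN models in the abstract above, RMSE and the coefficient of determination, can be computed as in the following minimal sketch. The function names and the sample values are made up for illustration and are unrelated to the soil measurements reported in the paper.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(rmse(measured, predicted))       # 0.5
print(r_squared(measured, predicted))  # 0.8
```

A lower RMSE and an R² closer to 1 indicate a better fit, which is the sense in which the abstract ranks the ANN above MLR.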
Artificial neural network-assisted study on thermohydrodynamic behavior of tetrahybrid nanofluids in a porous stretching cylinder
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-19 DOI: 10.1016/j.chemolab.2025.105537
Pooja Devi, Bhuvaneshvar Kumar
This study explores the flow dynamics and thermal characteristics of a tetrahybrid nanofluid over a stretching cylinder, considering the effects of a magnetic field and internal heat generation. Two distinct tetrahybrid nanofluids are examined for the comparative analysis of temperature, pressure, velocity distributions, skin friction, and heat transfer performance: one composed of Ag+SiO₂+TiO₂+Al₂O₃ suspended in kerosene oil, and the other consisting of Au+CuO+Fe₃O₄+multi-walled carbon nanotubes (MWCNTs) dispersed in water. The governing equations are solved numerically using the fourth-order Runge–Kutta method coupled with a shooting strategy and an artificial neural network (ANN). Parametric studies revealed that the Au+CuO+Fe₃O₄+MWCNTs nanofluid exhibited superior thermal performance, characterized by higher Nusselt numbers, while the Ag+SiO₂+TiO₂+Al₂O₃ nanofluid provided enhanced momentum transport and higher velocity profiles. Au+CuO+Fe₃O₄+MWCNTs shows stronger pressure resistance near the surface, while Ag+SiO₂+TiO₂+Al₂O₃ yields greater skin friction due to higher effective viscosity. An ANN was trained using Bayesian regularization to accurately predict skin friction and Nusselt number values. The Au+CuO+Fe₃O₄+MWCNTs nanofluid is well suited to high-efficiency thermal management systems, including electronics cooling, solar collectors, and magnetothermal transport. In contrast, the Ag+SiO₂+TiO₂+Al₂O₃ nanofluid is advantageous in polymer extrusion, coating, and lubrication processes. Including porous-medium effects further extends the applicability to geothermal systems, packed-bed reactors, and smart heat exchangers. The ANN predictions agree well with the numerical results, with correlation coefficients above 0.9999, demonstrating the network's reliability as a surrogate model.
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105537
Citations: 0
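The numerical approach named in the abstract above, fourth-order Runge–Kutta combined with shooting, can be sketched on a toy boundary value problem. The abstract does not give the governing equations, so the sketch assumes a stand-in problem, y'' = −y with y(0) = 0 and y(π/2) = 1, whose exact solution is y = sin x with initial slope 1. The function names, step count, and bisection bracket are illustrative assumptions.

```python
from math import pi


def rk4_integrate(f, x0, y0, x1, n_steps):
    """Integrate the first-order system y' = f(x, y) from x0 to x1 with classical RK4."""
    h = (x1 - x0) / n_steps
    x, y = x0, list(y0)
    for _ in range(n_steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [
            yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
        ]
        x += h
    return y


def boundary_residual(slope):
    """Mismatch at the far boundary when starting with a guessed initial slope."""
    # y'' = -y rewritten as the system (y, y')' = (y', -y).
    y_end = rk4_integrate(lambda x, y: [y[1], -y[0]], 0.0, [0.0, slope], pi / 2, 200)
    return y_end[0] - 1.0


def solve_by_shooting(lo, hi, tol=1e-10):
    """Bisect on the unknown initial slope until the far boundary condition is met."""
    f_lo = boundary_residual(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = boundary_residual(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


slope = solve_by_shooting(0.0, 2.0)
print(slope)  # close to 1.0, the exact initial slope of sin(x)
```

In a surrogate-modelling workflow like the one the abstract describes, solutions produced this way for many parameter settings would form the training set for the ANN.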
Robust soft sensor development based on Dirichlet process mixture of regression model for multimode processes
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-10-11 DOI: 10.1016/j.chemolab.2025.105550
Changrui Xie, Xi Chen
Industrial processes often exhibit multimode characteristics due to factors such as load variations, equipment changes, and feedstock fluctuations. This paper introduces a Dirichlet Process-based Twofold-Robust Mixture Regression Model (DPR²MRM) for multimode processes. As a Bayesian nonparametric model, it automatically determines the number of mixture components from observed data using Dirichlet process mixture techniques, avoiding underfitting and overfitting. The model employs a Student's-t mixture model for input space learning, leveraging its long-tail properties for robust mode identification. For each mode, a regression model is built to capture the relationship between inputs and outputs, incorporating Student's-t noise to ensure robustness against output-space outliers. The optimal posteriors of the model parameters are inferred within a full Bayesian framework, and an analytical posterior predictive distribution is derived. The effectiveness of the DPR²MRM is demonstrated through a numerical example and two industrial applications.
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105550
Citations: 0
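One common way to see why a Dirichlet process mixture does not need a fixed number of components is the stick-breaking construction of its mixing weights: draw v_k ~ Beta(1, α) and set π_k = v_k ∏_{j<k}(1 − v_j). The sketch below shows only this prior construction, truncated at a finite number of sticks. The abstract does not say which representation or inference scheme the authors use, so the function name, the truncation level, and the value of `alpha` are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, n_components, rng):
    """Draw truncated stick-breaking weights for a Dirichlet process prior.

    Returns the first n_components weights and the stick length left over.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_components):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights, leftover = stick_breaking_weights(alpha=1.0, n_components=10, rng=rng)
print([round(w, 3) for w in weights], round(leftover, 6))
```

The weights decay quickly on average, so only a few components carry appreciable mass; a smaller `alpha` concentrates mass on fewer components, while a larger one spreads it over more.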
Federated learning with local–global collaboration for predicting acute coronary syndrome
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-18 DOI: 10.1016/j.chemolab.2025.105515
Yonggong Ren, Jia Shang, Meiwei Zhang, Xiaolu Xu, Zhaohong Geng
Acute Coronary Syndrome (ACS) is a prevalent cardiovascular disease characterized by high incidence and mortality rates. Numerous studies have focused on utilizing artificial intelligence and machine learning algorithms to assess and predict the risk of ACS in patients. However, due to the sensitivity and privacy of medical data, training machine learning models on a centralized server that aggregates ACS data from various institutions poses certain risks. For the first time, this study validates the effectiveness of utilizing federated learning to collaboratively analyze medical data for predicting ACS. A federated learning-based ACS prediction model, i.e., FedLG, which incorporates local–global collaboration for mutual correction, is presented accordingly. On the client side, a regularization term is added to the loss function to reduce deviations caused by heterogeneous data, helping the global model remain accurate and representative. On the server side, gradient normalization is applied to balance contributions from clients with different update frequencies, resulting in a more stable and reliable global model. Comprehensive experiments on the ACS dataset from a tertiary hospital in China show that FedLG consistently outperforms models trained on individual clients, as well as three other federated baselines, across seven evaluation metrics under both IID and non-IID settings. Temporal hold-out validation further indicates that FedLG maintains better generalizability than other baselines. In addition, analysis of feature importance shows that FedLG identifies lipid-related biomarkers, which aligns with clinical knowledge, enhancing the interpretability of the results. The source code of FedLG is freely available at https://github.com/bioinformatics-xu/FedLG.
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105515
Citations: 0
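The two mechanisms described in the abstract above, a client-side regularization term and server-side normalization across clients with different update frequencies, can be illustrated with a toy scalar model. The abstract does not give FedLG's exact formulas, so this sketch assumes a proximal penalty (μ/2)(w − w_global)² on the client, in the spirit of FedProx-style methods, and a server that divides each client's total change by its number of local steps. All names and values are hypothetical; the authors' actual implementation is in the repository linked in the abstract.

```python
def local_update(w_global, data, steps, lr=0.1, mu=0.5):
    """Run local gradient steps on a scalar model w that predicts a constant.

    Loss: mean squared error on the client's data plus a proximal penalty
    (mu / 2) * (w - w_global) ** 2 that discourages drifting from the global model.
    """
    w = w_global
    for _ in range(steps):
        grad_loss = sum(2.0 * (w - y) for y in data) / len(data)
        grad_prox = mu * (w - w_global)
        w -= lr * (grad_loss + grad_prox)
    return w


def aggregate(w_global, client_weights, client_steps):
    """Combine client models after normalizing each update by its local step count."""
    per_step = [(w - w_global) / s for w, s in zip(client_weights, client_steps)]
    mean_direction = sum(per_step) / len(per_step)
    mean_steps = sum(client_steps) / len(client_steps)
    return w_global + mean_steps * mean_direction


# Two hypothetical clients with heterogeneous data and different update counts.
w_global = 0.0
client_data = [[1.0, 1.2, 0.8], [3.0, 2.9, 3.1]]
client_steps = [2, 10]

client_weights = [
    local_update(w_global, data, steps)
    for data, steps in zip(client_data, client_steps)
]
new_global = aggregate(w_global, client_weights, client_steps)
print(client_weights, new_global)
```

Without the division by step count, the client that runs ten local steps would pull the global model toward its own data far more strongly than the client that runs two.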
High-speed processing of hyperspectral images for enabling demanding industrial applications
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-10 DOI: 10.1016/j.chemolab.2025.105531
Emil Rynkeby Kristensen, Jonas Dornonville de la Cour, Tobias Warburg, René Lynge Eriksen, Bjarke Jørgensen, James Emil Avery, Mogens Hinge
Inline industrial application of high-speed hyperspectral imaging for real-time chemometric analysis presents a computationally difficult problem due to the complexity of the analysis and the large amount of spectral data that needs to be processed in real time. The image resolution and acquisition rate of modern sensors, as well as increased ambition for detail and accuracy, make it a challenge to design computational methods and to implement them efficiently enough to complete within the few milliseconds available between frames. Real-time chemometrics, including intensity calibration, Savitzky-Golay filtering, principal component analysis, and support vector machine classification for plastic identification, was performed directly on a hyperspectral camera. Three processing scenarios were evaluated: a Python-based CPU implementation, a C++ CPU implementation, and a GPU implementation using OpenCL. The performance was assessed in terms of total processing time per image. The results demonstrate that GPU-based processing increased the frame rate to 160 fps, compared to 35 fps and 94 fps achieved with CPU-based processing. Analysis shows that the speed of the GPU-based processing is limited by the image acquisition rate of the sensor. The GPU processing has excess computational capacity, which enables integration of more complex classification models or parallel execution of multiple models with different purposes. Removing data processing as the limiting factor of performance increases the industrial relevance of hyperspectral imaging systems.
由于分析的复杂性和需要实时处理的大量光谱数据,高速高光谱成像用于实时化学计量分析的在线工业应用提出了一个计算难题。现代传感器的图像分辨率和采集率,以及对细节和准确性的追求,使得设计计算方法并在帧之间的几毫秒内足够有效地完成它们成为一项挑战。实时化学计量学包括强度校准、Savitzky-Golay滤波、主成分分析和支持向量机分类,直接在高光谱相机上进行塑料识别。评估了三种处理方案:基于python的CPU实现,c++ CPU实现和使用OpenCL的GPU实现。性能是根据每张图像的总处理时间来评估的。结果表明,基于gpu的处理可以将帧率提高到160 fps,而基于cpu的处理可以达到35 fps和94 fps。分析表明,基于GPU的处理速度受到传感器图像采集速率的限制。GPU处理具有过剩的计算能力,可以集成更复杂的分类模型或并行执行多个不同用途的模型。消除数据处理作为性能的限制因素,增加了高光谱成像系统的工业相关性。
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105531.
Citations: 0
Multiple features fusion and mixup with conditional decoder for
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-12 DOI: 10.1016/j.chemolab.2025.105534
Youpeng Fan, Yongchun Fang
In recent years, the combination of vibrational spectral data and data-driven methods has dominated the development and application of closed-set spectral recognition. In practical applications, however, open spectral categories (i.e., novel or unknown spectral categories) may be encountered, because collecting a comprehensive set of categories is time-consuming and requires professional expertise. The intuitive solution is to obscure features of different categories, but relevant exploratory experiments yield unsatisfactory open-set performance, which may be attributed to sparse spectral features and high inter-class similarity. To remedy this issue, we propose an end-to-end scheme combining Multiple Features Fusion and Mixup with Conditional Decoder (MFFMCD). In particular, to enhance feature representation, MFFMCD adopts two auxiliary feature extraction modules and fuses the features from the different branches. Additionally, to cope with high inter-class similarity, the enhanced features are obscured within a mini-batch and restored to the corresponding class samples through a conditional decoder to mimic the feature distribution of unknown classes. Experiments on three publicly available spectral datasets show that the proposed MFFMCD significantly outperforms existing methods. Finally, extensive ablation studies are conducted to investigate the effectiveness, correctness, and robustness of the proposal.
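The mini-batch mixing step can be illustrated with standard mixup, in which each feature vector is blended with a randomly chosen partner from the same batch. This is a minimal sketch under the assumption that the mixing is a plain convex combination. The abstract does not give the exact rule, and `mixup_batch`, `lam`, and the toy batch are hypothetical names and values. The conditional decoder is not shown.

```python
import random


def mixup_batch(features, lam, rng):
    """Blend each feature vector with a random partner from the same batch.

    features: list of equal-length lists of floats (one per sample)
    lam:      mixing weight in [0, 1] applied to the original sample
    rng:      random.Random instance used to pick the partners
    """
    partners = list(range(len(features)))
    rng.shuffle(partners)  # partner index for each sample
    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(features[i], features[j])]
        for i, j in enumerate(partners)
    ]
    return mixed, partners


# Toy batch of three 2-dimensional feature vectors (made-up values).
batch = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
mixed, partners = mixup_batch(batch, lam=0.5, rng=random.Random(0))
```

With `lam=1.0` the batch is returned unchanged. Smaller values pull each sample toward its partner, which is how blended features can stand in for classes not seen during training.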
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105534.
Citations: 0
Time series analysis of nucleic acid reactions via a generalized transformer model
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-06 DOI: 10.1016/j.chemolab.2025.105522
Canfeng Liu, Binhui Wang, Hui Dong, Yihan Pan, Jiawen Lin, Jintian Yang, Yihui Tao, Hao Sun
The contemporary landscape of medical diagnostics and therapeutic interventions has witnessed a remarkable surge in the production of time series data. Artificial intelligence (AI), particularly deep learning, has shown promise in uncovering the high-dimensional, meaningful information hidden in these diagnostic data. In this work, we propose a novel analytical approach for intelligent nucleic acid amplification tests (NAAT) based on deep learning and paper microfluidics. On-chip amplification data were fed directly to a deep learning model derived from the Transformer neural network. To facilitate development and deployment of the approach, we applied lightweight processing to the Transformer model. The capacity of the model to accurately predict the reaction trend and end-point value was then validated. We also used ablation experiments to evaluate the effects of various parameters on prediction performance and then optimized the model. Three clinical datasets comprising 706 positive and 205 negative samples obtained from Fujian Provincial Hospital were used to verify the generalization of the approach. Without any modification of the model structure or hyperparameters, the accuracy, sensitivity, and specificity of the presented approach were 98.28 %, 97.52 %, and 99.02 %, respectively. Further comparison studies with nine different AI algorithms, including recurrent neural networks and long short-term memory, were performed. The presented study has the potential to facilitate routine diagnostic tasks for pandemic prevention and to propel the development of smart portable instruments.
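For reference, the three reported metrics follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions. The counts in the example are hypothetical and are not taken from the paper, which does not report its per-dataset confusion matrices in the abstract.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive samples classified correctly / incorrectly
    tn/fp: negative samples classified correctly / incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = diagnostic_metrics(tp=90, fn=10, tn=45, fp=5)
```

Sensitivity depends only on the positive samples and specificity only on the negative ones, so a class imbalance such as 706 positives to 205 negatives affects accuracy but neither of the other two metrics directly.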
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105522.
Citations: 0
Self-attention embedded StyleGAN for virtual sample generation in sensing applications
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-04 DOI: 10.1016/j.chemolab.2025.105519
Xue-Yu Zhang, Qun-Xiong Zhu, Ming-Jia Liu, Feng Ma, Yi Luo, Wei Ke, Yan-Lin He, Ming-Qing Zhang, Yuan Xu
Low variability in industrial processes intensifies data scarcity and produces anomalous distributions, both of which compromise the accuracy of data-driven models. Existing sample generation methods often overlook key factors such as sparsity and correlation among data. To address these challenges, this paper proposes a StyleGAN-based virtual sample generation method with an embedded self-attention mechanism (SASG-VSG). First, StyleGAN is used to map the original data space to a disentangled latent space. The output variables then act as control conditions, guiding the model to interpolate along the output dimension so that the generated samples are more uniformly distributed. In addition, a self-attention module is incorporated into the discriminator to enhance its ability to capture the similarity between the virtual samples and the original data distribution. Finally, validation experiments on a purified terephthalic acid (PTA) solvent system and a sulfur recovery unit (SRU) confirm the capability of the proposed SASG-VSG to generate high-quality virtual samples for soft-sensing applications.
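The self-attention module mentioned above can be illustrated with plain scaled dot-product attention. This is a simplified sketch, not the paper's discriminator: it assumes identity projections (queries, keys, and values are all the input itself), whereas practical modules learn separate projection matrices. The function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x: list of equal-length feature vectors, one per position.
    Returns the attended vectors and the attention weight matrix.
    """
    dim = len(x[0])
    scale = math.sqrt(dim)
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in x] for q in x]
    weights = [softmax(row) for row in scores]
    out = [
        [sum(w * v[c] for w, v in zip(w_row, x)) for c in range(dim)]
        for w_row in weights
    ]
    return out, weights


attended, attn = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each row of `attn` sums to 1, so every output vector is a weighted average of the inputs, with larger weights on the positions most similar to the query.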
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105519.
Citations: 0
Self-attention based Difference Long Short-Term Memory Network for Industrial Data-driven Modeling
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-15 Epub Date: 2025-09-20 DOI: 10.1016/j.chemolab.2025.105535
Xiaoqing Zheng, Bo Peng, Anke Xue, Ming Ge, Yaguang Kong, Aipeng Jiang
In modern industry, soft sensors provide real-time predictions of quality variables that are difficult to measure directly with physical sensors. However, in industrial processes, changes in material properties, catalyst deactivation, and other factors often lead to shifts in data distribution. Existing soft sensor models often overlook the impact of these distribution changes on performance. To address the performance degradation caused by changes in data distribution, this paper proposes a self-attention based Difference Long Short-Term Memory (SA-DLSTM) network for soft sensor modeling. By employing self-attention, industrial raw data is refined to facilitate the extraction of nonlinear features, thereby reducing the difficulty of modeling. A Difference Channel is designed to perform correlation analysis and select significant features from the raw data, followed by extracting the difference information that can reveal changes in the data distribution. The SA-DLSTM soft sensor model is established and validated on two benchmark industrial datasets: Debutanizer Column and Sulfur Recovery Unit. Comparisons with benchmark models and state-of-the-art models show that SA-DLSTM achieves the best performance across all evaluation metrics, demonstrating the effectiveness of the proposed model.
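The two steps attributed to the Difference Channel (correlation-based feature selection, then difference extraction) can be sketched as follows. This is an illustration under stated assumptions: the abstract does not name the correlation measure, the selection threshold, or the difference order, so Pearson correlation, a fixed threshold, and first-order differences are used here. The function names and toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return cov / (norm_x * norm_y)


def difference_features(columns, target, threshold):
    """Keep variables correlated with the target, then take first-order differences.

    columns:   dict mapping variable name to its time-ordered values
    target:    time-ordered values of the quality variable
    threshold: minimum absolute correlation for a variable to be kept
    """
    selected = [
        name for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]
    return {
        name: [later - earlier for earlier, later in zip(columns[name], columns[name][1:])]
        for name in selected
    }


# Toy process data (made-up values): "temp" tracks the target, "noise" does not.
process = {"temp": [1.0, 2.0, 3.0, 4.0], "noise": [1.0, -1.0, 1.0, -1.0]}
quality = [2.0, 4.0, 6.0, 8.0]
diffs = difference_features(process, quality, threshold=0.8)
```

The differenced series has one fewer element than the input. A sustained change in its level is one simple indicator that the underlying data distribution is drifting.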
Chemometrics and Intelligent Laboratory Systems, vol. 267, Article 105535.
Citations: 0