Robust baseline correction for Raman spectra by constrained Gaussian radial basis function fitting
Sungwon Park, Hongjoong Kim
Pub Date: 2024-08-22 | DOI: 10.1016/j.chemolab.2024.105205 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105205
Accurate baseline correction is a fundamental requirement for extracting meaningful spectral information and enabling precise quantitative analysis using Raman spectroscopy. Although numerous baseline correction techniques have been developed, they often require meticulous parameter adjustments and yield inconsistent results. To address these challenges, we have introduced a novel approach, namely constrained Gaussian radial basis function fitting (CGF). Our method involves solving a curve-fitting problem using Gaussian radial basis functions under specific constraints. To ensure stability and efficiency, we developed a linear programming algorithm for the proposed approach. We evaluated the performance of CGF using simulated Raman spectra and demonstrated its robustness across various scenarios, including changes in data length and noise levels. In contrast to standard methods, which frequently require complicated parameter adjustments and may exhibit varying errors, our approach provides a simple parameter search and consistently achieves low errors. We further assessed CGF using real Raman spectra, leading to enhanced accuracy in the quantitative analysis of the Raman spectra of chemical warfare agents. Our results emphasize the potential of CGF as a valuable tool for Raman spectroscopy data analysis, significantly advancing sophisticated analytical techniques.
{"title":"Robust baseline correction for Raman spectra by constrained Gaussian radial basis function fitting","authors":"Sungwon Park, Hongjoong Kim","doi":"10.1016/j.chemolab.2024.105205","DOIUrl":"10.1016/j.chemolab.2024.105205","url":null,"abstract":"<div><p>Accurate baseline correction is a fundamental requirement for extracting meaningful spectral information and enabling precise quantitative analysis using Raman spectroscopy. Although numerous baseline correction techniques have been developed, they often require meticulous parameter adjustments and yield inconsistent results. To address these challenges, we have introduced a novel approach, namely constrained Gaussian radial basis function fitting (CGF). Our method involves solving a curve-fitting problem using Gaussian radial basis functions under specific constraints. To ensure stability and efficiency, we developed a linear programming algorithm for the proposed approach. We evaluated the performance of CGF using simulated Raman spectra and demonstrated its robustness across various scenarios, including changes in data length and noise levels. In contrast to standard methods, which frequently require complicated parameter adjustments and may exhibit varying errors, our approach provides a simple parameter search and consistently achieves low errors. We further assessed CGF using real Raman spectra, leading to enhanced accuracy in the quantitative analysis of the Raman spectra of chemical warfare agents. Our results emphasize the potential of CGF as a valuable tool for Raman spectroscopy data analysis, significantly advancing sophisticated analytical techniques.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105205"},"PeriodicalIF":3.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142049079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Supervised and penalized baseline correction
Erik Andries, Ramin Nikzad-Langerodi
Pub Date: 2024-08-20 | DOI: 10.1016/j.chemolab.2024.105200 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105200
Spectroscopic measurements can show distorted spectral shapes arising from a mixture of absorbing and scattering contributions. These distortions (or baselines) often manifest themselves as non-constant offsets or low-frequency oscillations. As a result, these baselines can adversely affect analytical and quantitative results. Baseline correction is an umbrella term for pre-processing methods that estimate the baseline spectra (the unwanted distortions) and then remove them by differencing. However, current state-of-the-art baseline correction methods do not use analyte concentrations even when they are available, or even when they contribute significantly to the observed spectral variability. We modify a class of state-of-the-art methods (penalized baseline correction) so that a priori analyte concentrations can easily be incorporated to enhance predictions. This modified approach is termed supervised and penalized baseline correction (SPBC). Performance is assessed on two near-infrared data sets for both classical penalized baseline correction methods (without analyte information) and the modified penalized baseline correction methods (leveraging analyte information). There are cases in which SPBC provides useful baseline-corrected signals that outperform state-of-the-art penalized baseline correction algorithms such as AIRPLS. In particular, we observe that performance depends on the correlation between the analyte used for baseline correction and the analyte used for prediction: the greater this correlation, the better the prediction performance.
{"title":"Supervised and penalized baseline correction","authors":"Erik Andries , Ramin Nikzad-Langerodi","doi":"10.1016/j.chemolab.2024.105200","DOIUrl":"10.1016/j.chemolab.2024.105200","url":null,"abstract":"<div><p>Spectroscopic measurements can show distorted spectral shapes arising from a mixture of absorbing and scattering contributions. These distortions (or baselines) often manifest themselves as non-constant offsets or low-frequency oscillations. As a result, these baselines can adversely affect analytical and quantitative results. Baseline correction is an umbrella term where one applies pre-processing methods to obtain baseline spectra (the unwanted distortions) and then remove the distortions by differencing. However, current state-of-the art baseline correction methods do not utilize analyte concentrations even if they are available, or even if they contribute significantly to the observed spectral variability. We modify a class of state-of-the-art methods (<em>penalized baseline correction</em>) that easily admit the incorporation of a priori analyte concentrations such that predictions can be enhanced. This modified approach will be deemed <em>supervised and penalized baseline correction</em> (SPBC). Performance will be assessed on two near infrared data sets across both classical penalized baseline correction methods (without analyte information) and modified penalized baseline correction methods (leveraging analyte information). There are cases of SPBC that provide useful baseline-corrected signals such that they outperform state-of-the-art penalized baseline correction algorithms such as AIRPLS. In particular, we observe that performance is conditional on the correlation between separate analytes: the analyte used for baseline correlation and the analyte used for prediction—the greater the correlation between the analyte used for baseline correlation and the analyte used for prediction, the better the prediction performance.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105200"},"PeriodicalIF":3.7,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142087043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Novel investigation on adsorption analysis of safranal interacting with boron nitride and aluminum nitride fullerene-like cages: Drug delivery system
Saad M Alshahrani
Pub Date: 2024-08-17 | DOI: 10.1016/j.chemolab.2024.105206 | Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105206
This study illustrates the effective control of COVID-19 infection through the adsorption of safranal (SAF) on B16N16 and Al16N16 fullerene-like cages. The adsorption of SAF onto the B16N16 and Al16N16 surfaces in gas, water (H2O), and chloroform (CHCl3) environments was assessed using density functional theory (DFT) and time-dependent (TD) DFT methods, analyzing the substrates and their complexes. The Al16N16/SAF complex exhibited the most negative binding energy and the highest structural stability in the water phase compared to the B16N16/SAF complex at the PBE0-D3 level. The thermodynamic parameters indicated that the adsorption of SAF onto the fullerene-like cages is exothermic, particularly for the Al16N16/SAF complex. Additionally, the interaction of SAF with the fullerene-like cages in the water phase is more pronounced than in the gas and chloroform environments. The energy gap (Eg) of the complexes decreases in all three environments compared to the pristine systems, with a significant reduction of over 21 % in all phases. This substantial decrease in the energy gap suggests that the complexes have increased reactivity and sensitivity to SAF, likely due to a significant change in electronic conductivity. The molecular docking results indicate that the Al16N16/SAF complex in the water phase exhibited a strong binding affinity compared to the other compounds studied. These findings suggest that the Al16N16/SAF complex holds promise as a potential inhibitor for COVID-19 and as a valuable material for biomedical applications and drug delivery systems.
{"title":"Novel investigation on adsorption analysis of safranal interacting with boron nitride and aluminum nitride fullerene-like cages: Drug delivery system","authors":"Saad M Alshahrani","doi":"10.1016/j.chemolab.2024.105206","DOIUrl":"10.1016/j.chemolab.2024.105206","url":null,"abstract":"<div><p>This study illustrates the effective control of COVID-19 infection through the adsorption of safranal (SAF) on B<sub>16</sub>N<sub>16</sub> and Al<sub>16</sub>N<sub>16</sub> fullerene-like cages. The SAF adsorption onto the B<sub>16</sub>N<sub>16</sub> and Al<sub>16</sub>N<sub>16</sub> surfaces in gas, water (H<sub>2</sub>O), and chloroform (CHCl<sub>3</sub>) environments were assessed using density functional theory (DFT) and time-dependent (TD) density functional theory methods, analyzing the substrates and their complexes. The Al<sub>16</sub>N<sub>16</sub>/SAF complex exhibited the most negative binding energy and structural stability in the water phase compared to the B<sub>16</sub>N<sub>16</sub>/SAF complex at the PBE0-D3 level. The thermodynamic parameters indicated that the adsorption of SAF onto the fullerene-like cages is exothermic, particularly for the Al<sub>16</sub>N<sub>16</sub>/SAF complex. Additionally, the interaction of SAF with the fullerene-like cages in the water phase is more pronounced than in gas and chloroform environments. The complexes' energy gap (Eg) decreases in all three environments compared to the perfect systems, with a significant reduction of over 21 % in all phases. This substantial decrease in the energy gap suggests that the complexes have increased reactivity and sensitivity to SAF, likely due to a significant change in electronic conductivity. The results of molecular docking indicate that the Al<sub>16</sub>N<sub>16</sub>/SAF complex in the water phase exhibited a strong binding affinity compared to the other compounds studied. These findings suggest that the Al<sub>16</sub>N<sub>16</sub>/SAF complex holds promise as a potential inhibitor for COVID-19 and as a valuable material for biomedical applications and drug delivery systems.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"254 ","pages":"Article 105206"},"PeriodicalIF":3.7,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142151153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Estimation of quality variables in a continuous train of reactors using recurrent neural networks-based soft sensors
Mariano M. Perdomo, Luis A. Clementi, Jorge R. Vega
Pub Date: 2024-08-14 | DOI: 10.1016/j.chemolab.2024.105204 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105204
The first stage in the industrial production of Styrene-Butadiene Rubber (SBR) typically consists of obtaining a latex from a train of continuous stirred tank reactors. Accurate real-time estimation of some key process variables is of paramount importance to ensure the production of high-quality rubber. Monitoring the mass conversion of monomers in the last reactor of the train is particularly important. To this end, various soft sensors (SS) have been proposed; however, they have not addressed the complex dynamic relationships existing among the process variables. In this work, an SS based on recurrent neural networks (RNN) is developed to estimate the mass conversion in the last reactor of the train. The main challenge is to obtain an adequate estimate of the conversion both during the usual steady-state operation and during the frequent transient operating phases. Three RNN architectures, Elman, GRU (Gated Recurrent Unit), and LSTM (Long Short-Term Memory), are compared to critically evaluate their performance. Moreover, a comprehensive analysis is conducted to assess the ability of these models to represent different operational modes of the train. The results reveal that the GRU network exhibits the best performance for estimating the mass conversion of monomers. The performance of the proposed model is then compared with a previously developed SS, which was based on a linear estimation model with a Bayesian bias adaptation mechanism and the use of control charts for decision-making. The model proposed here proved to be more efficient for estimating the mass conversion of monomers, particularly during transient operating phases. Finally, to evaluate the methodology used for designing the SS, the same RNN architectures were trained to estimate another quality variable online: the mass fraction of styrene bound to the copolymer. The results obtained were also acceptable.
{"title":"Estimation of quality variables in a continuous train of reactors using recurrent neural networks-based soft sensors","authors":"Mariano M. Perdomo , Luis A. Clementi , Jorge R. Vega","doi":"10.1016/j.chemolab.2024.105204","DOIUrl":"10.1016/j.chemolab.2024.105204","url":null,"abstract":"<div><p>The first stage in the industrial production of Styrene-Butadiene Rubber (SBR) typically consists in obtaining a latex from a train of continuous stirred tank reactors. Accurate real-time estimation of some key process variables is of paramount importance to ensure the production of high-quality rubber. Monitoring the mass conversion of monomers in the last reactor of the train is particularly important. To this effect, various soft sensors (SS) have been proposed, however they have not addressed the underlying complex dynamic relationships existing among the process variables. In this work, a SS based on recurrent neural networks (RNN) is developed to estimate the mass conversion in the last reactor of the train. The main challenge is to obtain an adequate estimate of the conversion both in its usual steady-state operation and during its frequent transient operating phases. Three architectures of RNN: Elman, GRU (Gated Recurrent Unit), and LSTM (Long Short-Term Memory) are compared to critically evaluate their performances. Moreover, a comprehensive analysis is conducted to assess the ability of these models to represent different operational modes of the train. The results reveal that the GRU network exhibits the best performance for estimating the mass conversion of monomers. Then, the performance of the proposed model is compared with a previously-developed SS, which was based on a linear estimation model with a Bayesian bias adaptation mechanism and the use of Control Charts for decision-making. The model proposed here proved to be more efficient for estimating the mass conversion of monomers, particularly during transient operating phases. Finally, to evaluate the methodology utilized for designing the SS, the same RNN architectures were trained to online estimate another quality variable: the mass fraction of Styrene bound to the copolymer. The obtained results were also acceptable.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105204"},"PeriodicalIF":3.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142039656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A 1D-CNN model for the early detection of citrus Huanglongbing disease in the sieve plate of phloem tissue using micro-FTIR
Biyun Yang, Zhiling Yang, Yong Xu, Wei Cheng, Fenglin Zhong, Dapeng Ye, Haiyong Weng
Pub Date: 2024-08-14 | DOI: 10.1016/j.chemolab.2024.105202 | Chemometrics and Intelligent Laboratory Systems, Volume 252, Article 105202
Among the most frequently diagnosed diseases in citrus, Huanglongbing (HLB) has caused severe economic losses to the citrus industry worldwide because there is no cure and the disease spreads quickly. As callose accumulation in the phloem is one of the early responses to infection by the Asian species Candidatus Liberibacter asiaticus (CLas), the dynamic perception of the sieve plate region can be used as an indicator for the early diagnosis of citrus HLB disease. In this study, one-dimensional convolutional neural network (1D-CNN) models were established to achieve early detection of HLB disease based on spectral information from the sieve plate region acquired with a Fourier transform infrared microscopy (micro-FTIR) spectrometer. Partial least squares regression (PLSR) and least squares support vector machine regression (LS-SVR) models are used for the prediction of callose based on the micro-FTIR information in the sieve plate region of the citrus midrib. Furthermore, an improved data augmentation method that superimposes Gaussian noise was proposed to expand the spectral data set. The proposed method achieved 98.65 % classification accuracy, higher than that of other traditional algorithms such as the logistic model tree (LMT), linear discriminant analysis (LDA), Bayes (BS), support vector machine (SVM) and k-nearest neighbors (kNN), and also higher than that of the qPCR (quantitative real-time polymerase chain reaction) molecular detection method. Finally, the early detection model established with laboratory samples can also be used to detect citrus HLB in complex field samples by means of model updating, and the overall detection accuracy of the model reached 91.21 %. Our approach has potential for the early diagnosis of citrus HLB disease at the microscopic scale, which would provide useful and precise guidelines to prevent and control the disease.
{"title":"A 1D-CNN model for the early detection of citrus Huanglongbing disease in the sieve plate of phloem tissue using micro-FTIR","authors":"Biyun Yang , Zhiling Yang , Yong Xu , Wei Cheng , Fenglin Zhong , Dapeng Ye , Haiyong Weng","doi":"10.1016/j.chemolab.2024.105202","DOIUrl":"10.1016/j.chemolab.2024.105202","url":null,"abstract":"<div><p>Among the most frequently diagnosed diseases in citrus, citrus Huanglongbing disease has caused severe economic losses to the citrus industry worldwide since there is no curable method and it spreads quickly. As callose accumulation in phloem is one of the early response events to Asian species <em>Candidatus</em> Liberibacter asiaticus (<em>C</em>Las) infection, the dynamic perception of the sieve plate region can be used as an indicator for the early diagnosis of citrus HLB disease. In this study, one-dimensional convolutional neural network (1D-CNN) models were established to achieve early detection of HLB disease based on spectral information in the sieve plate region using Fourier transform infrared microscopy (micro-FTIR) spectrometer. Partial least squares regression (PLSR) and the least squares support vector machine regression (LS-SVR) models are used for the prediction of callose based on the micro-FTIR information in the sieve plate region of the citrus midrib. Furthermore, an improved data augmentation method by superimposing Gaussian noise was proposed to expand the spectral amplitude. The proposed method has achieved 98.65 % classification accuracy, which was higher than that of other traditional algorithms such as the logistic model tree (LMT), linear discriminant analysis (LDA), Bayes (BS), support vector machine (SVM) and k-nearest neighbors (kNN), and also than that of the molecular detection qPCR (Quantitative real-time polymerase chain reaction) method. Finally, based on the established early detection model with laboratory samples, it can also be used to detect the citrus HLB in complex field samples by using model updating methods, and the overall detection accuracy of the model reached 91.21 %. Our approach has potential for the early diagnosis of citrus HLB disease from the microscopic scale, which would provide useful and precise guidelines to prevent and control citrus HLB disease.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252 ","pages":"Article 105202"},"PeriodicalIF":3.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Mixture Gaussian process model with Gaussian mixture distribution for big data
Yaonan Guan, Shaoying He, Shuangshuang Ren, Shuren Liu, Dewei Li
Pub Date: 2024-08-10 | DOI: 10.1016/j.chemolab.2024.105201 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105201
In the era of chemical big data, the high complexity and strong interdependencies present in the datasets pose considerable challenges when constructing accurate parametric models. The Gaussian process model, owing to its non-parametric nature, demonstrates better adaptability when confronted with complex and interdependent data. However, the standard Gaussian process has two significant limitations. Firstly, the time complexity of inverting its kernel matrix during the inference process is O(n^3). Secondly, all data share a common kernel function parameter, which mixes different data types and reduces the model accuracy in mixing-category data identification problems. In light of this, this paper proposes a mixture Gaussian process model that addresses these limitations. This model reduces time complexity and distinguishes data based on different data features. It incorporates a Gaussian mixture distribution for the inducing variables to approximate the original data distribution. Stochastic Variational Inference is utilized to reduce the computational time required for parameter inference. The inducing variables have distinct parameters for the kernel function based on the data category, leading to improved analytical accuracy and reduced time complexity of the Gaussian process model. Numerical experiments are conducted to analyze and compare the performance of the proposed model on different-sized datasets and various data category cases.
{"title":"Mixture Gaussian process model with Gaussian mixture distribution for big data","authors":"Yaonan Guan , Shaoying He , Shuangshuang Ren , Shuren Liu , Dewei Li","doi":"10.1016/j.chemolab.2024.105201","DOIUrl":"10.1016/j.chemolab.2024.105201","url":null,"abstract":"<div><p>In the era of chemical big data, the high complexity and strong interdependencies present in the datasets pose considerable challenges when constructing accurate parametric models. The Gaussian process model, owing to its non-parametric nature, demonstrates better adaptability when confronted with complex and interdependent data. However, the standard Gaussian process has two significant limitations. Firstly, the time complexity of inverting its kernel matrix during the inference process is <span><math><mrow><mi>O</mi><msup><mrow><mrow><mo>(</mo><mi>n</mi><mo>)</mo></mrow></mrow><mrow><mn>3</mn></mrow></msup></mrow></math></span>. Secondly, all data share a common kernel function parameter, which mixes different data types and reduces the model accuracy in mixing-category data identification problems. In light of this, this paper proposes a mixture Gaussian process model that addresses these limitations. This model reduces time complexity and distinguishes data based on different data features. It incorporates a Gaussian mixture distribution for the inducing variables to approximate the original data distribution. Stochastic Variational Inference is utilized to reduce the computational time required for parameter inference. The inducing variables have distinct parameters for the kernel function based on the data category, leading to improved analytical accuracy and reduced time complexity of the Gaussian process model. Numerical experiments are conducted to analyze and compare the performance of the proposed model on different-sized datasets and various data category cases.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105201"},"PeriodicalIF":3.7,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Online nonlinear data reconciliation to enhance nonlinear dynamic process monitoring using conditional dynamic variational autoencoder networks with particle filters
Kuanhsuan Chiu, Junghui Chen, Zhengjiang Zhang
Pub Date: 2024-08-10 | DOI: 10.1016/j.chemolab.2024.105198 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105198
In chemical plants, data-driven process monitoring serves as a vital tool to ensure product quality and maintain production line safety. However, the accuracy of monitoring hinges directly upon the quality of the process data. Given the inherently slow and complex nature of chemical processes, and the potential for gross errors in process data to produce inaccurate model predictions, this paper proposes a method called Conditional Dynamic Variational Autoencoder combined with a Particle Filter (CDVAE-PF) for data reconciliation and subsequent process monitoring. CDVAE-PF leverages the capabilities of the Conditional Dynamic Variational Autoencoder (CDVAE) to effectively model chemical process data in the presence of noise. This probabilistic model serves as the foundation for the Particle Filter (PF), which is employed for data reconciliation. Moreover, CDVAE-PF incorporates mechanisms to detect and rectify gross errors in process data, further enhancing its efficacy in data reconciliation. Monitoring indices based on the CDVAE are then established to facilitate process monitoring. Through numerical simulations of a two-to-one-variable Continuous Stirred Tank Reactor (CSTR) example and a fifteen-to-one-variable dichloroethane distillation process from an actual chemical plant, CDVAE-PF demonstrates its effectiveness by reducing the mean absolute error of gross-error data reconciliation to 7.8 % and 12.8 %, respectively. Moreover, in terms of monitoring performance, CDVAE-PF successfully mitigates misjudgments caused by gross errors, thereby significantly enhancing the reliability of process monitoring in chemical plants.
{"title":"Online nonlinear data reconciliation to enhance nonlinear dynamic process monitoring using conditional dynamic variational autoencoder networks with particle filters","authors":"Kuanhsuan Chiu , Junghui Chen , Zhengjiang Zhang","doi":"10.1016/j.chemolab.2024.105198","DOIUrl":"10.1016/j.chemolab.2024.105198","url":null,"abstract":"<div><p>In the chemical plants, data-driven process monitoring serves as a vital tool to ensure product quality and maintain production line safety. However, the accuracy of monitoring hinges directly upon the quality of process data. Given the inherently slow and complex nature of chemical processes, coupled with the potential for gross errors in process data leading to inaccuracies in model predictions, this paper proposes a method called Conditional Dynamic Variational Autoencoder combined with a Particle Filter (CDVAE-PF) for data reconciliation and subsequent process monitoring. CDVAE-PF leverages the capabilities of Conditional Dynamic Variational Autoencoder (CDVAE) to effectively model chemical process data in the presence of noise. This probabilistic model serves as the foundation for the Particle Filter (PF), which is employed for data reconciliation. Moreover, CDVAE-PF incorporates mechanisms to detect and rectify gross errors in process data, further enhancing its efficacy in data reconciliation. Subsequently, monitoring indices based on CDVAE are established to facilitate process monitoring. Through numerical simulations of a two-to-one variables Continuous Stirred Tank Reactor (CSTR) example and a fifteen-to-one variables dichloroethane distillation process from an actual chemical plant, CDVAE-PF demonstrates its effectiveness by reducing mean absolute error to 7.8 % and 12.8 % respectively in gross error data reconciliation. Moreover, in terms of monitoring performance, CDVAE-PF successfully mitigates misjudgments caused by gross errors, thereby significantly enhancing the reliability of process monitoring in chemical plants.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105198"},"PeriodicalIF":3.7,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142039660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

GA-XGBoost, an explainable AI technique, for analysis of thrombin inhibitory activity of diverse pool of molecules and supported by X-ray
Vijay H. Masand, Sami Al-Hussain, Abdullah Y. Alzahrani, Aamal A. Al-Mutairi, Arwa sultan Alqahtani, Abdul Samad, Gaurav S. Masand, Magdi E.A. Zaki
Pub Date: 2024-08-08 | DOI: 10.1016/j.chemolab.2024.105197 | Chemometrics and Intelligent Laboratory Systems, Volume 253, Article 105197
The present work applies extreme gradient boosting in combination with Shapley values, a thriving amalgamation within the field of explainable artificial intelligence, together with a genetic algorithm, to analyze the thrombin inhibitory activity of a diverse pool of 2803 molecules. The methodology uses a genetic algorithm for feature selection, followed by extreme gradient boosting analysis. The eight-parameter genetic algorithm-extreme gradient boosting model has high statistical acceptance, with R2tr = 0.895, R2L10%O = 0.900, and Q2F3 = 0.873. Shapley additive explanations, which assign each variable in a model an importance value, served as the foundation for the interpretation. A ceteris paribus approach involving the comparison of counterfactual examples was then used to understand the influence of a structural feature on the activity profile. The analysis indicates that aromatic carbon and ring/non-ring nitrogen, in combination with other structural features, govern the inhibitory profile. The simplicity and predictive ability of the genetic algorithm-extreme gradient boosting model suggest that "Explainable AI" will be useful for identifying and exploiting structural features in drug discovery.
{"title":"GA-XGBoost, an explainable AI technique, for analysis of thrombin inhibitory activity of diverse pool of molecules and supported by X-ray","authors":"Vijay H. Masand , Sami Al-Hussain , Abdullah Y. Alzahrani , Aamal A. Al-Mutairi , Arwa sultan Alqahtani , Abdul Samad , Gaurav S. Masand , Magdi E.A. Zaki","doi":"10.1016/j.chemolab.2024.105197","DOIUrl":"10.1016/j.chemolab.2024.105197","url":null,"abstract":"<div><p>The present work involves extreme gradient boosting in combination with shapley values, a thriving amalgamation under the terrain of Explainable artificial intelligence, along with genetic algorithm for the analysis of thrombin inhibitory activity of diverse pool of 2803 molecules. The methodology involves genetic algorithm for feature selection, followed by extreme gradient boosting analysis. The eight parametric genetic algorithm - extreme gradient boosting analysis has high statistical acceptance with R<sup>2</sup><sub>tr</sub> = 0.895, R<sup>2</sup><sub>L10%O</sub> = 0.900, and Q2F3 = 0.873. Shapley additive explanations, which provide each variable in a model an importance value, served as the foundation for the interpretation. Then, <em>ceteris paribus</em> approach involving comparison of counterfactual examples has been used to understand the influence of a structural feature on activity profile. The analysis indicates that aromatic carbon, ring/non-ring nitrogen in combination with other structural features govern the inhibitory profile. The genetic algorithm - extreme gradient boosting model's simplicity and predictions suggest that “Explainable AI” is useful in the future for identifying and using structural features in drug discovery.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"253 ","pages":"Article 105197"},"PeriodicalIF":3.7,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A novel feature selection framework for incomplete data
Cong Guo, Wei Yang, Zheng Li, Chun Liu
Pub Date: 2024-08-06 | DOI: 10.1016/j.chemolab.2024.105193 | Chemometrics and Intelligent Laboratory Systems, Volume 252, Article 105193
Feature selection on incomplete datasets is a challenging task. To address this challenge, existing methods first employ imputation to complete the dataset and then perform feature selection on the imputed data. Because missing-value imputation and feature selection are treated as entirely independent steps, the importance of features cannot be taken into account during imputation. However, in real-world scenarios and datasets, different features have varying degrees of importance. To this end, we propose a novel incomplete-data feature selection framework that considers feature importance. The framework consists of two alternating iterative stages: the M-stage and the W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved reliefF algorithm is employed to learn the feature importance vector from the imputed data. In particular, the feature importance output by the W-stage in the current iteration is used as the input to the M-stage in the next iteration. Experimental results on artificial and real incomplete datasets demonstrate that the proposed method significantly outperforms other approaches.
{"title":"A novel feature selection framework for incomplete data","authors":"Cong Guo, Wei Yang, Zheng Li, Chun Liu","doi":"10.1016/j.chemolab.2024.105193","DOIUrl":"10.1016/j.chemolab.2024.105193","url":null,"abstract":"<div><p>Feature selection on incomplete datasets is a challenging task. To address this challenge, existing methods first employ imputation methods to complete the dataset and then perform feature selection based on the imputed dataset. Since missing value imputation and feature selection are entirely independent, the importance of features cannot be considered during imputation. However, in real-world scenarios or datasets, different features have varying degrees of importance. To this end, we proposed a novel incomplete data feature selection framework that considers feature importance. The framework mainly consists of two alternating iterative stages: M-stage and W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved reliefF algorithm is employed to learn the feature importance vector based on the imputed data. In particular, the feature importance output by the W-stage in the current iteration will be used as the input of the M-stage in the next iteration. Experimental results on artificial and real missing datasets demonstrate that the proposed method outperforms other approaches significantly.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252 ","pages":"Article 105193"},"PeriodicalIF":3.7,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141930608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Structural attributes driving λmax towards NIR region: A QSPR approach
Payal Rani, Sandhya Chahal, Priyanka, Parvin Kumar, Devender Singh, Jayant Sindhu
Pub Date: 2024-08-06 | DOI: 10.1016/j.chemolab.2024.105199 | Chemometrics and Intelligent Laboratory Systems, Volume 252, Article 105199
Near-infrared materials find extensive applications in bio-sensing, photodynamic treatment, anti-counterfeiting and opto-electronics. Their progress has notably expanded possibilities in optical communication systems, non-invasive imaging and targeted therapy, benefiting fields such as material science, medicine, telecommunication and biology. In light of these advancements, the development of near-infrared region (NIR) based probes is highly desirable. Moreover, predicting the optical properties of a compound prior to its synthesis can diminish the need for expensive experimental testing. Considering the importance of such prior prediction, we herein present QSPR models for the prediction of absorption maxima using a dataset of 384 compounds. The aim of the present study is to identify molecular features that could shift the λmax of these compounds into the near-infrared region. The Monte Carlo optimization approach, along with the index of ideality of correlation (TF2), has been utilized in the CORAL 2019 software to develop models over ten splits. The predictability of the resulting ten models was assessed using various validation metrics. The model derived from the tenth split proved to be efficient, exhibiting R2validation = 0.8561, IIC = 0.7849 and Q2 = 0.8512. Good and bad fragments responsible for the change in absorption maxima (λmax) were also identified. The identified fragments were used to design ten new molecules to evaluate their reliability. It was observed that molecules designed using positive attributes shifted the absorption maxima towards the near-infrared region, specifically between 711 and 893 nm. This study opens up new possibilities for the advancement of NIR-based chromophores and will contribute significantly by reducing the overall cost of chromophore development.
{"title":"Structural attributes driving λmax towards NIR region: A QSPR approach","authors":"Payal Rani , Sandhya Chahal , Priyanka , Parvin Kumar , Devender Singh , Jayant Sindhu","doi":"10.1016/j.chemolab.2024.105199","DOIUrl":"10.1016/j.chemolab.2024.105199","url":null,"abstract":"<div><p>Near-infrared materials find extensive applications in <em>bio</em>-sensing, photodynamic treatment, anti-counterfeiting and <em>opto</em>-electronics. Their progress has notably expanded possibilities in optical communication systems, non-invasive imaging and targeted therapy, benefiting fields such as material science, medicine, tele-communication and biology. In light of these advancements, developments of near-infrared region (NIR) based probes are highly desirable. Moreover, the prediction of the optical properties of a compound prior to its synthesis can diminish the need for expensive experimental testing. Considering the importance of prior prediction, we herein present QSPR models for the prediction of absorption maxima using a dataset of 384 compounds. The aim of the present study is to identify molecular features that could shift their <span><math><mrow><msub><mi>λ</mi><mi>max</mi></msub></mrow></math></span> in the near-infrared region. The Monte Carlo Optimization approach along with the index of ideality of correlation (TF<sub>2</sub>) has been utilized using CORAL 2019 software for the development of ten splits. The predictability of the resulting ten models was assessed using various validation metrics. The model derived from the tenth split proved to be efficient, exhibiting <span><math><mrow><msubsup><mi>R</mi><mrow><mi>V</mi><mi>a</mi><mi>l</mi><mi>i</mi><mi>d</mi><mi>a</mi><mi>t</mi><mi>i</mi><mi>o</mi><mi>n</mi></mrow><mn>2</mn></msubsup><mo>=</mo><mn>0.8561</mn></mrow></math></span>, <span><math><mrow><mi>I</mi><mi>I</mi><mi>C</mi><mo>=</mo><mn>0.7849</mn><mspace></mspace><mi>a</mi><mi>n</mi><mi>d</mi><mspace></mspace><msup><mi>Q</mi><mn>2</mn></msup><mo>=</mo><mn>0.8512</mn></mrow></math></span>. Good and bad fragments were also identified that are responsible for the change in absorption maxima (<span><math><mrow><msub><mi>λ</mi><mi>max</mi></msub></mrow></math></span>). Identified fragments were utilized for designing ten new molecules to evaluate their reliability. It was observed that molecules designed using positive attributes shifted the absorption maxima towards the near-infrared region, specifically between 711 and 893 nm. This study opens up new possibilities for the advancement of NIR-based chromophores and will contribute significantly by reducing the overall cost of chromophore development.</p></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252 ","pages":"Article 105199"},"PeriodicalIF":3.7,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141985556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}