
Journal of Chemometrics: Latest Articles

Characterising the Effect of Cultivar and Roasting Temperature on FT-NIR Spectral Data of Wheat Using ASCA
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2026-01-15 DOI: 10.1002/cem.70096
Mia van Niekerk, Federico Marini, Stefan Hayward, Marena Manley

The physicochemical and functional properties of wheat can be modified by exposing the whole grains to thermal pretreatment. Conventional analytical methods used to investigate such modifications are expensive, labour-intensive and may be inaccurate due to interfering compounds. An economical alternative is to use Fourier transform near-infrared (FT-NIR) spectroscopy in combination with multivariate data analysis techniques. Analysis of variance simultaneous component analysis (ASCA) is an exploratory data analysis technique used to characterise the effects of experimental design factors on the chemical composition captured in the FT-NIR spectral data. In this study, two hard wheat cultivars were exposed to 10 different temperatures by means of forced convection continuous tumble roasting. ASCA was applied to the standard normal variate preprocessed spectral data to evaluate the effects of cultivar, roasting temperature and their interaction. All three factors, cultivar, roasting temperature and their interaction, had a significant effect (p < 0.05) on the spectral data. Differences between the roasted wheat cultivars were associated with moisture, starch and aromatic compounds. The association with aromatic structures was supported by the differences in the phenolic contents of the two cultivars. Low roasting temperatures (108°C–150°C) were associated with starch and moisture changes particularly at approximately 1450, 1410 and 1940 nm. Water evaporated from the kernels, and the degree of starch polymerisation decreased. High roasting temperatures (170°C–232°C) were associated with starch and amino acids (ca. 2100 and 2294 nm), which likely underwent structural changes and participated in nonenzymatic browning reactions.
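
The abstract names two steps, standard normal variate (SNV) preprocessing and an ASCA decomposition into cultivar, temperature and interaction effects. A minimal Python sketch of those steps follows. It assumes a balanced two-factor design, and the function names (`snv`, `level_means`, `asca_two_factor`) are illustrative rather than taken from the paper. The permutation test behind the reported p-values is not shown.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def level_means(x, labels):
    """Replace every row by the mean of all rows that share its label."""
    out = np.zeros_like(x, dtype=float)
    for lab in np.unique(labels):
        rows = labels == lab
        out[rows] = x[rows].mean(axis=0)
    return out


def asca_two_factor(x, factor_a, factor_b):
    """Split column-centred data into A, B, A x B and residual effect matrices.

    Assumes a balanced design, so that the effect matrices are orthogonal
    and their sums of squares add up to the total.
    """
    factor_a = np.asarray(factor_a)
    factor_b = np.asarray(factor_b)
    xc = x - x.mean(axis=0)
    x_a = level_means(xc, factor_a)
    x_b = level_means(xc, factor_b)
    cells = np.array([f"{a}|{b}" for a, b in zip(factor_a, factor_b)])
    x_ab = level_means(xc, cells) - x_a - x_b
    residual = xc - x_a - x_b - x_ab
    effects = {"A": x_a, "B": x_b, "AB": x_ab, "residual": residual}
    total_ss = np.sum(xc ** 2)
    explained = {name: np.sum(m ** 2) / total_ss for name, m in effects.items()}
    # Simultaneous component analysis step: PCA (via SVD) of each effect matrix.
    loadings = {
        name: np.linalg.svd(effects[name], full_matrices=False)[2]
        for name in ("A", "B", "AB")
    }
    return effects, explained, loadings
```

In this sketch `explained` gives the fraction of total variation attributed to each effect, and the rows of `loadings["A"]` indicate which wavelengths drive the differences between levels of the first factor.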

Citations: 0
Field Strength Distribution-Based Sample Selection Method
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2026-01-08 DOI: 10.1002/cem.70094
Zhonghai He, Jialong Sun, Yi Zhang, Xiaofang Zhang

In the process of spectral modeling, the representativeness of samples to the overall space determines the modeling efficiency. A commonly used unsupervised sample selection method is based on the maximum–minimum distance of selected samples. However, this approach selects samples based solely on a single distance metric, which introduces a degree of randomness. Inspired by the spatial field strength distribution law of multiple point charges, we propose a novel sample selection method based on field strength. In this method, each sample point is treated as a point charge that generates an electric field in its vicinity. The field strength at any given position is the sum of the contributions from all point charges at that location, with a higher field strength indicating that the point is already well represented. By calculating the total field strength exerted by each selected sample on the candidate points and incorporating the point with the minimum field strength into the calibration set, the method maximizes the coverage of field strength in the calibration space. Sequentially selecting and adding points with the lowest field strength yields a highly representative sample set. This approach enables efficient and unsupervised selection of modeling samples.
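
The selection loop described above can be sketched in a few lines of Python. The abstract does not state the exact field law or the starting point, so this sketch makes two assumptions: each selected sample contributes a scalar inverse-square field `1 / d^2`, and the first sample is the one nearest the data centroid. The function name `field_strength_selection` is illustrative.

```python
import numpy as np


def field_strength_selection(x, n_select, eps=1e-12):
    """Greedy selection of the candidate with the lowest accumulated field.

    x        : (n_samples, n_features) array of spectra or scores
    n_select : number of calibration samples to pick (<= n_samples)
    """
    n_samples = x.shape[0]
    # Assumption: seed the set with the sample closest to the centroid.
    first = int(np.argmin(np.linalg.norm(x - x.mean(axis=0), axis=1)))
    selected = [first]
    field = np.zeros(n_samples)
    for _ in range(n_select - 1):
        newest = selected[-1]
        dist_sq = np.sum((x - x[newest]) ** 2, axis=1)
        # Assumption: inverse-square contribution of the newest "charge".
        field += 1.0 / (dist_sq + eps)
        field[selected] = np.inf  # already selected points cannot be re-picked
        selected.append(int(np.argmin(field)))
    return selected
```

Because the field is accumulated over all selected samples, a candidate close to several selected points is penalised more than one that is merely close to a single point, which is the difference from a pure max-min distance rule.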

Citations: 0
Fractional Kinetic Modelling of the Adsorption and Desorption Processes From Experimental SPR Curves
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-30 DOI: 10.1002/cem.70099
Higor V. M. Ferreira, Nelson H. T. Lemes, Yara L. Coelho, Luciano S. Virtuoso, Ana C. dos Santos Pires, Luis H. M. da Silva

The application of surface plasmon resonance (SPR) has transformed the study of interactions between a ligand immobilized on the surface of a sensor chip ($L_S$) and an analyte in solution ($A$). This technique enables the real-time monitoring of binding processes with high sensitivity. The adsorption–desorption dynamics, $A + L_S \to AL_S$, are commonly described by a set of coupled integer-order differential equations. However, such formulations exhibit limited ability to account for temperature distributions, diffusion, and transport effects involved in the reaction process. Fractional kinetic models provide a natural framework for incorporating nonlocal and memory effects into the description of complex reaction dynamics. In this study, a fractional-order kinetic model based on the Caputo derivative is applied to analyze experimental SPR data for the interaction between immobilized Baru protein (IBP) and Congo Red dye (CR), at concentrations ranging from 7.5 to 97.5 μM, pH 7.4, and 16°C. The dependence of the kinetic parameters on the model order is systematically investigated, and it is shown that the classical integer-order formulation fails to adequately reproduce the experimental sensorgrams. The results demonstrate that the fractional-order model captures the intrinsic complexity of the adsorption–desorption processes observed in SPR experiments, significantly improving the representation of the experimental data.

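
To illustrate what a Caputo-type fractional kinetic model looks like numerically, here is a minimal Python sketch. The abstract does not give the model equations, so the sketch assumes the standard 1:1 binding rate law, $D^{\alpha} R = k_a C (R_{max} - R) - k_d R$, and solves it with an explicit Grünwald–Letnikov scheme. The function names and all parameter values are hypothetical. For $\alpha = 1$ the scheme reduces to the ordinary forward Euler method for the integer-order model.

```python
import numpy as np


def gl_weights(alpha, n):
    """Grunwald-Letnikov weights w_j = (-1)^j * binom(alpha, j), j = 0..n."""
    w = np.empty(n + 1)
    w[0] = 1.0
    for j in range(1, n + 1):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w


def simulate_binding(alpha, ka, kd, conc, r_max, t_end, n_steps, r0=0.0):
    """Explicit scheme for D^alpha R = ka*conc*(r_max - R) - kd*R (Caputo sense).

    Assumed rate law (1:1 binding); in the fractional case ka and kd carry
    units of time^(-alpha) rather than time^(-1).
    """
    h = t_end / n_steps
    w = gl_weights(alpha, n_steps)
    r = np.empty(n_steps + 1)
    r[0] = r0
    for n in range(1, n_steps + 1):
        rate = ka * conc * (r_max - r[n - 1]) - kd * r[n - 1]
        # Memory term: sum_{j=1}^{n} w_j * (R_{n-j} - R_0), the whole history.
        memory = np.dot(w[1:n + 1], r[n - 1::-1] - r0)
        r[n] = r0 + h ** alpha * rate - memory
    t = np.linspace(0.0, t_end, n_steps + 1)
    return t, r
```

The memory term is what distinguishes the fractional model: every new point depends on the full history of the response, not only on the previous step. Fitting `alpha`, `ka` and `kd` to a measured sensorgram would wrap this simulator in a least-squares routine, which is outside the scope of this sketch.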
Citations: 0
Stacked Target-Related Autoencoder-Extreme Learning Machine: A Novel Soft Measurement Modeling Approach for Near-Infrared Spectroscopy
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-22 DOI: 10.1002/cem.70095
Shun Li, Fangkun Zhang, Shuobo Chen, Baoming Shan, Qilei Xu

This paper proposes a novel quantitative modeling and prediction approach for near-infrared (NIR) spectroscopy, combining a stacked target-related autoencoder with an extreme learning machine (STAE-ELM). The STAE performs hierarchical pre-training using multiple improved target-related autoencoders (TAEs) to extract deep spectral features highly correlated with target values. Crucially, the top-level structure of the STAE is replaced by the ELM, which serves as the final prediction model. This integration streamlines training by reducing parameters and steps while simultaneously enhancing performance through optimized initialization of the ELM's weights and biases. Compared to conventional feature selection methods and stacked autoencoders, the STAE-ELM extracts more comprehensive and target-relevant deep features from spectral data, mitigating overfitting risks. The method's efficacy was validated on five open NIR datasets, benchmarking against three approaches: feature selection modeling, SAE-based feature extraction modeling, and backpropagation-based deep network modeling. Results demonstrate that calibration models built with STAE-ELM achieved average reductions in RMSEP of 18.48%, 5.74%, and 12.14%, respectively, compared to these benchmarks. Furthermore, modeling efficiency was significantly improved over the backpropagation-based deep network approach.
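
For readers unfamiliar with the extreme learning machine (ELM) that forms the top layer here, a minimal Python sketch is given below. It is a plain ELM, not the STAE-ELM of the paper: hidden weights are random unless supplied, and the output weights come from a ridge-regularised least-squares solve. The optional `weights` and `biases` arguments are a hypothetical hook showing where pre-trained values (such as those from an autoencoder) could be plugged in; the class and function names are illustrative.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error (RMSEP when computed on a prediction set)."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


class ELMRegressor:
    """Single-hidden-layer ELM: fixed hidden layer, closed-form output weights."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0, weights=None, biases=None):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.seed = seed
        self.w = weights  # optional pre-trained input weights (assumption)
        self.b = biases   # optional pre-trained hidden biases (assumption)
        self.beta = None

    def _hidden(self, x):
        return np.tanh(x @ self.w + self.b)

    def fit(self, x, y):
        if self.w is None or self.b is None:
            rng = np.random.default_rng(self.seed)
            self.w = rng.normal(size=(x.shape[1], self.n_hidden))
            self.b = rng.normal(size=self.n_hidden)
        h = self._hidden(x)
        # Only the output weights are trained, via regularised least squares.
        gram = h.T @ h + self.ridge * np.eye(h.shape[1])
        self.beta = np.linalg.solve(gram, h.T @ y)
        return self

    def predict(self, x):
        return self._hidden(x) @ self.beta
```

Because no backpropagation is involved, training reduces to one linear solve, which is consistent with the efficiency gain over backpropagation-based networks that the abstract reports.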

Citations: 0
Application of an ECAM-ConvNeXt Model With Multichannel Spectrogram Based on Vis–NIR for Soil Property Prediction
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-10 DOI: 10.1002/cem.70092
Qinghao Shuai, Zhengguang Chen, Shuo Liu, Quan Wang

Vis–NIR spectroscopy is increasingly widely used for soil property analysis due to its rapid, cost-effective, and nondestructive advantages. In particular, deep learning models perform very well when working with large sample data. In this study, we propose a deep learning model based on three-channel ECAM-ConvNeXt. Firstly, the method applies three window functions, Bartlett, Gaussian, and Blackman, in the short-time Fourier transform to convert a one-dimensional spectral sequence signal into three different two-dimensional spectrograms. Next, we perform multichannel feature fusion and use the resulting triple-channel spectrograms as model inputs. This method fully preserves the temporal information and spectral characteristics of the spectral sequence, thereby improving the performance of the model. Secondly, this study introduces the Efficient Channel Attention Module in the ConvNeXt model. This module combines the advantages of the Convolutional Block Attention Module and Efficient Channel Attention Network, further enhancing the expressive ability of the network by highlighting useful information and suppressing irrelevant information. Finally, we also validate the effectiveness of multichannel inputs by deep learning models (AlexNet18, ResNet50, MobileNet-V3, EfficientNet, VIT) and compare them with existing techniques reported in the literature. The results indicate that the root-mean-square error (RMSE) of the TriCH-ECAM-ConvNeXt model in predicting soil nitrogen content (N (g/kg)), organic carbon content (OC (g/kg)), cation exchange capacity (CEC (cmol(+)/kg)), pH, clay content (%), and sand content (%) was reduced to 0.9847, 19.7347, 6.3380, 0.3812, 5.1537, and 12.9706, respectively, and the coefficient of determination (R²) increased to 0.9307, 0.9544, 0.7999, 0.9206, 0.8493, and 0.7526, respectively.
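
The first step, turning one spectrum into a three-channel spectrogram with Bartlett, Gaussian and Blackman windows, can be sketched with NumPy alone. The window length, hop size and Gaussian width below are assumptions, since the abstract does not report them, and the function names are illustrative.

```python
import numpy as np


def gaussian_window(n, sigma):
    """Gaussian window of length n with standard deviation sigma (in samples)."""
    k = np.arange(n) - (n - 1) / 2.0
    return np.exp(-0.5 * (k / sigma) ** 2)


def stft_magnitude(signal, window, hop):
    """Magnitude of a short-time Fourier transform, shape (n_freq, n_frames)."""
    n = len(window)
    starts = range(0, len(signal) - n + 1, hop)
    frames = np.stack([signal[s:s + n] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T


def three_channel_spectrogram(spectrum, n_window=64, hop=16):
    """Stack Bartlett, Gaussian and Blackman spectrograms as image channels."""
    windows = [
        np.bartlett(n_window),
        gaussian_window(n_window, n_window / 6.0),  # assumed width
        np.blackman(n_window),
    ]
    return np.stack([stft_magnitude(spectrum, w, hop) for w in windows])
```

The result has shape `(3, n_window // 2 + 1, n_frames)`, which matches the channel-first image layout that convolutional networks such as ConvNeXt usually expect. Each channel trades frequency resolution against leakage differently, which is presumably why the three views are fused.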

Citations: 0
Rapid Multi-Indicator Quality Evaluation of Ginger Using Genetic Algorithm and Near-Infrared Spectroscopy
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-10 DOI: 10.1002/cem.70090
Tianshu Wang, Chengwu Chen, Hui Yan, Kongfa Hu, Xichen Yang, Xia Zhang, Guisheng Zhou, Jinao Duan

To achieve rapid and comprehensive evaluation of the quality of ginger, a rapid multi-indicator quality evaluation method based on a genetic algorithm and near-infrared spectroscopy is proposed to detect the content of multiple compounds (6-gingerol, 8-gingerol, 10-gingerol, 6-shogaol, and zingerone). First, the near-infrared spectra of ginger samples are collected. Then, the spectra are preprocessed to reduce noise. Next, features of the spectra are extracted through the genetic algorithm, for which the population initialization and fitness function methods are designed. Finally, the prediction model is generated through regression. Experimental results demonstrate that the proposed method achieves higher R values (0.9052, 0.9107, 0.9269, 0.9843, 0.9030) compared to the traditional PLSR model (0.6666, 0.51, 0.4358, 0.9248, 0.4846) for 6-gingerol, 8-gingerol, 10-gingerol, 6-shogaol, and zingerone, respectively. Therefore, the proposed method can reduce prediction errors and improve the performance of the near-infrared spectroscopy quantitative analysis model for ginger.
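
The genetic-algorithm feature-extraction step can be illustrated with a generic wavelength-selection loop in Python. This is not the population initialization or fitness function designed in the paper, which the abstract does not describe. It assumes random binary masks, validation RMSE from ordinary least squares as the fitness (a stand-in for the regression model), single-point crossover, bit-flip mutation and one-individual elitism. All names are illustrative.

```python
import numpy as np


def rmse_for_mask(mask, x_tr, y_tr, x_va, y_va):
    """Validation RMSE of a least-squares model using only the masked columns."""
    if not mask.any():
        return np.inf
    a_tr = np.column_stack([x_tr[:, mask], np.ones(len(x_tr))])
    a_va = np.column_stack([x_va[:, mask], np.ones(len(x_va))])
    coef, *_ = np.linalg.lstsq(a_tr, y_tr, rcond=None)
    return float(np.sqrt(np.mean((a_va @ coef - y_va) ** 2)))


def ga_select(x_tr, y_tr, x_va, y_va, pop_size=30, n_gen=40, p_mut=0.02, seed=0):
    """Return the best wavelength mask and the best fitness per generation."""
    rng = np.random.default_rng(seed)
    n_var = x_tr.shape[1]
    pop = rng.random((pop_size, n_var)) < 0.2  # assumed random initialisation
    history = []
    for _ in range(n_gen):
        fit = np.array([rmse_for_mask(m, x_tr, y_tr, x_va, y_va) for m in pop])
        pop = pop[np.argsort(fit)]
        history.append(float(fit.min()))
        parents = pop[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - 1:
            pa, pb = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_var)
            child = np.concatenate([pa[:cut], pb[cut:]])  # single-point crossover
            child ^= rng.random(n_var) < p_mut            # bit-flip mutation
            children.append(child)
        pop = np.vstack([pop[:1], np.array(children)])    # keep the elite
    fit = np.array([rmse_for_mask(m, x_tr, y_tr, x_va, y_va) for m in pop])
    best = int(np.argmin(fit))
    history.append(float(fit[best]))
    return pop[best], history
```

Because the best individual is always carried over, the recorded best fitness can never get worse from one generation to the next. In practice one such search would be run per target compound, and the selected wavelengths passed to the final regression model.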

Citations: 0
Estimating Maize Canopy Nitrogen and Chlorophyll Content Using CNN-GRU-CBAM and Hyperspectral Imagery
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-04 DOI: 10.1002/cem.70093
Haoquan Kong, Li Tian, Shujuan Yi, Yuhui Jia, Weiwei Guo, Hanlin Xu, Yongzhi Liu

Rapid, noninvasive quantification of canopy nitrogen (N) and chlorophyll (Chl) content is critical for precision nitrogen management in maize cultivation. Although near-infrared spectroscopy (NIRS) offers a viable approach for biochemical component analysis, conventional machine learning models often fail to capture the complex nonlinear relationships inherent in spectral data and lack interpretability, limiting their robustness for real-time inversion tasks. To address these limitations, this study introduces a hybrid deep learning architecture combining convolutional neural networks (CNNs) and gated recurrent units (GRUs), augmented by a convolutional block attention module (CBAM), integrated with explainable artificial intelligence for accurate biochemical content inversion. Preprocessing of hyperspectral images from 200 maize canopy samples via sequential Savitzky–Golay smoothing (SG), standard normal variate (SG-SNV), and SG transformations enhanced mean test set R² by 0.016 units. Subsequent dimensionality reduction via the successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) significantly reduced spectral features from 176 to 10 and 22 bands, respectively. The core predictive model synergistically combines CNNs and GRUs, augmented by a CBAM to enhance feature extraction and temporal dependency modeling. Comparative evaluation demonstrates the superior performance of CNN-GRU-CBAM over traditional machine learning and alternative deep learning models. For the test set, it achieved R² values of 0.934 (N) and 0.788 (Chl), with corresponding root mean square error (RMSE) values of 1.940 and 0.216. Model interpretability was rigorously validated using Shapley Additive Explanations (SHAP), identifying key spectral regions driving predictions. This work innovatively bridges high-performance deep learning with explainable artificial intelligence, enabling precise, nondestructive estimation of maize foliar biochemical constituents. The framework provides a transferable approach for biochemical content inversion in diverse crops.
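
The preprocessing chain mentioned above, Savitzky–Golay smoothing followed by standard normal variate scaling, can be sketched in NumPy as follows. The window length and polynomial order are assumptions because the abstract does not report them, edge samples are handled by simple edge padding, and the function names are illustrative.

```python
import numpy as np


def savgol_coefficients(window, order):
    """Smoothing coefficients from a local least-squares polynomial fit."""
    half = window // 2
    k = np.arange(-half, half + 1, dtype=float)
    design = np.vander(k, order + 1, increasing=True)  # columns k^0, k^1, ...
    # Row 0 of the pseudo-inverse gives the fitted value at the window centre.
    return np.linalg.pinv(design)[0]


def savgol_smooth(spectra, window=11, order=2):
    """Apply Savitzky-Golay smoothing along each row (window must be odd)."""
    coeffs = savgol_coefficients(window, order)
    half = window // 2
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.stack(
        [np.convolve(row, coeffs[::-1], mode="valid") for row in padded]
    )


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def sg_then_snv(spectra, window=11, order=2):
    """Sequential SG smoothing followed by SNV, as one preprocessing step."""
    return snv(savgol_smooth(spectra, window=window, order=order))
```

Away from the edges, a Savitzky–Golay filter of order 2 leaves any quadratic trend unchanged while averaging out noise, which is why it is a common first step before scatter corrections such as SNV.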

Citations: 0
Quantitative Detection of Trace Characteristic VOCs of Liver Metabolic Diseases Based on the MEMS Sensor Array
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-03 DOI: 10.1002/cem.70089
Cheng Zhang, Yao Tian, Ze Zhang, Lingmin Yu, Hairong Wang

Human exhaled breath contains a variety of volatile organic compounds (VOCs), among which isoprene, ethanol, and formaldehyde can be used as biomarkers for liver metabolic diseases. To detect these trace-concentration VOCs accurately, a sensor array was built from four MEMS gas sensors, one of which is a self-developed sensor with a very high response to isoprene. To improve the prediction accuracy of gas concentration, we investigated a convolutional neural network with a Multi-Expert Temporal Fusion Network (METF-Net) model based on multitask learning. With the MEMS sensor array, isoprene, ethanol, and formaldehyde at the sub-ppm level can be correctly identified; the RMSEs for isoprene, ethanol, and formaldehyde are 33.48, 64.01, and 18.84 ppb, and the error rates of the predicted concentrations are 6.70%, 6.40%, and 9.42%, respectively. This method has potential for application in the early-stage screening of liver metabolic diseases.
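
The two figures of merit quoted in the abstract above can be sketched as follows. The abstract does not define its "error rate", so the mean relative error used here is an assumption, and the concentrations in the example are hypothetical values, not data from the study.

```python
import math


def rmse(true_ppb, pred_ppb):
    """Root mean square error, in the same unit as the inputs (here ppb)."""
    squared = [(t - p) ** 2 for t, p in zip(true_ppb, pred_ppb)]
    return math.sqrt(sum(squared) / len(squared))


def mean_relative_error_percent(true_ppb, pred_ppb):
    """Mean of |prediction - truth| / truth, as a percentage (assumed definition)."""
    ratios = [abs(p - t) / t for t, p in zip(true_ppb, pred_ppb)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical reference and predicted concentrations in ppb.
reference = [100.0, 200.0]
predicted = [110.0, 190.0]
print(rmse(reference, predicted))
print(mean_relative_error_percent(reference, predicted))
```

RMSE is an absolute error in ppb, while a relative error depends on the concentration level, which is why the two numbers can rank the three gases differently.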

Citations: 0
Construction and Improvement of a Model for Quantifying Blood Glucose Concentration Using Mid-Infrared Spectroscopy
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-02 DOI: 10.1002/cem.70091
Yuta Takami, Keita Miyagawa, Yuki Tsuda, Koichi Akiyama, Yuji Matsuura, Hiromasa Kaneko

Blood glucose concentration is measured using invasive methods such as venous blood sampling and finger-prick blood testing using self-monitoring blood glucose meters with subcutaneous sensors. For daily use, noninvasive blood glucose measurement methods need to be developed. In this study, we constructed a model to estimate blood glucose concentrations noninvasively from mid-infrared absorption spectra measured using photothermal deflectometry enhanced by total internal reflection. We improved the estimation accuracy of the model using Savitzky–Golay preprocessing and the Boruta variable selection method. In addition, the model was corrected using subject data from the first day of measurements to further improve estimation accuracy.

Citations: 0
Partial Least Squares
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-01 DOI: 10.1002/cem.70069
Richard G. Brereton

In the previous article, we discussed the enormous increase in the impact of chemometrics methods over the last four decades [1] and the important role PLS (partial least squares or projection to latent structures) has had in this revolution. However, we have yet to describe this technique, which will be the subject of this and subsequent articles.

There have been many thousands, or perhaps tens of thousands, of theoretical, methodological and tutorial articles about PLS over the last 50 years, and possibly many hundreds of thousands of articles involving the use of this approach. In the early decades of the development of chemometrics as a coherent discipline in the 1980s and 1990s, there was a significant focus on PLS, and even after so many decades, it still spawns new insights. There are conferences dedicated to PLS. This article is therefore only one of very many such articles, but PLS can be approached in endless ways, and no general introduction to chemometrics is complete without describing this method.

PLS was first proposed in the 1960s by Herman Wold [2, 3]. The method was slowly introduced to chemometrics with a significant expansion in interest in the 1980s. Svante Wold first publicised its applicability in the 1970s and 1980s [4, 5]. Early pioneers of the 1980s include Paul Geladi [6], Harald Martens and Tormod Naes [7], who wrote classical articles/books that to this day are still viewed as essential reading. During the 1980s, there were numerous conferences, software developments and courses on PLS. This development was important not only in chemistry but also in economics and social sciences.

The original PLS algorithm, called PLS1, was enhanced during this period, most notably by PLS2 but also by many other developments, which continue to this day. New theoretical articles on the properties of PLS continue to appear, and the topic remains an active area of research.

As originally described, PLS was used for quantitative regression or calibration, sometimes distinguished by the terminology PLSR (PLS regression), where the c block was a continuous variable, such as a concentration, reaction rate or activity. Most of the early applications in chemistry were, for example, in NIR spectroscopy, where the aim was to calibrate the spectra to the concentration of an analyte or a class of compounds on a continuous scale.

However, over the past few years, PLSDA (PLS discriminant analysis) [12] has become an important technique used for multivariate classification. In this case, the c block is discrete, representing a numerical label or a classifier. Typically, if there are two groups, c = +1 for group A, and c = −1 for group B. For multiple groups, there are several modifications [13] available.

In subsequent articles, we will look at the properties of the matrices obtained using the PLS1 algorithm and how they fundamentally differ from those of PCA, even though some of them share the same names.

The author declares no conflicts of interest.

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.
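
The article text names the PLS1 algorithm and the ±1 coding of the c block for two-group PLSDA but does not list the steps. A minimal sketch of one common NIPALS-style formulation of PLS1 is given here under stated assumptions: mean-centred X and c, unit-length weight vectors, and a sign threshold at zero for classification. The function name `pls1_coefficients` and the synthetic data are hypothetical, and the formulation may differ in scaling conventions from the one the article series goes on to describe.

```python
import numpy as np


def pls1_coefficients(X: np.ndarray, c: np.ndarray, n_components: int) -> np.ndarray:
    """Return b such that (c - mean(c)) is approximated by (X - mean(X)) @ b."""
    X = X - X.mean(axis=0)          # centre the X block (creates a copy)
    c = c - c.mean()                # centre the c block (creates a copy)
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ c                 # weights: covariance of each variable with c
        w /= np.linalg.norm(w)
        t = X @ w                   # scores
        tt = t @ t
        p = X.T @ t / tt            # X loadings
        q_a = c @ t / tt            # c loading (a scalar in PLS1)
        X = X - np.outer(t, p)      # deflate X
        c = c - q_a * t             # deflate c
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


# Two-group PLSDA on synthetic data: c = +1 for group A, c = -1 for group B.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X[:10, 0] += 6.0                    # group A is shifted up in variable 0
X[10:, 0] -= 6.0                    # group B is shifted down in variable 0
labels = np.array([1.0] * 10 + [-1.0] * 10)

b = pls1_coefficients(X, labels, n_components=1)
c_hat = (X - X.mean(axis=0)) @ b + labels.mean()
predicted = np.where(c_hat >= 0.0, 1.0, -1.0)
accuracy = float((predicted == labels).mean())
print(accuracy)
```

With as many components as X has independent columns, this formulation reproduces the ordinary least-squares solution on the centred data; using fewer components is what gives PLS its regularising effect on collinear spectra.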
Citations: 0