
Journal of Chemometrics: Latest Articles

Estimating Maize Canopy Nitrogen and Chlorophyll Content Using CNN-GRU-CBAM and Hyperspectral Imagery
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-04 DOI: 10.1002/cem.70093
Haoquan Kong, Li Tian, Shujuan Yi, Yuhui Jia, Weiwei Guo, Hanlin Xu, Yongzhi Liu

Rapid, noninvasive quantification of canopy nitrogen (N) and chlorophyll (Chl) content is critical for precision nitrogen management in maize cultivation. Although near-infrared spectroscopy (NIRS) offers a viable approach for biochemical component analysis, conventional machine learning models often fail to capture the complex nonlinear relationships inherent in spectral data and lack interpretability, limiting their robustness for real-time inversion tasks. To address these limitations, this study introduces a hybrid deep learning architecture combining convolutional neural networks (CNNs) and gated recurrent units (GRUs), augmented by a convolutional block attention module (CBAM) and integrated with explainable artificial intelligence for accurate biochemical content inversion. Preprocessing of hyperspectral images from 200 maize canopy samples via sequential Savitzky–Golay smoothing (SG), standard normal variate (SG-SNV), and SG transformations enhanced mean test set R² by 0.016 units. Subsequent dimensionality reduction via the successive projection algorithm (SPA) and competitive adaptive reweighted sampling (CARS) reduced the spectral features from 176 bands to 10 and 22 bands, respectively. The core predictive model combines CNNs and GRUs, augmented by a CBAM to enhance feature extraction and temporal dependency modeling. Comparative evaluation demonstrates the superior performance of CNN-GRU-CBAM over traditional machine learning and alternative deep learning models. For the test set, it achieved R² values of 0.934 (N) and 0.788 (Chl), with corresponding root mean square error (RMSE) values of 1.940 and 0.216. Model interpretability was validated using Shapley Additive Explanations (SHAP), identifying key spectral regions driving predictions. This work bridges high-performance deep learning with explainable artificial intelligence, enabling precise, nondestructive estimation of maize foliar biochemical constituents. The framework provides a transferable approach for biochemical content inversion in diverse crops.
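
The standard normal variate (SNV) step named in the abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function name `snv` and the toy reflectance values are assumptions, and the sample standard deviation (n − 1 denominator) is only one common convention.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation (n - 1 denominator)."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    return [(value - mu) / sd for value in spectrum]


# Toy reflectance values for a single spectrum (made up for illustration).
toy_spectrum = [0.10, 0.20, 0.30, 0.40, 0.50]
corrected = snv(toy_spectrum)
```

SNV is applied to each spectrum independently, so it removes per-sample offset and scale differences without needing statistics from the rest of the data set.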

Citations: 0
Quantitative Detection of Trace Characteristic VOCs of Liver Metabolic Diseases Based on the MEMS Sensor Array
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-03 DOI: 10.1002/cem.70089
Cheng Zhang, Yao Tian, Ze Zhang, Lingmin Yu, Hairong Wang

Human exhaled breath contains a variety of volatile organic compounds (VOCs); among them, isoprene, ethanol, and formaldehyde can be used as biomarkers for liver metabolic diseases. To accurately detect these trace-concentration VOC gases, a sensor array was built from four MEMS gas sensors, one of which is a self-developed sensor with a very high response to isoprene. To improve the prediction accuracy of gas concentration, we investigated a convolutional neural network with a Multi-Expert Temporal Fusion Network (METF-Net) model based on multitask learning. With the MEMS sensor array, isoprene, ethanol, and formaldehyde at the sub-ppm level can be correctly identified; their RMSEs are 33.48, 64.01, and 18.84 ppb, and the error rates of the predicted concentrations are 6.70%, 6.40%, and 9.42%, respectively. This method has potential for application in the early-stage screening of liver metabolic diseases.
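
The two kinds of figure quoted above (an RMSE in ppb and a percentage error) can be computed as in the sketch below. The abstract does not define its "error rate", so the mean absolute relative error used here is an assumption, as are the function names and the toy concentrations.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean square error, in the same unit as the inputs (here ppb)."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared) / len(squared))


def mean_relative_error_percent(reference, predicted):
    """Mean absolute relative error in percent (an assumed definition;
    the abstract does not say how its error rate is computed)."""
    ratios = [abs(r - p) / r for r, p in zip(reference, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up reference and predicted concentrations in ppb.
reference_ppb = [200.0, 400.0, 500.0]
predicted_ppb = [210.0, 380.0, 500.0]
print(rmse(reference_ppb, predicted_ppb))
print(mean_relative_error_percent(reference_ppb, predicted_ppb))
```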

Citations: 0
Construction and Improvement of a Model for Quantifying Blood Glucose Concentration Using Mid-Infrared Spectroscopy
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-02 DOI: 10.1002/cem.70091
Yuta Takami, Keita Miyagawa, Yuki Tsuda, Koichi Akiyama, Yuji Matsuura, Hiromasa Kaneko

Blood glucose concentration is measured using invasive methods such as venous blood sampling and finger-prick blood testing using self-monitoring blood glucose meters with subcutaneous sensors. For daily use, noninvasive blood glucose measurement methods need to be developed. In this study, we constructed a model to estimate blood glucose concentrations noninvasively from mid-infrared absorption spectra measured using photothermal deflectometry enhanced by total internal reflection. We improved the estimation accuracy of the model using Savitzky–Golay preprocessing and the Boruta variable selection method. In addition, the model was corrected using subject data from the first day of measurements to improve estimation accuracy.
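
As a rough illustration of Savitzky–Golay preprocessing, the sketch below applies the classic 5-point quadratic smoothing filter with coefficients (−3, 12, 17, 12, −3)/35. The window size, polynomial order, and edge handling are assumptions chosen for brevity; the abstract does not state the settings the authors used.

```python
# 5-point quadratic/cubic Savitzky-Golay smoothing coefficients (sum to 35).
SG_5_QUADRATIC = (-3, 12, 17, 12, -3)


def savgol_5pt(signal):
    """Smooth interior points with a 5-point quadratic Savitzky-Golay
    filter. The two points at each edge are left unchanged (a
    simplification made for this sketch)."""
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_5_QUADRATIC, window)) / 35
    return smoothed


# A quadratic signal passes through unchanged; an isolated spike is damped.
quadratic = [float(x * x) for x in range(7)]
spike = [0.0, 0.0, 1.0, 0.0, 0.0]
print(savgol_5pt(quadratic))
print(savgol_5pt(spike))
```

Because the filter fits a local polynomial, it suppresses point-to-point noise while preserving the shape of smooth absorption bands better than a plain moving average.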

Citations: 0
Partial Least Squares
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-12-01 DOI: 10.1002/cem.70069
Richard G. Brereton

In the previous article, we discussed the enormous increase in the impact of chemometrics methods over the last four decades [1] and the important role PLS (partial least squares or projection to latent structures) has had in this revolution. However, we are yet to describe this technique, which will be the subject of this and subsequent articles.

There have been many thousands, or perhaps tens of thousands, of theoretical, methodological and tutorial articles about PLS over the last 50 years, and possibly many hundreds of thousands of articles involving the use of this approach. In the early decades of the development of chemometrics as a coherent discipline in the 1980s and 1990s, there was a significant focus on PLS, and even after so many decades it still spawns new insights. There are conferences dedicated to PLS. This article is therefore only one of very many such articles, but PLS can be approached in endless ways, and no general introduction to chemometrics is complete without describing this method.

PLS was first proposed in the 1960s by Herman Wold [2, 3]. The method was slowly introduced to chemometrics with a significant expansion in interest in the 1980s. Svante Wold first publicised its applicability in the 1970s and 1980s [4, 5]. Early pioneers of the 1980s include Paul Geladi [6], Harald Martens and Tormod Naes [7], who wrote classical articles/books that to this day are still viewed as essential reading. During the 1980s, there were numerous conferences, software developments and courses on PLS. This development was important not only in chemistry but also in economics and social sciences.

The original PLS algorithm, called PLS1, was enhanced during this period, most notably by PLS2 but also by many other developments, which continue to this day. New theoretical articles on the properties of PLS continue to appear, and these properties remain a topical area for research.

As originally described, PLS was used for quantitative regression or calibration, sometimes distinguished by the terminology PLSR (PLS regression), where the c block was a continuous variable, such as a concentration, reaction rate or activity. Most of the early applications in chemistry were, for example, in NIR spectroscopy, where the aim was to calibrate the spectra to the concentration of an analyte or a class of compounds on a continuous scale.

However, over the past few years, PLSDA (PLS discriminant analysis) [12] has become an important technique used for multivariate classification. In this case, the c block is discrete, representing a numerical label or a classifier. Typically, if there are two groups, c = +1 for group A, and c = −1 for group B. For multiple groups, there are several modifications [13] available.

In subsequent articles, we will look at the properties of the matrices obtained using the PLS1 algorithm and how they fundamentally differ from those of PCA, even though some of them have the same names.

The author declares no conflicts of interest. Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.
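
To make the PLS1 idea concrete, here is a minimal sketch of one common NIPALS-style formulation for a single c variable. The excerpt above does not list the algorithm's steps, so the step order, the function name `pls1`, the requirement that X and c are already mean-centred, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pls1(X, c, n_components):
    """NIPALS-style PLS1 on column-centred X (rows = samples) and
    centred c. Returns the per-component vectors and the fitted c."""
    X = [row[:] for row in X]  # work on copies: both blocks are deflated
    c = c[:]
    n, m = len(X), len(X[0])
    fitted = [0.0] * n
    components = []
    for _ in range(n_components):
        # Weights: direction in X-space with maximal covariance with c.
        w = [sum(X[i][j] * c[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores: projection of each sample onto the weight vector.
        t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings for the X block (p) and the c block (q).
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(c[i] * t[i] for i in range(n)) / tt
        # Deflate both blocks and accumulate the fitted values.
        for i in range(n):
            for j in range(m):
                X[i][j] -= t[i] * p[j]
            c[i] -= t[i] * q
            fitted[i] += t[i] * q
        components.append({"w": w, "t": t, "p": p, "q": q})
    return components, fitted


# Toy centred data in which c = x1 + 2 * x2 exactly.
X_toy = [[1.0, 2.0], [-1.0, 0.0], [2.0, -1.0], [-2.0, -1.0]]
c_toy = [5.0, -1.0, 0.0, -4.0]
_, fit_one = pls1(X_toy, c_toy, 1)
_, fit_two = pls1(X_toy, c_toy, 2)
```

With one component the toy fit leaves a residual; with two components (the full rank of `X_toy`) the fit reproduces `c_toy`. For PLSDA, the same routine could in principle be run with c coded as +1 and −1, as described above.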
Citations: 0
Machine Learning–Driven Near- and Mid-Infrared Chemometrics for Rapid, Cost-Effective COVID-19 Screening in Dried Plasma
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-24 DOI: 10.1002/cem.70087
Fernanda F. S. Oliveira, Wilson J. Cardoso, Luiz F. P. Ramos, Túlio R. Freitas, Priscilla S. Filgueiras, Rafaella F. Q. Grenfell, Reinaldo F. Teófilo, Adriano de Paula Sabino

The COVID-19 pandemic has highlighted the urgent need for rapid, accurate, and cost-effective diagnostic alternatives to conventional methods such as RT-qPCR and immunoassays. In this study, we explored the potential of vibrational spectroscopy, specifically near-infrared (NIR) and mid-infrared (MID) spectroscopy, for detecting SARS-CoV-2 infection in dried plasma samples. Spectral data were obtained from 83 patients (45 COVID-19 positive and 38 negative) and analyzed using partial least squares discriminant analysis (PLS-DA), with and without variable selection by the ordered predictors selection for discriminant analysis (OPSDA) method. While initial models using full spectra showed moderate classification accuracy, the application of OPSDA significantly enhanced model performance. For the NIR dataset, OPSDA-based models achieved 100% sensitivity and specificity in both training and test sets (n = 25). For the MID dataset, the test set (n = 25) sensitivity reached 86%, with 100% specificity. These results demonstrate that NIR and MID spectroscopy, when combined with advanced chemometric approaches, can provide reliable, rapid, and low-cost screening for COVID-19. This platform holds promise for broader applications in clinical diagnostics beyond the current pandemic.
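
Sensitivity and specificity, the two figures of merit reported above, follow directly from the confusion counts. The sketch below uses made-up labels (1 = positive, 0 = negative) and a hypothetical helper name; it does not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary class labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up test labels: four positives (one missed) and three negatives.
y_true = [1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Sensitivity is the fraction of true positives that the model detects, while specificity is the fraction of true negatives it correctly rejects, so a model can reach 100% on one while falling short on the other.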

Citations: 0
Developing Digital Twin Visualizations: A Methodology and Case Study on Chemical Separation Processing
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-18 DOI: 10.1002/cem.70085
Adam Pluth, Kolton Heaps, Samantha Thueson, Jack C. Dunker, Ashley Shields

As advances in digital engineering continue to push the technological boundaries, digital twin (DT) visualizations for diagnostics and safeguards advancement become much more feasible and practical. DTs generate large and complex data streams that require effective user interfaces to provide monitoring and diagnostic capabilities. Unfortunately, while these frameworks exist, there is not much research on the systematic documentation of human–computer interaction (HCI) for DT visualization. This work presents a dual-mode visualization methodology (two dimensional [2D] graphical user interface dashboard and 3D mixed reality) designed to support diagnostic tasks in DT systems and building on a validated framework and applying established HCI principles. The methodology is demonstrated through a case study of aqueous processing at Idaho National Laboratory, using experimental data from the chemical solvent extraction runs. Our interfaces display real-time alerts and monitoring to inform users of safeguards anomalies. The interfaces use immersive 3D mixed-reality visualization for further system and experiment investigation. This work demonstrates how the systematic application of HCI principles can inform DT visualization design for diagnostic and safeguards applications. While formal user evaluation studies remain as future work, this paper documents the systematic design methodology and demonstrates a proof-of-concept implementation.

Citations: 0
Enhancing Metabolomics Analysis: Performance Evaluation of OPLS-DA and OPLS-EP Models
IF 2.1 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2025-11-17 DOI: 10.1002/cem.70086
Oleksandr Ilchenko, Antti Henrik

In the analysis of metabolomics data, selecting the appropriate statistical approach is crucial for maximizing model interpretation, predictivity and reliability. This study evaluates the effectiveness of Orthogonal Partial Least Squares (OPLS) models, specifically comparing OPLS-DA (assuming sample independence) and OPLS-EP (assuming sample dependency) in datasets of bacterial samples under different experimental conditions. OPLS-EP consistently demonstrates superior predictive performance, evidenced by higher predictive ability by means of cross-validation (Q2) compared to OPLS-DA, indicating greater model significance. Our findings prove the advantages of the paired statistical approach. This approach ensures that treatment effects are accurately measured by minimizing inter-sample variation and enhancing signal detection. Previous research in metabolomics has demonstrated the benefits of this method for biomarker sensitivity, particularly in matched case–control studies. The present study extends this understanding by applying paired statistical approaches to bacterial isolate treatments, offering novel insights into their utility. Overall, the findings emphasize the importance of OPLS-EP in enhancing biomarker sensitivity and model reliability in metabolomics research.
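
The paired design behind OPLS-EP can be illustrated by the data preparation step alone. In the sketch below, each treated sample is matched row by row to its own control and only the differences are kept. The function name `effect_matrix` and the toy intensities are assumptions. Modelling the differences against a target vector of ones is how the effect-projection approach is commonly described, but the abstract does not spell this out, so treat it as an assumption too.

```python
def effect_matrix(control, treated):
    """Row-wise paired differences (treated minus control).
    Row i of both inputs must come from the same subject or isolate."""
    if len(control) != len(treated):
        raise ValueError("paired design needs equally many rows")
    return [
        [t - c for c, t in zip(c_row, t_row)]
        for c_row, t_row in zip(control, treated)
    ]


# Made-up metabolite intensities for two isolates and two metabolites.
control = [[1.0, 5.0], [2.0, 4.0]]
treated = [[1.5, 4.0], [2.5, 3.0]]
effects = effect_matrix(control, treated)

# Assumed target for the subsequent OPLS model: one "1" per pair.
target = [1.0] * len(effects)
print(effects)
```

In this toy case the two isolates start from different baselines, but both rows of `effects` are identical, which shows how differencing removes between-sample variation and leaves only the treatment effect.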

Citations: 0
PCA-Based Peak Feature Selection for Classification of Spectroscopic Datasets
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date : 2025-11-11 DOI: 10.1002/cem.70074
Ingo Schmitt, Kay Sowoidnich, Tapashi Gosswami, Bernd Sumpf, Martin Maiwald, Matthias Wolff

Reducing feature dimensionality in spectroscopic data is crucial for efficient analysis and classification. Using all available features for classification typically results in an unacceptably high runtime and poor accuracy. Popular feature extraction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), and autoencoders, reduce feature dimensionality by extracting latent features that can be challenging to interpret. To enable better human interpretation of the classification model, we avoid extraction methods and instead propose applying feature selection methods. In this work, we develop an innovative PCA-based feature selection method for spectroscopic data, providing an essential subset of the original features. As an important advantage, no prior knowledge about the characteristic signals of the respective target substance is required. In this proof-of-concept study, the proposed method is initially characterized using simulated Raman and infrared absorption datasets. From the top five PCA eigenvectors of spectroscopic data, we identify a set of three top peaks each at specific wavenumbers (features). The compact set of selected features is then used for classification tasks applying a decision tree. Based on two well-defined spectroscopic datasets, our study demonstrates that our new method of PCA-based peak finding outperforms selected other approaches with regard to interpretability and accuracy. For both investigated datasets, accuracies greater than 97% are achieved. Our approach shows large potential for accurate classification combined with interpretability in further scenarios involving spectroscopic datasets.
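
The selection idea described above (a few peaks taken from each of the leading PCA eigenvectors) can be sketched roughly as follows. This is not the authors' algorithm: the peak-picking rule (largest local maxima of the absolute loading), the simulated two-class spectra and all function names are assumptions made only for illustration.

```python
import numpy as np


def gaussian(x, center, width):
    """Gaussian band used to simulate a spectral peak."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def top_peaks(loading, k=3):
    """Indices of the k largest local maxima of |loading| (assumed rule)."""
    a = np.abs(loading)
    is_peak = (a[1:-1] > a[:-2]) & (a[1:-1] >= a[2:])
    idx = np.where(is_peak)[0] + 1
    return idx[np.argsort(a[idx])[::-1][:k]]


def select_features(X, n_components=5, peaks_per_component=3):
    """Union of peak positions found in the leading PCA eigenvectors."""
    Xc = X - X.mean(axis=0)                       # mean-centre each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    selected = set()
    for loading in vt[:n_components]:             # rows of vt are eigenvectors
        selected.update(top_peaks(loading, peaks_per_component).tolist())
    return sorted(selected)


# Hypothetical data: two classes, each with one characteristic band.
rng = np.random.default_rng(1)
wavenumbers = np.linspace(400, 1800, 351)         # assumed axis in cm^-1
labels = np.repeat([0, 1], 20)
X = np.empty((labels.size, wavenumbers.size))
for i, y in enumerate(labels):
    center = 800.0 if y == 0 else 1400.0
    X[i] = (rng.uniform(0.8, 1.2) * gaussian(wavenumbers, center, 15.0)
            + rng.normal(0.0, 0.01, wavenumbers.size))

features = select_features(X)
print(wavenumbers[features])
```

The reduced matrix `X[:, features]` keeps original wavenumber columns rather than latent scores, so a downstream classifier such as a decision tree would split directly on interpretable spectral positions.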

Citations: 0
Is Chemometrics the Most Important Advance in Analytical Chemistry Since the 1980s and Partial Least Squares the Jewel in the Crown?
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date : 2025-11-09 DOI: 10.1002/cem.70070
Richard G. Brereton

The first article in this column was published in 2014 [1], over 10 years ago. In this intermission, we look at the importance of chemometrics to analytical chemists and the central role of PLS (Partial Least Squares), which will be the subject of the next articles.

The name chemometrics (in Swedish) was first advocated in the literature in 1972 [2] but was far from the modern day emphasis on multivariate data and primarily involved univariate curve fitting. Although a few enthusiasts communicated together during the 1970s, it was not until the early 1980s that the subject we now recognise became organised.

Most of the major building blocks were developed in the 1980s, and we will look at how the subject fitted into analytical chemistry as from 1980. A good way to study the impact is via citations. We will use Web of Science (Clarivate) with a census date of 24 July 2025 to chart the progress of major publications in this area. We look at some of the more prominent relevant journals.

Analytica Chimica Acta is one of the first general analytical journals to publish a significant number of papers in chemometrics. The classic paper by Geladi and Kowalski on PLS (Partial Least Squares) [3] in 1986 is the most cited article ever in this journal since 1980 (5746 cites). The second most cited (2483 cites) is on a subject related to chemometrics (Box–Behnken designs), suggesting the dominant impact of chemometrics in this forum. What is remarkable is the longevity of this article. Most articles decline in importance after a few years. In Figure 1, we present the citations to this article as compared to all papers together in ACA since 1986. So not only is this, by a long way, the most highly cited article ever published in this journal, its importance is increasing with time over nearly a 40-year period, compared to general articles published in 1986. This is even clearer when we look at the percentage of citations to all papers in ACA published in 1986 over time, in Figure 2. By 2024, over 80% of all citations are to this one paper. This strongly suggests that PLS continues to remain a long-term advance in analytical chemistry over four decades.

In Elsevier's Chemometrics and Intelligent Laboratory Systems, which first published in 1986, the two most cited papers by a wide margin are about PCA (9016 cites) [4] and, again, PLS (7652 cites) [5]. However, as the former paper was published in 1987 and the latter in 2001, in fact, the interest in PLS has been greater than PCA.

The journal Analytical Chemistry, although it published some early reviews in chemometrics, does not prominently feature chemometrics articles. In the decade 1980–1989, the third most cited article was focused on chemometrics and involved PLS [6] suggesting the importance of PLS to chemometrics in its early days (2434 cites). The most cited article (7820 cites) since 1980 was published in 1996 and concerns mass spectrometry. There have been many spectacular advances in instrumental analytical chemistry, such as in mass spectrometry, miniaturisation, proteomics and imaging, and this article represents the highest-impact article in the core analytical literature. Yet its citation impact is less than that of [4] and comparable to that of [5], which was published 5 years later.

More striking still is the greater longevity of the PLS articles compared with the most cited article in Analytical Chemistry, as shown in Figure 3. (The temporary increase in 2021 is due to a change in citation-counting methods common to most journals.) Both PLS papers can be seen to increase their impact over time, whereas paper [7], like most articles, has an effective lifetime after which its impact declines.

In the more recent RSC journal Analytical Methods, whose first articles appeared in 2009, the most cited paper is on PCA (2485 cites) and the second is on PARAFAC; both are chemometrics articles.

In Talanta, the most cited paper from 1980 to date (4632 cites) is on response surface methodology, which many regard as part of chemometrics.

In this journal, the most cited article is on PLS discriminant analysis [10] (2240 cites), and the second most cited paper is also on a variant of PLS. This indicates the predominant interest in PLS and its variants in the chemometrics literature.

Although this article is based mainly on an analysis of citations in Web of Science, it can be concluded that chemometrics has led to a major advance in analytical chemistry, possibly the most widespread of the past four decades. Modern instruments generate large quantities of multivariate data from chromatography and spectroscopy, and are increasingly central to studying a wide range of problems in metabolomics, proteomics, genetic science, environmental monitoring, food chemistry and more.
Citations: 0
Correction to “Transforming Hyperspectral Images Into Chemical Maps: A Novel End-to-End Deep Learning Approach”
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date : 2025-10-28 DOI: 10.1002/cem.70084

O.-C. G. Engstrøm, M. Albano-Gaglio, E. S. Dreier, et al., “Transforming Hyperspectral Images Into Chemical Maps: A Novel End-to-End Deep Learning Approach,” Journal of Chemometrics 39, no. 8 (2025): e70041, https://doi.org/10.1002/cem.70041.

In the original work, there was an apparent discrepancy between two ways of making predictions with the same PLS model. One way was predicting directly on mean spectra. The other was predicting on pixel-wise spectra and subsequently averaging the result. The discrepancy between these two ways of evaluation was solely due to the order of application of preprocessing.

In the original work, the mean spectrum was computed by taking the negative logarithm of each pixel (to transform from reflectance to pseudo absorbance), averaging all pixels, applying SNV (Standard Normal Variate) transform, and finally convolution with an SG (Savitzky–Golay) filter. However, when applying pixel-wise PLS, SNV and SG were applied to each pixel individually. As SNV is nonlinear (the standard deviation is not linear), it matters whether it is applied before or after averaging. So, this update now applies preprocessing to each pixel individually before taking the average to compute a mean spectrum. This removes any discrepancy between the results obtained by pixel-wise PLS and PLS on mean spectra. Any reference to this discrepancy has been removed from the work and all results and figures have been updated accordingly. Appendix D describes the updates in detail.
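
The order-of-operations point can be checked numerically. The following is a minimal sketch on hypothetical random pixel spectra, not the code of the corrected work: the SNV definition (sample standard deviation per spectrum) and the 5-point quadratic Savitzky–Golay smoothing kernel are assumptions chosen for illustration. It shows that a linear filter commutes with pixel averaging, whereas SNV does not.

```python
import numpy as np


def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row) by its own mean and std."""
    spectra = np.atleast_2d(spectra)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def smooth(spectra, kernel):
    """Apply a fixed linear filter along the band axis of each spectrum."""
    spectra = np.atleast_2d(spectra)
    return np.array([np.convolve(row, kernel, mode="valid") for row in spectra])


rng = np.random.default_rng(42)
# Hypothetical pseudo-absorbance image: 50 pixels x 30 bands, with per-pixel scale differences.
pixels = rng.uniform(0.5, 2.0, size=(50, 1)) * rng.uniform(0.1, 1.0, size=(50, 30))

# 5-point quadratic Savitzky-Golay smoothing coefficients (assumed filter settings).
sg_kernel = np.array([-3.0, 12.0, 17.0, 12.0, -3.0]) / 35.0

# Linear step: filtering then averaging equals averaging then filtering.
filter_then_mean = smooth(pixels, sg_kernel).mean(axis=0)
mean_then_filter = smooth(pixels.mean(axis=0), sg_kernel)[0]

# Nonlinear step: SNV then averaging differs from averaging then SNV.
snv_then_mean = snv(pixels).mean(axis=0)
mean_then_snv = snv(pixels.mean(axis=0))[0]

print("SG commutes with averaging:", np.allclose(filter_then_mean, mean_then_filter))
print("max |SNV difference|:", np.max(np.abs(snv_then_mean - mean_then_snv)))
```

Because a PLS prediction is itself linear in the preprocessed spectrum, applying SNV per pixel in both pipelines is what makes averaging pixel-wise predictions agree with predicting on the mean spectrum.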

We apologize for this error.

Citations: 0