
Journal of Chemometrics: Latest Articles

Enhancing Similarity Measures for Binary Data in Clustering: The Role of Rare Events and Matching Absences
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-09-04 DOI: 10.1002/cem.70061
Tânia F. G. G. Cova, Alberto A. C. C. Pais

Clustering of binary data is central to various applications, particularly in the fields of medical diagnostics, chemistry, and chemoinformatics. However, standard similarity measures often fail to capture the informative value of rare features and matching absences, treating all attributes as equally relevant. This can lead to suboptimal clustering, especially when informative patterns are hidden in low-frequency features. This study proposes a probability-weighted approach to measuring similarity, which gives more weight to rare features and accounts for the value of shared absences based on their occurrence probabilities. We analyze how this adjustment impacts clustering results, using visual comparisons and experiments on real datasets. The results show consistent gains in clustering precision and stability compared to standard measures. Our findings suggest that incorporating the rarity of features into similarity computation can offer a more reliable basis for clustering binary data, especially in domains where rare signals carry meaningful information.
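The idea of weighting matches by how improbable they are can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' formula: the abstract does not give the weighting, so the choice of `-log(p)` for a shared presence, `-log(1 - p)` for a shared absence, and the larger of the two for a mismatch is hypothetical, as are the function names.

```python
import math


def feature_probabilities(rows):
    """Estimate p_j, the fraction of objects in which feature j is present."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def weighted_similarity(a, b, probs, eps=1e-9):
    """Information-weighted matching similarity in [0, 1].

    Assumed weighting (illustrative, not taken from the paper):
      shared presence of feature j -> weight -log(p_j)
      shared absence of feature j  -> weight -log(1 - p_j)
    so rare presences and unexpected absences count more.
    """
    matched = 0.0
    total = 0.0
    for x, y, p in zip(a, b, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep both logarithms finite
        w_present = -math.log(p)
        w_absent = -math.log(1.0 - p)
        if x == 1 and y == 1:
            matched += w_present
            total += w_present
        elif x == 0 and y == 0:
            matched += w_absent
            total += w_absent
        else:
            # Mismatch: only the denominator grows (assumed penalty).
            total += max(w_present, w_absent)
    return matched / total if total > 0 else 0.0


if __name__ == "__main__":
    # Feature 0 is common, feature 1 is rare, feature 2 is balanced.
    probs = [0.8, 0.2, 0.5]
    common_pair = weighted_similarity([1, 0, 1], [1, 0, 0], probs)
    rare_pair = weighted_similarity([0, 1, 1], [0, 1, 0], probs)
    print(common_pair, rare_pair)
```

Both pairs in the demo agree on two of three features, so simple matching scores them identically; under the assumed weighting, the pair that shares the rare presence and the unexpected absence scores higher.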

Citations: 0
Improving Grading Accuracy by Optimizing the Logistic Loss Function in PLS Modelling
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-09-02 DOI: 10.1002/cem.70064
Zhonghai He, Huilong Sheng, Yi Zhang, Xiaofang Zhang

The prediction results from a Partial Least Squares (PLS) model are commonly used to assess whether a product meets quality standards, or whether production process parameters need adjusting. Misgrading occurs mostly for marginal samples (samples near the threshold). We propose the Logistic-Enhanced PLS (LE-PLS) model, which defines a logistic loss function and minimizes it via gradient descent to optimize the PLS projection vector. The LE-PLS predictions for marginal samples tend to move away from the threshold value. This optimization enables LE-PLS to enhance grading capability while largely maintaining the regression accuracy of PLS. LE-PLS was evaluated on two real-world datasets (bean pulp and corn gluten meal) and one simulated dataset, correcting 10 out of 19 misgraded samples, 6 out of 7, and 6 out of 12, respectively. Statistical analysis using paired t-tests confirmed that these improvements were significant. Although RMSEP increased slightly, the change remained within an acceptable range considering the substantial enhancement in grading performance. The algorithm has a computational complexity of $O(\text{iteration} \times \text{samples} \times \text{variables})$ during model training. However, its prediction-phase complexity is only $O(\text{samples} \times \text{variables})$. Given these advantages, LE-PLS is well-suited for practical applications in NIR-based quality grading of products.

Citations: 0
Paul Geladi Legacy: Pioneering Chemometrics for the Future
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-29 DOI: 10.1002/cem.70065
Beatriz Galindo-Prieto
This special issue, entitled ‘Paul Geladi Legacy: Pioneering Chemometrics for the Future’, is a tribute to the remarkable scientific contributions of Professor Paul Geladi to the field of chemometrics. This very special issue brings together a comprehensive collection of topics that reflect the breadth and depth of Paul's work in chemometrics. While fond memories and Paul's interests in science have been shared by some of his friends and colleagues in recent publications, this editorial and its related special issue will focus on some of the most relevant scientific areas that Professor Paul Geladi explored throughout his prolific career. The title of this special issue honouring Paul is not trivial. For many years, Paul emphasized the future of chemometrics as an important and in-depth topic that should be part of scientific meetings, conferences and specialized literature. Indeed, as Paul remarked on several occasions, pioneering chemometrics for the future, not only by adapting its methodologies and advances to new challenges and technologies but also by creating new chemometric research directions according to evolving trends in science, is crucial for the field of chemometrics to succeed. To achieve this, high-quality teaching and the education of the next generations in chemometrics are especially important, as is fostering collaboration across research groups. An exemplar of the latter was the initiative led by Paul called ‘The Laboratory Profile’ (published in Journal of Chemometrics in the 90s), which strengthened the global network of chemometric laboratories and showcased the wide array of scientific activities taking place across universities, research institutions and industry. The breadth of Paul's knowledge, enriched by a wide network of scientists, enabled him to successfully apply the most suitable chemometric techniques across various applications.

Professor Paul Geladi was a dedicated educator. In 1986, when audiovisual resources were still rarely used in statistical lectures, Paul was ahead of his time in publishing an article on the use of videotapes as pedagogic tools in chemometrics education. In addition, Paul wrote several tutorials on chemometric methods, two of which stand out as his most cited work. The first is his tutorial on principal component analysis (co-authored with Wold and Esbensen), which covers the most relevant aspects of PCA and its application, whilst the second tutorial focuses on partial least squares regression (co-authored with Kowalski) and covers the concept and algebra of the PLS algorithm. These tutorials, published in international journals, remain foundational references in the field. Paul also authored three books of high relevance in the field of chemometrics. His book Multi-Way Analysis with Applications in the Chemical Sciences (co-authored with Smilde and Bro) provides chemometricians with the mathematical foundations needed to understand multi-way approaches and practically apply them. His other two books, Multivariate Image Analysis and Techniques and Applications of Hyperspectral Image Analysis (both co-authored with Grahn), are very popular among chemometricians interested in image analysis. Analysing images was one of the main topics of Paul's career, as shown by the large number of scientific articles he wrote on multivariate and hyperspectral imaging. Among his many contributions to method development, one of the most relevant articles Paul wrote in the 80s discusses linearization and scatter correction for near-infrared reflectance spectra of meat (co-authored with MacDougall and Martens). Paul produced many excellent publications and teaching materials on chemometric topics such as data preprocessing, spectroscopy (especially near-infrared), image analysis, multivariate calibration, multi-way analysis, variable selection, principal component analysis (PCA), partial least squares (PLS), multi-omics data analysis, machine learning and the development of chemometric algorithms. One of Paul's main strengths was his pioneering ability to use and adapt chemometric methods to new challenges across a wide range of disciplines.

To represent the breadth of Paul's scientific legacy as fully as possible, this special issue contains selected articles related to chemistry, spectroscopy, hyperspectral imaging, agriculture and food science, sampling theory, environmental health, artificial intelligence, molecular biology, omics and the development and optimization of chemometric methods. I would like to thank all of you whose contributions and support made this special issue of Journal of Chemometrics honouring Paul possible. I am sure Paul would be delighted to see the contributions of so many good friends and colleagues, whether by writing articles, reviewing or helping in other ways, to honour his life's work.
Citations: 0
Online Simultaneous Determination of Astragalus Polysaccharides and Calycosin-7-O-β-D-Glucoside in Astragali Radix Percolate Based on Near-Infrared Spectroscopy Technology
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-22 DOI: 10.1002/cem.70062
Li Zha, Kaiqi Zhang, Die Xie, Yongming Luo, Xin Che, Lihong Wang

Percolation is a crucial extraction process in traditional Chinese medicine, yet its quality control still lacks real-time monitoring methods. To address this challenge, this study focused on the Astragalus percolation process and established an NIRS-based method for synchronous online monitoring of two bioactive markers in Astragalus percolates: Astragalus polysaccharides (APSs) and calycosin-7-O-β-D-glucoside (CG), achieving rapid and nondestructive analysis. Near-infrared (NIR) spectra were collected online at different time points during percolation to determine APS and CG concentrations, with high-performance liquid chromatography (HPLC) and ultraviolet–visible spectrophotometry (UV–Vis) used as reference methods. Two modeling approaches, partial least squares regression (PLSR) and support vector regression (SVR), were employed to establish quantitative analytical models for these bioactive components, with model performance optimized through spectral preprocessing and feature variable selection. Results demonstrated that SVR-based models achieved superior predictive accuracy compared with PLSR. The optimal APS model showed calibration and validation set R² values of 0.9995 and 0.9874, respectively, while the CG model yielded 0.9811 (calibration) and 0.9632 (validation). Both components exhibited residual prediction deviation (RPD) values exceeding the threshold (RPD > 3), with 6.5349 for APS and 3.8357 for CG, confirming excellent predictive capability. Paired t-test analysis of external test sets (p > 0.05) revealed no statistically significant difference between measured and predicted values, further validating the model's robustness for unknown sample prediction. The concentrations of APS and CG in the Astragalus percolation solution can be simultaneously determined by this method within 30 s, significantly improving analytical efficiency compared with the conventional method (60–80 min per sample), while offering simple operation, no solvent consumption, low cost, and no pollution. This study demonstrates that the combination of NIRS and chemometrics enables real-time monitoring of multiple key substance concentrations during the percolation process. As a green analytical technology, NIRS shows significant potential for improving production efficiency and ensuring product quality consistency.
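The figures of merit quoted above (R² and RPD) can be computed from paired reference and predicted values as sketched below. The definitions used here are common conventions and are assumptions about what the authors computed: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by the root mean square error of prediction. The numbers in the demo are made up for illustration.

```python
import math
import statistics


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    sq = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(sq / len(measured))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = statistics.fmean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rpd(measured, predicted):
    """Residual prediction deviation: SD of reference values over RMSEP."""
    return statistics.stdev(measured) / rmsep(measured, predicted)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
    print(r_squared(reference, predicted), rpd(reference, predicted))
```

Some authors use a bias-corrected standard error of prediction in the denominator instead of RMSEP, so RPD values are only comparable when the convention is stated.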

Citations: 0
Deciphering the Distinctive Features of Alpha-D-mannopyranoside Structure From Similar Structures Against FimH Through ANN and PCA: Insights and Perspectives
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-21 DOI: 10.1002/cem.70063
M. Dhanalakshmi, K. R. Jinuraj, Muhammed Iqbal, D. Sruthi, Kajari Das, Sushma Dave, N. Muthulakshmi Andal

This computational study aimed to demonstrate distinct characteristics of alpha-D-mannopyranoside structure, leveraging D-mannose and its analogs due to their known roles in host–pathogen interactions and potential to be used as nutraceuticals. Targeting bacterial adhesion is a critical strategy to combat urinary tract infections (UTIs), especially given rising antibiotic resistance. The FimH lectin on Escherichia coli is a key mediator of this adhesion, making it a compelling target for novel anti-adhesive therapies. We employed a multi-stage virtual screening pipeline to efficiently explore a vast chemical space around the ligands and their binding interactions. Ligand-based virtual screening, utilizing self-organizing maps (SOMs), clustered 5256 D-mannose-similar structures, identifying a promising subset of 141 molecules with 39 known bioassay actives. This was followed by structure-based ligand docking to precisely evaluate their inhibitory impact on FimH lectin. To understand the structural features driving activity, principal component analysis (PCA) was then applied to analyze the molecular structures and their physicochemical descriptors. Our analysis revealed that 15 compounds exhibited the highest binding energy and docking scores. Crucially, the alpha-D-mannopyranoside conformation demonstrated the most effective inhibitory profile. This superior activity, despite structural similarities, was differentiated by two 3D-matrix descriptors: HRG and Wi G, highlighting their significance in predicting subtle yet impactful conformational preferences.
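The PCA step applied to a descriptor table can be illustrated with a small dependency-free sketch that finds the first principal component by power iteration. The descriptor values below are invented, the function names are hypothetical, and the data are only mean-centred (descriptor tables are usually also scaled to unit variance, which is omitted here for brevity).

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n = len(rows)
    d = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, n_iter=500):
    """Leading eigenvector and eigenvalue of `cov` by power iteration.

    The start vector (1, 2, 3, ...) is an arbitrary choice; the method
    fails if it happens to be orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return v, eigenvalue


def scores(rows, component):
    """Project centred rows onto a component."""
    return [sum(a * b for a, b in zip(row, component)) for row in rows]


if __name__ == "__main__":
    # Made-up descriptor table: 4 molecules x 3 descriptors.
    table = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
    centred = center(table)
    pc1, var1 = first_component(covariance(centred))
    print(pc1, var1, scores(centred, pc1))
```

The loadings in `pc1` indicate which descriptors drive the main direction of variation, which is the kind of reading that lets individual descriptors be singled out as discriminating.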

Citations: 0
Online Monitoring Scheme Using GLPP Through Kantorovich Distance Combined With a Sliding Window Technique for Nonlinear Dynamic Process Fault Detection
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-14 DOI: 10.1002/cem.70058
Cheng Zhang, Lu Ren, Jing Zhang, Yuan Li

Global–local preserving projections (GLPP) perform poorly at detecting minor faults in nonlinear dynamic processes. To address this, a novel fault detection method based on GLPP and the Kantorovich distance combined with a sliding window (GLPP-KD) is proposed. First, the GLPP algorithm is used to construct a weight matrix that retains the key information of the data, and the objective function containing local and global information is transformed into a generalized eigenvector problem to obtain a projection matrix. Next, the sliding window technique is integrated with the Kantorovich distance to quantify the discrepancies between probability distributions, thereby capturing the local dynamic characteristics of the data. Finally, faults are detected by identifying the minor distinctions between normal and faulty states. Experimental results show that, compared with traditional methods, GLPP-KD improves the fault detection accuracy and effectively reduces the false alarm rate. The proposed method supports the safe and stable operation of industrial processes and has high application value.
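The sliding-window comparison of distributions can be sketched for a single monitored score as follows. This is a simplified illustration, not the GLPP-KD procedure itself: it assumes a one-dimensional series (for instance one projected feature), equal-size windows, a reference window taken from normal operation, and a user-supplied control limit, none of which are specified in the abstract. For equal-size one-dimensional samples, the Kantorovich (Wasserstein-1) distance reduces to the mean absolute difference between sorted values.

```python
def kantorovich_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def sliding_distances(series, reference, width):
    """Distance between `reference` and every length-`width` window."""
    if len(reference) != width:
        raise ValueError("reference must have exactly `width` points")
    return [kantorovich_1d(series[i:i + width], reference)
            for i in range(len(series) - width + 1)]


def detect_faults(series, reference, width, limit):
    """Indices of the last sample of each window whose distance exceeds `limit`."""
    distances = sliding_distances(series, reference, width)
    return [i + width - 1 for i, d in enumerate(distances) if d > limit]


if __name__ == "__main__":
    # Illustrative values: a small drift starts at index 4.
    normal = [0.0, 0.1, -0.1, 0.05]
    monitored = [0.0, 0.1, -0.1, 0.05, 0.3, 0.35, 0.25, 0.4]
    print(sliding_distances(monitored, normal, width=4))
    print(detect_faults(monitored, normal, width=4, limit=0.2))
```

Because the distance compares whole windows rather than single points, a small persistent shift accumulates across the window and can cross the limit even when no individual sample looks abnormal.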

Citations: 0
Expanding the Chemometric Data Analysis Toolbox With Immersive Analytics
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-12 DOI: 10.1002/cem.70060
John H. Kalivas

Immersive analytics is a developing field growing as technology improves. This paper presents some important points, but by no means is the discussion complete. The cited papers and books should be read to fully grasp the potential of the general field of immersive analytics. The direction of this paper is to highlight those components useful for chemometric data analyses in virtual reality.

Citations: 0
Honoring Professor Tormod Næs—A Pillar of Chemometrics
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-07 DOI: 10.1002/cem.70059
Ingrid Måge
It is both a privilege and a personal honor to introduce this special issue of the Journal of Chemometrics, dedicated to celebrating the career of Professor Tormod Næs. As a mentor, colleague, and friend, Tormod has been a guiding light throughout my scientific journey from my earliest days as a PhD student under his supervision to our many years of working together at Nofima.

Tormod's contributions to the field of chemometrics are both foundational and far-reaching. His ability to bridge rigorous statistical theory with practical application is a defining feature of his work and a testament to his rare combination of intellectual depth and scientific intuition.

His early work in multivariate calibration, particularly in near-infrared (NIR) spectroscopy, laid the groundwork for numerous applications in food science, process modeling, and sensory analysis. His 1992 book Multivariate Calibration, co-authored with Prof. Harald Martens, remains a seminal reference. It is cited nearly 9000 times, and it continues to serve as an accessible introduction to chemometrics for both students and practitioners.

Equally pioneering was his work in sensometrics, where he developed methods to understand individual differences in sensory and consumer data, an area that has become increasingly important in this field. Tools like PanelCheck and ConsumerCheck, which he helped develop, have empowered practitioners and researchers to apply complex statistical methods with ease and confidence.

My main area of collaboration with Tormod has been in multiblock modelling. His theoretical innovations in this field include methods such as SO-PLS and ROSA, in the context of prediction, interpretation, and path modelling. The methods have been widely adopted and further developed by researchers around the world and have numerous applications in process modeling, spectroscopy, sensometrics, -omics and beyond. Tormod's work in this area has opened new avenues for data fusion and interpretation across a broad range of scientific domains.

Tormod's scholarly achievements include over 250 peer-reviewed articles, 7 books, and more than 28,000 citations. Beyond these impressive numbers, the most important part of his legacy is, in my view, the community he has nurtured. He has supervised 25 PhD students and mentored countless others, always prioritizing their development. Tormod is known for his remarkable ability to encourage young scientists and consistently push them forward. His constructive, thorough, and insightful feedback is always delivered with kindness. His mentorship has shaped not only the scientific work but also the confidence and careers of many young researchers.

Tormod's international collaborations have enriched the field globally. His affiliations with institutions such as the University of Oslo and the University of Copenhagen, along with long-standing partnerships across Europe, the United States and South Africa, reflect his broad influence and the high regard in which he is held worldwide.

As Tormod moves into retirement, his influence lives on through the tools we use, the methods we teach, and the questions we ask. This special issue brings together contributions from colleagues, collaborators, and former students who have been influenced by Tormod's work and character. It is a tribute not only to his scientific achievements but also to the spirit of science, generosity, and collaboration that he embodies.

On behalf of all who have had the privilege of working with Tormod, thank you for your tireless contributions, your mentorship, and your friendship. We celebrate not only your remarkable career but also the person behind it, a true pillar of chemometrics.
Citations: 0
On the Equivalence Between Null Space and Orthogonal Space in Latent Variable Regression Modeling
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-08-04 DOI: 10.1002/cem.70057
Sergio García-Carrión, Francesco Sartori, Joan Borràs-Ferrís, Pierantonio Facco, Massimiliano Barolo, Alberto Ferrer

The concepts of null space and orthogonal space have been developed in independent contexts and with different purposes: the former arises in the inversion of partial least-squares (PLS) regression models, and the latter in orthogonal PLS (O-PLS) modeling. In this study, we bridge PLS model inversion and O-PLS modeling by mathematically proving that the null space and the orthogonal space are the same space. We also provide a graphical interpretation of the equivalence between the two spaces, using both a simulated and a real case study.
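
To make the notion of a null space in model inversion concrete, the following toy sketch may help. It is not the paper's proof and says nothing about O-PLS. It assumes a single response predicted from two latent-variable scores as a weighted sum, with hypothetical y-loadings `q`; all function names are illustrative.

```python
from typing import List, Sequence


def predict(scores: Sequence[float], q: Sequence[float]) -> float:
    """Predicted response as the weighted sum of latent-variable scores."""
    return sum(t * w for t, w in zip(scores, q))


def invert(y_target: float, q: Sequence[float]) -> List[float]:
    """Minimum-norm scores whose prediction equals y_target."""
    norm_sq = sum(w * w for w in q)
    return [y_target * w / norm_sq for w in q]


def null_direction(q: Sequence[float]) -> List[float]:
    """For two latent variables, a score direction that leaves the prediction unchanged."""
    q1, q2 = q
    return [-q2, q1]


if __name__ == "__main__":
    q = [3.0, 4.0]                      # hypothetical y-loadings
    base = invert(50.0, q)              # one solution of the inversion
    n = null_direction(q)
    moved = [b + 2.0 * d for b, d in zip(base, n)]
    # Both score vectors give the same predicted response.
    print(predict(base, q), predict(moved, q))
```

Any multiple of the null direction can be added to the inverted scores without changing the prediction, which is why inversion has infinitely many solutions whenever there are more latent variables than responses.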

Citations: 0
Infrared Spectroscopy and Machine Learning for Classification of Red Stamp Inks on Questioned Documents
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2025-07-29 DOI: 10.1002/cem.70056
Yong Ju Lee, Chang Woo Jeong, Mi Jung Choi, Tai-Ju Lee, Hyoung Jin Kim

This study demonstrates that integrating infrared spectroscopy with machine learning enables highly accurate, nondestructive classification of red-stamp ink manufacturers. We evaluated five classifiers—partial least squares discriminant analysis (PLS-DA), k-nearest neighbor (k-NN), support vector machine (SVM), random forest (RF), and a feed-forward neural network (FNN)—across multiple spectral regions. The FNN trained on second-derivative spectra in the 1700–900 cm⁻¹ region achieved perfect test metrics (F1 = 1.000; AUC = 1.000), while PLS-DA and RF also performed robustly (F1 ≥ 0.933). Variable importance in projection (VIP) analysis identified the 1650–1100 cm⁻¹ subrange as most informative, streamlining feature selection and model training. Applied to three unknown samples, the optimized FNN produced high-confidence manufacturer predictions consistent with expected origins. These results confirm that targeted spectral selection combined with derivative preprocessing markedly enhances nondestructive ink classification for forensic applications.
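
The preprocessing idea in this abstract, a second-derivative spectrum restricted to a wavenumber region, can be sketched as follows. The abstract does not say how the derivative was computed. This sketch assumes a plain central finite difference on a uniformly spaced grid (smoothing-based schemes such as Savitzky-Golay are a common alternative), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def second_derivative(
    wavenumbers: Sequence[float], absorbance: Sequence[float]
) -> List[Tuple[float, float]]:
    """Central finite-difference second derivative on a uniform grid.

    Returns (wavenumber, derivative) pairs for the interior points only,
    because the end points lack a neighbour on one side.
    """
    if len(wavenumbers) != len(absorbance) or len(wavenumbers) < 3:
        raise ValueError("need at least three matched points")
    step = wavenumbers[1] - wavenumbers[0]
    return [
        (
            wavenumbers[i],
            (absorbance[i - 1] - 2 * absorbance[i] + absorbance[i + 1]) / step**2,
        )
        for i in range(1, len(absorbance) - 1)
    ]


def select_region(
    pairs: Sequence[Tuple[float, float]], low: float = 900.0, high: float = 1700.0
) -> List[Tuple[float, float]]:
    """Keep only points whose wavenumber lies in [low, high] (cm^-1)."""
    return [(w, v) for w, v in pairs if low <= w <= high]


if __name__ == "__main__":
    grid = [800.0 + 100.0 * k for k in range(11)]   # 800 ... 1800 cm^-1
    toy = [w * w for w in grid]                     # synthetic curve, not real data
    features = select_region(second_derivative(grid, toy))
    print(features)
```

The resulting values would then serve as input features for a classifier; the synthetic quadratic curve is used here only because its second derivative is known to be constant.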

Citations: 0