Journal of Chemometrics: Latest Articles

The Trouble With Hotelling's T²
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-17, DOI: 10.1002/cem.70108
Richard G. Brereton

The relationship between the Mahalanobis distance and the χ² and Hotelling T² distributions is described. The methods are illustrated by two case studies, a 40 × 2 simulation and a 54 × 34 dataset consisting of the NMR of metabolic extracts from maize harvested at 8.5°C. It is shown that if the number of samples is close to but still greater than the number of variables, p values calculated using T² as usually defined are higher than should be normally expected. This is interpreted as a problem caused by the difficulty of determining the number of degrees of freedom in multivariate matrices. It is recommended to use the χ² distribution in preference when calculating multivariate p values for probabilities that a sample belongs to a predefined distribution. When the sample size is large relative to the number of variables, there is little point in using T², and when small, T² can predict misleading p values due to problems determining the number of degrees of freedom.
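
The contrast between the two reference distributions can be sketched for the two-variable case, where both tail probabilities have closed forms. This is a minimal sketch, assuming one commonly quoted convention in which the squared Mahalanobis distance is scaled as F = T²(n − p)/(p(n − 1)) and referred to an F distribution with (p, n − p) degrees of freedom. The abstract does not state which definition the paper uses, and the function names are illustrative.

```python
import math

P_VARIABLES = 2  # the closed forms below hold only for two variables


def chi2_p_value(d2: float) -> float:
    """Upper-tail chi-squared probability of a squared Mahalanobis distance (2 df)."""
    return math.exp(-d2 / 2.0)


def t2_p_value(d2: float, n_samples: int) -> float:
    """Upper-tail probability under the assumed F scaling of T2 (p = 2)."""
    dof2 = n_samples - P_VARIABLES
    f_stat = d2 * dof2 / (P_VARIABLES * (n_samples - 1))
    # For an F distribution with (2, dof2) degrees of freedom the survival
    # function is (1 + 2 f / dof2) ** (-dof2 / 2).
    return (1.0 + 2.0 * f_stat / dof2) ** (-dof2 / 2.0)
```

Under this assumed scaling the T²-based p value is always somewhat larger than the χ²-based one for the same distance, and the two converge as the number of samples grows, which is the direction of the effect the abstract describes.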

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
RD-PMA: A Rigid-Deformable Point-Matching Algorithm for Two-Dimensional Calibration of GC-IMS
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-17, DOI: 10.1002/cem.70106
Shujuan Liu, Di Zhou, Shilong Mu, Jian Jia, Xiaoguang Gao, Xiuli He

Gas chromatography-ion mobility spectrometry (GC-IMS) is a powerful analytical technique for characterizing volatile organic compounds (VOCs). However, variability in retention and drift times frequently complicates peak alignment and downstream analyses. To address this challenge, we introduce RD-PMA, a Rigid-Deformable Point-Matching Algorithm derived from and extending an established point-set registration framework. RD-PMA is specifically designed for GC-IMS data and incorporates dimension-dependent modeling, where rigid translation is applied in the drift-time dimension and nonlinear deformation is employed in the retention-time dimension. Robustness is further enhanced through an iterative outlier-removal procedure. Extensive evaluations across multiple GC-IMS datasets, including both homogeneous and heterogeneous conditions, as well as same-parameter and different-parameter scenarios, demonstrate that RD-PMA consistently outperforms conventional alignment approaches, such as internal reference-based shifting and other point-matching methods. The algorithm preserves structural consistency while delivering high alignment accuracy and resilience to noise and parameter variability. Collectively, these findings establish RD-PMA as a reliable and scalable solution for GC-IMS peak alignment, with broad applicability to large-scale VOC profiling in chemometrics, food science, and clinical research.
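
The dimension-dependent idea described above (a rigid shift along drift time, a nonlinear map along retention time, and iterative outlier removal) can be illustrated with a small sketch. This is not the RD-PMA algorithm, whose details the abstract does not give. It assumes peak pairs have already been matched, uses a median offset for the rigid part, and uses piecewise-linear interpolation as a stand-in for the nonlinear deformation. The tuple layout and function names are hypothetical.

```python
from statistics import median

# Each matched peak pair is (ref_rt, ref_dt, obs_rt, obs_dt); this layout is an assumption.


def drift_shift(pairs):
    """Rigid drift-time offset: median of reference minus observed drift times."""
    return median(ref_dt - obs_dt for _, ref_dt, _, obs_dt in pairs)


def remove_outliers(pairs, tol, max_iter=10):
    """Iteratively drop pairs whose drift residual after shifting exceeds tol."""
    kept = list(pairs)
    for _ in range(max_iter):
        shift = drift_shift(kept)
        inliers = [p for p in kept if abs(p[1] - (p[3] + shift)) <= tol]
        if not inliers or len(inliers) == len(kept):
            break  # nothing left to remove, or removal would empty the set
        kept = inliers
    return kept


def warp_retention(rt, pairs):
    """Piecewise-linear map from observed to reference retention time."""
    anchors = sorted((obs_rt, ref_rt) for ref_rt, _, obs_rt, _ in pairs)
    if rt <= anchors[0][0]:
        return rt + anchors[0][1] - anchors[0][0]
    if rt >= anchors[-1][0]:
        return rt + anchors[-1][1] - anchors[-1][0]
    for (x0, y0), (x1, y1) in zip(anchors, anchors[1:]):
        if x0 <= rt <= x1:
            return y0 + (y1 - y0) * (rt - x0) / (x1 - x0)
```

Treating the two axes differently keeps the drift-time correction to a single parameter while letting the retention-time correction bend between anchor peaks.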

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
A Set of Rules for Model Validation
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-17, DOI: 10.1002/cem.70110
José Camacho

The validation of a data-driven model is the process of assessing the model's ability to generalize to new, unseen data in the population of interest. This paper proposes a set of general rules for model validation. These rules are designed to help practitioners create reliable validation plans and report their results transparently. While no validation scheme is flawless, these rules can help practitioners ensure their strategy is sufficient for practical use, openly discuss any limitations of their validation strategy, and report clear, comparable performance metrics.

Journal of Chemometrics, vol. 40, no. 2. PDF: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70110
Citations: 0
Supporting Process Analytical Technology in Pharmaceutical Manufacturing With Lean Chemometrics
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-12, DOI: 10.1002/cem.70111
Adam J. Rish, Samuel R. Henson, Owen Rehrauer
Journal of Chemometrics, vol. 40, no. 2. PDF: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70111
Citations: 0
Multifactorial 1, 3, 4-Oxadiazole Derivatives as Cholinesterase and Glycogen Synthase Kinase-3β Inhibitors for Targeting Alzheimer's Disease: QSAR-Based Virtual Screening, MD Docking, Free Energy Analysis, ADMET, and DFT Studies
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-09, DOI: 10.1002/cem.70107
Nikita Chhabra, Balaji Wamanrao Matore, Anjali Murmu, Jagadish Singh, Partha Pratim Roy

Alzheimer's disease (AD) involves multiple pathogenic pathways, yet current therapeutic strategies remain largely symptomatic and focused on single molecular targets, underscoring a critical existing gap, emphasizing the need for rational multitarget drug design. Addressing this limitation, the present study employed an integrated in silico framework to identify multitarget 1,3,4-oxadiazole derivatives against acetylcholinesterase (AChE), butyrylcholinesterase (BChE), and glycogen synthase kinase-3β (GSK3β). A dataset of 273 reported compounds was used to develop robust QSAR models via Genetic Algorithm-Multiple Linear Regression (GA-MLR), exhibiting strong internal (R² = 0.638–0.758, Q²_LOO = 0.609–0.736) and satisfactory external predictivity (Q²_F1 − Q²_F2 = 0.566–0.800). Subsequent virtual screening of 3404 1,3,4-oxadiazoles from BindingDB identified 72 promising hits (IC₅₀ ≤ 300 nM), which underwent molecular dynamics (MD) docking. MD-based CDOCKER analysis highlighted compound 2851 with superior binding energies and stable key interactions compared to donepezil, supported by binding free energy (ΔG) calculations. ADMET evaluation indicated favorable pharmacokinetic and toxicity profiles, while density functional theory (DFT) analysis revealed enhanced reactivity and lower band gaps. This integrated computational workflow identified compound 2851 as a promising MT therapeutic agent for AD.
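
The internal validation statistics quoted above (the coefficient of determination and its leave-one-out cross-validated counterpart) can be illustrated on a toy problem. This is a minimal sketch, assuming a single descriptor and ordinary least squares rather than the GA-MLR models of the study. It also assumes the convention in which the leave-one-out statistic divides the predictive residual sum of squares by the total sum of squares about the overall mean; other conventions exist. Function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(y):
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def r_squared(x, y):
    """Fraction of variance explained when fitting on all samples."""
    slope, intercept = fit_line(x, y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - rss / total_sum_of_squares(y)


def q2_loo(x, y):
    """Leave-one-out statistic: each sample is predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(y)
```

For least-squares fits the leave-one-out value cannot exceed the fitted value, which is why the two are reported as a pair: a large gap between them signals overfitting.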

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
Auditory Analytics for Pattern Discovery in Protein Folding Dynamics
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-02-05, DOI: 10.1002/cem.70097
Carla Scaletti, Kurt J. Hebel, Martin Gruebele

We introduce Auditory Analytics, a methodological framework that utilizes data sonification for scientific discovery. Auditory Analytics describes a cycle of collecting and deriving datasets, mapping data to audible signals (sonification), analytical listening, hypothesis formulation, and tool building, where human insights from any stage of the cycle can feed back into further iterations of the cycle in the form of new datasets, alternative mappings, and new models of the original phenomenon. In Auditory Analytics, the remarkable capacity of the human auditory system to extract meaningful information from complex soundscapes across multiple timescales is repurposed for exploring, interpreting, and analyzing data. As an illustration of how Auditory Analytics can be used to uncover relationships and dynamics in physical systems, we describe an earlier study in which we applied this methodology to investigate state transition passages in a molecular dynamics simulation of a small protein. Auditory Analytics helped us identify distinct hydrogen-bonding patterns associated with different rates of transit between folded and unfolded states, leading to a deeper understanding of the process of protein folding. A single, isolated data mapping—whether visual, auditory, haptic, mathematical, or verbal—provides an incomplete picture of reality; by adding the Auditory Analytics cycle to our portfolio of data interpretation tools, we can build a more complete picture of physical phenomena.
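
The "mapping data to audible signals" step of the cycle can be illustrated with a simple parameter-mapping sonification. This is a minimal sketch, assuming each data point becomes a short sine tone whose pitch rises with the value. The abstract does not describe the authors' mappings, so the frequency range, note length, and function names here are illustrative choices only.

```python
import math


def value_to_frequency(value, lo, hi, f_min=220.0, f_max=880.0):
    """Map a value in [lo, hi] to a frequency on a logarithmic pitch scale."""
    if hi == lo:
        return math.sqrt(f_min * f_max)  # constant series: use the midpoint pitch
    t = (value - lo) / (hi - lo)
    return f_min * (f_max / f_min) ** t


def sonify(series, sample_rate=8000, note_seconds=0.1):
    """Return sine-wave samples in [-1, 1], one short tone per data point."""
    lo, hi = min(series), max(series)
    per_note = round(sample_rate * note_seconds)
    samples, phase = [], 0.0
    for value in series:
        step = 2.0 * math.pi * value_to_frequency(value, lo, hi) / sample_rate
        for _ in range(per_note):
            samples.append(math.sin(phase))
            phase += step  # carry the phase across notes to avoid clicks
    return samples
```

A logarithmic frequency scale is used because equal ratios of frequency are heard as equal pitch intervals, so equal steps in the data sound like equal steps in pitch.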

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
A Perspective on Using Immersive Analytics With Virtual Reality for One-Class Classification Decisions
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-01-29, DOI: 10.1002/cem.70109
Hyrum J. Redd, John H. Kalivas

Multisensory tools are beginning to reformat research and education in chemistry and many other fields. For example, translating infrared spectra into sound (sonification) can unveil molecular facts that the eye might miss. Tactile approaches are used with 3-D printed scientific data such as electrophoresis gels. These scientific advances are expanding the way we study data and are part of a broader area known as immersive analytics. While immersive analytics cover all the human senses for data immersion, this perspective focuses on data visualization in virtual reality (VR). By visualizing chemometric data information in VR, a human can use their extensively trained pattern recognition and problem-solving skills to make final analysis decisions. As an example, presented is a feasibility study for one-class classification with a focus on correcting samples machine learning (ML) identified as false positives (FPs) and false negatives (FNs) that typically reside in the gray zone (class fringe samples) to respective true negatives (TNs) and positives (TPs). Conversely, while not expected in a well-designed VR universe, it is possible that TP and TN prediction samples in the gray zone could be classified as respective FNs and FPs by decisions in VR. Results are presented for three datasets showing the feasibility of using VR for classification decisions. These datasets are clam contamination and two cancer detection situations. Some brief comments on the potential of using VR to identify local structure within a class are also provided using a quantitative structure–activity relationship (QSAR) dataset.
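
The notion of a gray zone of class-fringe samples that are handed to a human for a final decision can be sketched as a triage step. This is a minimal sketch, assuming a one-class model that outputs a distance-like score with a single decision threshold and a symmetric margin around it. The abstract does not define the gray zone numerically, so the margin rule and the names below are hypothetical.

```python
def triage(scores, threshold, margin):
    """Label each score 'in', 'out', or 'gray' (sent on for human review)."""
    labels = []
    for score in scores:
        if abs(score - threshold) <= margin:
            labels.append("gray")  # too close to the boundary to trust the model
        elif score < threshold:
            labels.append("in")    # clearly inside the target class
        else:
            labels.append("out")   # clearly outside the target class
    return labels
```

Only the samples labelled "gray" would need to be inspected interactively, which keeps the human effort focused on the cases where false positives and false negatives are most likely.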

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
Volatile Gas Detection Based on Electronic Nose Combined With a Feature Complementary Calculation Network to Identify the Adulterated Peanuts
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-01-27, DOI: 10.1002/cem.70103
Qiufen Wang, Tianhao Wang

Peanuts are an important food ingredient, and quality adulteration may pose health risks to consumers. Therefore, a rapid, accurate, and effective method for detecting peanut adulteration should be developed. This paper proposes a peanut adulteration identification method based on an electronic nose (e-nose) and a feature complementary calculation network (FCC-Net). First, volatile organic compound gas data of peanuts with different adulteration ratios are collected using an e-nose system. Then, leveraging the cross-sensitivity and temporal dynamic characteristics of the e-nose data, a feature complementary calculation module (FCCM) is proposed to extract deep gas features. Finally, based on the FCCM, a lightweight FCC-Net is designed to identify peanuts with varying adulteration levels. Experimental results demonstrate that FCC-Net outperforms classical lightweight deep learning models and state-of-the-art gas classification methods in terms of accuracy (97.33%), precision (96.92%), and recall (97.61%), while maintaining extremely low parameters (0.0102 M) and computational cost (0.3700 M). The combination of the e-nose system and FCC-Net provides an efficient and lightweight solution for peanut quality inspection.
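
The three reported figures of merit can be computed from predicted and true class labels as follows. This is a minimal sketch, assuming macro averaging over the adulteration classes. The abstract does not say whether macro, micro, or weighted averages were used, and the function name is illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for two label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for label in labels:
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        # A class that is never predicted (or never present) contributes zero.
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)
    return accuracy, sum(precisions) / len(labels), sum(recalls) / len(labels)
```

Macro averaging gives every adulteration level equal weight regardless of how many samples it contains, which matters when the classes are unbalanced.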

Journal of Chemometrics, vol. 40, no. 2
Citations: 0
Lean Chemometrics in Spectroscopic Process Analytical Technology
IF 2.1, Zone 4 (Chemistry), Q1 SOCIAL WORK, Pub Date: 2026-01-27, DOI: 10.1002/cem.70105
Adam J. Rish, Samuel R. Henson, Owen Rehrauer

The adoption of spectroscopy as a process analytical technology (PAT) modality in the pharmaceutical industry and related sectors has enabled advanced monitoring and control of manufacturing processes. Most applications of spectroscopic PAT instruments are dependent on chemometric multivariate data analysis (MVDA) methods to extract the relevant process data from the spectral measurements. However, calibrating and maintaining conventional MVDA methods is often burdensome, as it requires extensive time, material, and financial costs to generate the necessary representative samples and corresponding reference data. This calibration burden can be a barrier to the adoption of spectroscopic PAT in the pharmaceutical industry. Within this article, a classification of MVDA methods referred to as “lean chemometrics” is proposed and formalized. Lean chemometrics are time-saving, material-sparing, and cost-cutting MVDA methods that reduce the calibration burden relative to conventional chemometric methods of choice for spectroscopic PAT. Categories of various MVDA methods that are classifiable as lean chemometric techniques and practical considerations for integration of these techniques with PAT in common pharmaceutical PAT applications are discussed. The intention of lean chemometrics is to raise awareness of solutions that minimize the challenge of calibration burden toward improving spectroscopic PAT adoption.

在制药工业和相关部门采用光谱作为过程分析技术(PAT)模式,使制造过程的先进监测和控制成为可能。光谱PAT仪器的大多数应用依赖于化学计量多元数据分析(MVDA)方法从光谱测量中提取相关的过程数据。然而,校准和维护传统的MVDA方法往往是繁重的,因为它需要大量的时间、材料和财务成本来生成必要的代表性样本和相应的参考数据。这种校准负担可能成为制药行业采用光谱PAT的障碍。在本文中,被称为“精益化学计量学”的MVDA方法的分类被提出并形式化。精益化学计量学是节省时间、节省材料和降低成本的MVDA方法,相对于光谱PAT选择的传统化学计量方法,减少了校准负担。讨论了可归类为精益化学计量技术的各种MVDA方法的类别以及将这些技术与PAT集成在常见制药PAT应用中的实际考虑。精益化学计量学的目的是提高人们对解决方案的认识,以最大限度地减少校准负担的挑战,从而提高光谱PAT的采用。
Citations: 0
In-Situ Detection of Microplastic Particles on Food Using Hyperspectral Imaging With One-Dimensional Convolutional Neural Network and Artificial Neural Network
IF 2.1 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2026-01-21 DOI: 10.1002/cem.70088
Nikhita Sai Nayani, Ran Yang, Yue Sun, Lihong Yang, Lifeng Zhou, Yiming Feng

Hyperspectral imaging (HSI) has emerged as a promising technique for microplastic detection through analysis of reflectance variations across multiple wavelengths. Traditional approaches have focused primarily on isolated microplastic particles, requiring labor-intensive separation procedures impractical for routine monitoring. The challenge of detecting microplastics directly on food surfaces stems from spectral similarities between microplastics and food matrices, making differentiation difficult using conventional methods. Leveraging recent advances in machine learning, this study explores how artificial neural networks (ANN) and one-dimensional convolutional neural networks (1D-CNN) can identify subtle spectral differences to detect microplastic particles on seafood without isolation. We systematically evaluated model architectures, preprocessing techniques, and hyperparameter configurations to optimize detection performance using hyperspectral data from tilapia samples contaminated with polyethylene microspheres. Our findings demonstrate that 1D-CNN models trained on hyperspectral data without dimensionality reduction significantly outperform other approaches, achieving object-level detection F1 scores of 0.963 for 600-μm particles and 0.950 for 300-μm particles. This detection strategy represents a substantial improvement over traditional methods and highlights the potential of deep learning–based approaches for non-destructive, efficient microplastic detection in food safety applications.

Citations: 0