Chemometrics and Intelligent Laboratory Systems: latest articles

MacroPARAFAC for handling rowwise and cellwise outliers in incomplete multiway data
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-07-03 | DOI: 10.1016/j.chemolab.2024.105170
Mia Hubert, Mehdi Hirari

Multiway data extend two-way matrices into higher-dimensional tensors, often explored through dimensional reduction techniques. In this paper, we study the Parallel Factor Analysis (PARAFAC) model for handling multiway data, representing it more compactly through a concise set of loading matrices and scores. We assume that the data may be incomplete and could contain both rowwise and cellwise outliers, signifying cases that deviate from the majority and outlying cells dispersed throughout the data array. To address these challenges, we present a novel algorithm designed to robustly estimate both loadings and scores. Additionally, we introduce an enhanced outlier map to distinguish various patterns of outlying behavior. Through simulations and the analysis of fluorescence Excitation-Emission Matrix (EEM) data, we demonstrate the robustness of our approach. Our results underscore the effectiveness of diagnostic tools in identifying and interpreting unusual patterns within the data.
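
The trilinear structure behind a PARAFAC model can be sketched in a few lines. The block below is only an illustration of how a score matrix and two loading matrices rebuild a three-way array; it does not implement the robust MacroPARAFAC estimation described in the abstract, and the function name `parafac_reconstruct` and the small matrices `A`, `B`, `C` are hypothetical.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild a three-way array from PARAFAC factor matrices.

    A is I x F (scores), B is J x F and C is K x F (loadings), each given
    as a list of rows. The result is an I x J x K nested list with
    x[i][j][k] = sum over f of A[i][f] * B[j][f] * C[k][f].
    """
    n_components = len(A[0])
    return [
        [
            [
                sum(A[i][f] * B[j][f] * C[k][f] for f in range(n_components))
                for k in range(len(C))
            ]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical two-component example on a 2 x 2 x 2 array.
A = [[1.0, 0.0], [2.0, 1.0]]
B = [[1.0, 1.0], [0.0, 2.0]]
C = [[1.0, 3.0], [1.0, 0.0]]
X = parafac_reconstruct(A, B, C)
```

A fitting algorithm searches for the `A`, `B`, `C` that make this reconstruction close to the observed array; a robust method would additionally downweight or impute outlying rows and cells.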

Citations: 0
Diverse local calibration approaches for chemometric predictive analysis of large near-infrared spectroscopy (NIRS) multi-product datasets
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-07-01 | DOI: 10.1016/j.chemolab.2024.105173
Xueping Yang, Fuyu Yang, Matthieu Lesnoff, Paolo Berzaghi, Alessandro Ferragina

This study aimed to assess the predictive accuracy of Near-Infrared Spectroscopy (NIRS) across a large multi-product library, employing novel local calibration methodologies. Three local strategies were examined: the LOCAL Algorithm, Locally Weighted Regression based on k-nearest neighbor selection (kNN-LWPLSR), and a newly proposed algorithm called Hybrid Local. These strategies were applied to an extensive multi-product dataset. Compared with Global PLS models, all local strategies showed significant reductions in RMSEP values. In particular, kNN-LWPLSR predicted the constituents ADF and DM well. The newly proposed method (Hybrid Local) performs comparably to the LOCAL Algorithm while halving the prediction time, a significant advance for the practical implementation of NIRS technology in industrial processing scenarios.

Citations: 0
New breast cancer biomarkers from diffusion magnetic resonance imaging based on the Diffusion Tensor using multivariate curve resolution (MCR) models
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-06-26 | DOI: 10.1016/j.chemolab.2024.105171
C. Ortiz-Abellán, E. Aguado-Sarrió, J.M. Prats-Montalbán, J. Camps-Herrero, A. Ferrer

Currently, magnetic resonance imaging is the most sensitive imaging technique for detecting cancerous processes in early stages. As for breast cancer, due to the tubular structure of the tissue, being formed by ducts, anisotropic diffusion should be considered instead of the general isotropic diffusion. Anisotropic diffusion is studied by applying a technique called Diffusion Tensor Imaging (DTI), where the diffusion gradient is applied by changing the magnetic field in several spatial directions.

To date, the application of Multivariate Curve Resolution (MCR) models to diffusion sequences has demonstrated its ability to yield cancer biomarkers that are easy to interpret clinically in isotropic tissues, such as the prostate. So far, however, it has never been applied to anisotropic tissues, such as the breast.

Therefore, the main objective of this work is to obtain easy-to-interpret imaging biomarkers useful for early breast cancer diagnosis from diffusion magnetic resonance imaging based on the Diffusion Tensor using multivariate curve resolution (MCR) models. A classification model to identify healthy and tumor affected pixels is also proposed.

Citations: 0
Estimation of human bloodstains time since deposition using ATR-FTIR spectroscopy and chemometrics in simulated crime conditions
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-06-26 | DOI: 10.1016/j.chemolab.2024.105172
Miguel Mengual-Pujante, Antonio J. Perán, Antonio Ortiz, María Dolores Pérez-Cárceles

Blood in the form of stains is one of the most frequently encountered fluids at crime scenes. Estimating the time since deposition (TSD) is of great importance for guiding police investigations and clarifying criminal offences. The time elapsed since deposition is usually estimated by modelling the physicochemical degradation of blood biomolecules over time. This work presents an ATR-FTIR spectroscopy and chemometrics study to estimate the TSD of bloodstains on various surfaces and under different ambient conditions (indoor and outdoor). A total of 960 stains were analyzed over a period from 0 to 212 days. Most of the eleven partial least squares regression (PLSR) models obtained showed good predictive capacity, with a Residual Predictive Deviation (RPD) higher than 3 and R² higher than 0.90. Models for non-rigid supports showed better predictive capacity than those for rigid ones. A non-rigid surface model covering the various non-rigid surfaces and ambient conditions was developed, which might be the most useful one from the criminalistic point of view. These results show that this technique can be a rapid, robust, and reliable tool for in situ determination of the TSD of bloodstains at crime scenes.
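
The two figures of merit quoted above can be computed as follows. This is a minimal sketch that assumes one common convention, RPD taken as the sample standard deviation of the reference values divided by the RMSEP; the abstract does not state which variant the authors used, and the day values below are invented purely for illustration.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(y_true, y_pred):
    """Assumed definition: sample SD of the reference values over RMSEP."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented reference and predicted ages (days) for five hypothetical stains.
days_true = [0.0, 30.0, 60.0, 90.0, 120.0]
days_pred = [5.0, 25.0, 65.0, 85.0, 125.0]

error = rmsep(days_true, days_pred)
ratio = rpd(days_true, days_pred)
r2 = r_squared(days_true, days_pred)
```

Under this convention a model with RPD above 3 has a prediction error less than a third of the spread of the reference values.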

Citations: 0
Enhancing Hansen Solubility Predictions with Molecular and Graph-Based Approaches
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-06-21 | DOI: 10.1016/j.chemolab.2024.105168
Darja Cvetković, Marija Mitrović Dankulov, Aleksandar Bogojević, Saša Lazović, Darija Obradović

The fast and accurate prediction of Hansen solubility parameters (HSP) benefits many diverse fields such as pharmaceuticals, the food industry, and cosmetics. To estimate the individual HSP values (polar, dispersive, and hydrogen bonding components), we investigated the performance of Mordred descriptors in multiple linear regression (MLR) and XGBoost modeling. For HSP predictions, we also tested a graph-based molecular representation with graph neural network (GNN) modeling. To select the optimal models for final training and predictions, we used nested cross-validation and hyper-parameter optimization. The models with the best predictive performance were selected through internal (R²train, RMSE, MEPcv) and external (RMSEP, CCC, MEP, R²test, ar²m, Δr²m) validation metrics using ∼1200 compounds from the freely available database https://www.stevenabbott.co.uk. To confirm practical reliability, we examined the agreement between experimentally obtained HSP data from the literature for 93 compounds and the values predicted by the created models. GNN modeling showed the best predictive characteristics, with a coefficient of determination between experimental and predicted HSP values greater than 0.76 for polar and hydrogen bond forces and greater than 0.66 for dispersive forces. Interpreting the fundamental basis of Hansen solubility using the created MLR equations and XGBoost models, HSP values were found to be influenced by van der Waals volume characteristics, 2D matrix molecular representation, and polarity. We elaborated on the practical benefits of using the selected GNN method with Hansen's solubility sphere as an example. This is the first study to demonstrate the advantages of GNN in predicting individual HSP components, as well as the first to describe in detail their molecular basis using MLR and XGBoost modeling.
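
For readers unfamiliar with the solubility sphere mentioned above, the sketch below shows how the three predicted HSP components are typically used once they are available. It relies on the standard Hansen distance formula, with the dispersive term weighted by 4, and on the relative energy difference RED = Ra / R0; the parameter values and the interaction radius are invented for illustration and are not taken from the paper.

```python
import math


def hansen_distance(hsp_a, hsp_b):
    """Hansen distance Ra between two (dispersive, polar, h-bond) triples.

    Ra^2 = 4 * (dD_a - dD_b)^2 + (dP_a - dP_b)^2 + (dH_a - dH_b)^2
    """
    d_a, p_a, h_a = hsp_a
    d_b, p_b, h_b = hsp_b
    return math.sqrt(4.0 * (d_a - d_b) ** 2 + (p_a - p_b) ** 2 + (h_a - h_b) ** 2)


def relative_energy_difference(solute, solvent, radius):
    """RED below 1 places the solvent inside the solute's solubility sphere."""
    return hansen_distance(solute, solvent) / radius


# Invented values in MPa^0.5; radius is the solute's interaction radius R0.
solute = (18.0, 10.0, 7.0)
radius = 8.0
solvent_near = (16.0, 10.0, 7.0)
solvent_far = (15.0, 2.0, 1.0)

red_near = relative_energy_difference(solute, solvent_near, radius)
red_far = relative_energy_difference(solute, solvent_far, radius)
```

With these made-up numbers the first solvent falls inside the sphere and the second outside, which is the kind of screening decision that accurate per-component predictions make possible.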

Citations: 0
Lipid Quant 2.1: Open-source software for identification and quantification of lipids measured by lipid class separation QTOF high-resolution mass spectrometry methods
IF 3.7 | Zone 2, Chemistry | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-06-20 | DOI: 10.1016/j.chemolab.2024.105169
Michaela Chocholoušková, Gabriel Vivó-Truyols, Denise Wolrab, Robert Jirásko, Michela Antonelli, Ondřej Peterka, Zuzana Vaňková, Michal Holčapek

LipidQuant 2.1 is software written in Matlab, designed for the high-throughput processing of large lipidomic data sets measured by lipid class separation coupled with quadrupole time-of-flight (QTOF) high-resolution mass spectrometry (MS). The software identifies lipid species based on a defined mass accuracy. The main focus is on correct lipidomic quantitation using at least one internal standard per lipid class and on an automated procedure for the Type I and Type II isotopic corrections needed to determine accurate molar concentrations, which most existing software solutions do not offer. LipidQuant 2.1 offers three options for peak assignment, visualization of the isotopic pattern, and automated calculation of m/z for various adduct ions. The initial lipidomic database covers 31 lipid classes with more than 2900 lipid species that occur primarily in the human lipidome, but users have full flexibility to modify and extend the database according to their needs. All algorithms and the detailed user manual are provided. The reliability of LipidQuant 2.1 is demonstrated on a set of more than 250 biological samples measured by ultrahigh-performance supercritical fluid chromatography (UHPSFC) coupled with QTOF-MS.
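
Two of the steps named above, calculating m/z for adduct ions and matching peaks within a mass accuracy window, can be illustrated with a short sketch. It is written in Python rather than Matlab and is not LipidQuant code: the function names, the 5 ppm default tolerance, the approximate adduct mass shifts, and the example neutral mass are all assumptions made for illustration, and the isotopic corrections are not covered.

```python
# Approximate mass shifts (Da) for singly charged adducts; treat these as
# illustrative values and check them against a current mass table.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033826,
    "[M+Na]+": 22.989221,
    "[M-H]-": -1.007276,
}


def adduct_mz(neutral_mass, adduct):
    """Theoretical m/z of a singly charged adduct of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True when the observed peak lies inside the mass accuracy window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Illustrative neutral monoisotopic mass of a hypothetical lipid species.
neutral_mass = 759.5778
theoretical = adduct_mz(neutral_mass, "[M+H]+")

close_peak = within_tolerance(760.5870, theoretical)
distant_peak = within_tolerance(760.5950, theoretical)
```

Here the first observed peak is about 2.5 ppm from the theoretical value and is accepted, while the second is about 13 ppm away and is rejected.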

Citations: 0
Latent variable model inversion for intervals. Application to tolerance intervals in class-modelling situations, and specification limits in process control
IF 3.7 | Zone 2, Chemistry | Q1 Chemistry | Pub Date: 2024-06-18 | DOI: 10.1016/j.chemolab.2024.105166
M.S. Sánchez, M.C. Ortiz, S. Ruiz, O. Valencia, L.A. Sarabia

The paper deals with the inversion of intervals when a PLS (Partial Least Squares) model is used. Instead of discretizing the interval, it is proved that the region resulting from the inversion of a PLS model is a convex set bounded by two parallel hyperplanes, each corresponding to the direct inversion of one endpoint of the given interval.

When the domain of the input variables is a convex set, any feasible solution with predictions within the interval set in the response can be obtained as a convex combination of a point on each of the two hyperplanes. In this way, the new solutions preserve the internal structure of the input variables.

This methodology can be of interest in several domains where the response under study is defined in terms of an interval of admissible values, such as specifications for a product in an industrial process, or tolerance intervals for computing compliant class-models.

The inversion of the corresponding fitted model defines a region in the input space (predictor variables) whose predictions fall within the specified interval. Then, estimating and exploring this region will increase the information about the problem under study.
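
The geometric claim in this abstract can be checked numerically on a toy linear predictor. The sketch below assumes the fitted PLS model has been written as an intercept plus regression coefficients on the input variables; it works in the input space only and ignores any latent-space constraints the paper may impose. The coefficients, starting points, and interval are invented, and all function names are hypothetical.

```python
def predict(b0, b, x):
    """Linear prediction y_hat = b0 + sum_j b[j] * x[j]."""
    return b0 + sum(bj * xj for bj, xj in zip(b, x))


def point_on_hyperplane(b0, b, x_start, y_target):
    """Project x_start orthogonally onto {x : predict(b0, b, x) == y_target}."""
    step = (y_target - predict(b0, b, x_start)) / sum(bj * bj for bj in b)
    return [xj + step * bj for xj, bj in zip(x_start, b)]


def convex_combination(x_lo, x_hi, t):
    """Point (1 - t) * x_lo + t * x_hi for t in [0, 1]."""
    return [(1.0 - t) * a + t * c for a, c in zip(x_lo, x_hi)]


# Invented model and admissible response interval [y_lo, y_hi].
b0, b = 1.0, [2.0, -1.0]
y_lo, y_hi = 3.0, 5.0

# One point on each bounding hyperplane (direct inversion of each endpoint).
x_lo = point_on_hyperplane(b0, b, [0.0, 0.0], y_lo)
x_hi = point_on_hyperplane(b0, b, [1.0, 1.0], y_hi)

# Any convex combination of the two points predicts inside the interval.
x_mix = convex_combination(x_lo, x_hi, 0.25)
y_mix = predict(b0, b, x_mix)
```

Because the predictor is linear, the prediction at the mixed point is the same convex combination of the two endpoint values, so it cannot leave the interval.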

Citations: 0
PLS multi-step regressions in data paths
IF 3.7 | Zone 2, Chemistry | Q1 Chemistry | Pub Date: 2024-06-17 | DOI: 10.1016/j.chemolab.2024.105167
Agnar Höskuldsson

A procedure is presented that extends standard PLS regression to several data matrices in a path. The basic idea is to convert the path of data matrices into interconnected regressions. Forecasts by PLS are extended to multi-step forecasts for each data matrix in the path. We study how far we can make forecasts, i.e., how far we can ‘see’ in the path. It is shown how data paths are divided into parts, where multi-step forecasting can be carried out within each part. The principles of PLS are used to suggest criteria for estimation in the regressions. These methods can be used to supervise a complex path of industrial chemical/biological processes. It is shown how expanding and contracting paths, which are common in industrial processes, can be handled. These methods can also be used to analyze general path models. An example shows briefly how a Structural Equations Model (SEM) can be converted into a collection of sequential paths that can be analyzed by the present methods. The results suggest that conclusions drawn from SEM analysis may not always be reliable. The theory is applied to process data. It is shown how the analysis of each regression proceeds in a similar way as in PLS.

这里介绍的是一种将标准 PLS 回归扩展到路径中多个数据矩阵的程序。其基本思想是将数据矩阵路径转换为相互关联的回归。PLS 预测扩展为对路径中每个数据矩阵的多步预测。我们将研究我们能预测多远,即我们能在路径中 "看到 "多远。我们展示了如何将数据路径划分为若干部分,并在每个部分内进行多步预测。PLS 原理用于提出回归估计的标准。这些方法可用于监督工业化学/生物过程的复杂路径。图中展示了如何处理工业过程中常见的扩展和收缩路径。这些方法可用于对一般路径模型进行分析。举例简要说明了如何将结构方程模型(SEM)转换为顺序路径集合,并用现有方法进行分析。结果表明,SEM 分析得出的结论并不总是可靠的。该理论适用于过程数据。结果表明,我们如何以类似于 PLS 的方式对每个回归进行分析。
Citations: 0
Shift invariant soft trilinearity: Modelling shifts and shape changes in gas-chromatography coupled mass spectrometry
IF 3.9 Zone 2 (Chemistry) Q1 Chemistry Pub Date: 2024-06-08 DOI: 10.1016/j.chemolab.2024.105155
Paul-Albert Schneide, Neal B. Gallagher, Rasmus Bro
Citations: 0
Geographic authentication of Argentinian teas by combining one-class models and discriminant methods for modeling near infrared spectra
IF 3.9 Zone 2 (Chemistry) Q1 Chemistry Pub Date: 2024-06-06 DOI: 10.1016/j.chemolab.2024.105156
Diana C. Fechner, Ramón A. Martinez, Melisa J. Hidalgo, Adriano Araújo Gomes, Roberto G. Pellerano, Héctor C. Goicoechea

In this study, 110 tea samples from South American countries (Argentina, Brazil, and Paraguay) and Asian countries (India and China) were analyzed using near-infrared spectroscopy (NIRS) together with a two-step chemometric authentication strategy (class modeling techniques and discriminant analysis) to authenticate commercial teas from Argentina. In the first step, one-class models were built and validated to authenticate South American teas using preprocessed NIRS data. For this purpose, data-driven soft independent modeling of class analogy (DD-SIMCA) and one-class partial least squares (OC-PLS) were used. The DD-SIMCA model gave the best results, with a sensitivity of 93.10%, specificity of 100%, and efficiency of 95.00%. In the second step, a support vector machine (SVM) was used to build and validate a multiclass model to discriminate between tea samples from Argentina and neighboring countries of South America. The best model was the combination of nine variables selected by the fast correlation-based filter (FCBF) method, with an accuracy of 98.30%. Therefore, we conclude that the combination of NIRS and two-step chemometric tools can be used to authenticate the geographical origin of samples with high inter-class similarity.
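The figures reported for the DD-SIMCA model are ratios of classification counts, and the sketch below shows the arithmetic. The counts used (27 of 29 target samples accepted, 11 of 11 non-target samples rejected) are hypothetical: the abstract does not give them, and they were chosen only because they are consistent with the reported percentages. The abstract also does not define "efficiency", so the code computes two common candidates, the overall fraction of correct decisions and the geometric mean of sensitivity and specificity, without claiming which one the authors used.

```python
from math import sqrt


def one_class_metrics(tp, fn, tn, fp):
    """Figures of merit for a one-class model from confusion counts.

    tp: target samples accepted      fn: target samples rejected
    tn: non-target samples rejected  fp: non-target samples accepted
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Candidate 1: overall fraction of correct decisions.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Candidate 2: geometric mean of sensitivity and specificity.
        "geometric_efficiency": sqrt(sensitivity * specificity),
    }


# Hypothetical counts, NOT taken from the paper.
metrics = one_class_metrics(tp=27, fn=2, tn=11, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# sensitivity: 93.10%
# specificity: 100.00%
# accuracy: 95.00%
# geometric_efficiency: 96.49%
```

Under these assumed counts, the overall fraction of correct decisions matches the reported 95.00%, while the geometric mean does not. That is only suggestive, since other test-set compositions could give the same percentages.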

Citations: 0