Journal of Chemometrics: Latest Articles

Some Views on Multi-criteria Methods for Data Analysis
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-21 DOI: 10.1002/cem.3597
Henk A. L. Kiers, Marieke E. Timmerman

Many data analysis methods actually combine optimization of several criteria. In this paper, a framework is offered for categorizing such multi-criteria methods. In particular, it categorizes multiset and three-way analysis methods as well as penalized methods and combinations thereof. The framework aims to stimulate critical evaluation of methods and reflection on the purpose of methods and, by signaling gaps, to help the development of new data analysis methods.

Citations: 0
A Novel Automated System for Early Diabetic Retinopathy Detection and Severity Classification
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-19 DOI: 10.1002/cem.3593
Santoshkumar S Ainapur, Virupakshappa Patil

Diabetes is a common and serious global disease that damages blood vessels in the eye, leading to vision loss. Early and accurate diagnosis of this issue is crucial to reduce the risk of visual impairment. The typical deep learning (DL) methods for diabetic retinopathy (DR) grading are often time-consuming and give unsatisfactory detection performance due to inadequate representation of lesion features. To overcome these challenges, this research proposes a new automated mechanism for detecting and classifying DR, aiming to identify DR severities and different stages. To identify and capture feature characteristics from DR samples, a conjugated attention mechanism and vision transformer are utilized within a collective net model, which automatically generates feature maps for diagnosing DR. These extracted feature maps are then fused through the feature fusion function in a fused attention net model, which calculates attention weights to produce the most powerful feature map. Finally, the DR cases are identified and discriminated using the kernel extreme learning machine (KELM) model. For evaluating DR severity, our work utilizes four different benchmark datasets: APTOS 2019, MESSIDOR-2, DiaRetDB1 V2.1, and DIARETDB0. To eliminate data noise and unwanted variations, two preprocessing steps are carried out: contrast enhancement and illumination correction. The experimental results, evaluated using well-known indicators, demonstrate that the suggested method achieves an accuracy of 99.63%, higher than that of the other baseline methods. This research contributes to the development of powerful DR screening techniques that are less time-consuming and capable of automatically identifying DR severity levels at an early stage.

Citations: 0
Can Angle Measures Be Useful in MCR Analyses?
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-14 DOI: 10.1002/cem.3582
Klaus Neymeyr, Martina Beese, Hamid Abdollahi, Mathias Sawall

In MCR analyses, the similarity of pairs of spectra or concentration profiles can be measured in terms of the acute angle that is enclosed by the representing vectors. Acute angles between vectors can be generalized to pairs of subspaces. So-called canonical angles, also called principal angles, measure the mutual orientation of a pair of subspaces. This work discusses how angles and canonical angles can support multivariate curve resolution analyses. A canonical angle analysis (CAA) can help to detect changes of the chemical composition during a chemical reaction in a way comparable to, but different from, evolving factor analysis (EFA).
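
The angle measure mentioned in the abstract can be illustrated with a minimal sketch, assuming two spectra sampled on the same wavelength grid. The function name and the example vectors are hypothetical, and only the vector case is shown; canonical angles between subspaces would require a singular value decomposition and are not covered here.

```python
import math

def acute_angle_deg(u, v):
    """Acute angle (in degrees) between the lines spanned by vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # abs() folds obtuse angles into [0, 90]; min() guards against rounding above 1.
    cos_theta = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cos_theta))

# Hypothetical spectra on a common wavelength grid.
spectrum_a = [1.0, 2.0, 3.0, 2.0, 1.0]
spectrum_b = [2.0, 4.0, 6.0, 4.0, 2.0]  # same shape as spectrum_a, twice the intensity
spectrum_c = [3.0, 2.0, 1.0, 0.0, 0.0]  # different shape

angle_ab = acute_angle_deg(spectrum_a, spectrum_b)
angle_ac = acute_angle_deg(spectrum_a, spectrum_c)
print(f"a vs b: {angle_ab:.2f} deg, a vs c: {angle_ac:.2f} deg")
```

Because the angle ignores overall intensity, spectra that differ only by a scale factor give an angle of essentially zero, while spectra of different shape give a larger angle.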

Citations: 0
Flexible Trilinearity Alignment (FTA) and Shift Invariant Transformation (SIT) Constraints in Three-Way Multivariate Curve Resolution Data Analysis
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-08 DOI: 10.1002/cem.3581
Xin Zhang, Romà Tauler

In this work, two alternative ways of analyzing three-way data with multivariate curve resolution alternating least squares (MCR-ALS) using the trilinearity constraint are described and compared. Different synthetic datasets and experimental three-way datasets covering different scenarios are analyzed, and the results obtained are compared. The two new different ways of applying the trilinearity constraint are named flexible trilinearity alignment (FTA) and shift invariant transformation (SIT). The effects of noise in the application of both types of constraints are investigated in detail. Results show that both approaches are particularly adequate for those cases like in gas chromatography and especially in liquid chromatography where the elution profiles of the same chemical component in different chromatographic runs are not totally reproducible because they are time shifted, although they preserve their shape. When strong time shifts and co-elution occur, then the “standard” trilinear model does not work, and alternative approaches should be used, such as the MCR extended bilinear model to multiset (multirun) data, or the proposed relaxation of the trilinearity constraint in the FTA and SIT methods to capture the time drift changes produced in the elution profiles of the resolved components.

Citations: 0
A Novel One-Class Convolutional Autoencoder Combined With Excitation–Emission Matrix Fluorescence Spectroscopy for Authenticity Identification of Food
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-05 DOI: 10.1002/cem.3592
Xiaoqin Yan, Baoshuo Jia, Wanjun Long, Kun Huang, Tong Wang, Hailong Wu, Ruqin Yu

In this work, a novel one-class classification algorithm, the one-class convolutional autoencoder (OC-CAE), was proposed for the detection of abnormal samples in excitation–emission matrix (EEM) fluorescence spectra datasets. The OC-CAE used a boxplot to analyze the reconstruction errors and the LOF algorithm to handle features extracted by the hidden layer of the convolutional autoencoder (CAE). The fused information provides the basis for more accurate pattern recognition, ensures flexibility in model training, and can yield higher model specificity, which is important in the field of food quality control. To demonstrate the reliability and advantages of OC-CAE, two EEM cases related to the authentication of food were studied: the Zhenjiang aromatic vinegar (ZAV) case and the camellia oil (CAO) case. The results showed that OC-CAE identified all abnormal samples in the two cases, reflecting excellent performance in the detection of abnormal samples, and that it, coupled with EEM, would be an effective tool for the authenticity identification of food.
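
As a rough illustration of the boxplot step, the sketch below flags samples whose reconstruction error exceeds an upper fence computed from the errors of authentic training samples. The fence rule (Q3 + 1.5 × IQR), the numbers, and the function names are assumptions for illustration only; the abstract does not specify them, and the LOF step on hidden-layer features is not shown.

```python
import statistics

def upper_fence(train_errors, k=1.5):
    """Upper boxplot fence Q3 + k * IQR from training reconstruction errors."""
    q1, _, q3 = statistics.quantiles(train_errors, n=4)
    return q3 + k * (q3 - q1)

def flag_abnormal(errors, fence):
    """True for every sample whose reconstruction error lies above the fence."""
    return [e > fence for e in errors]

# Hypothetical reconstruction errors of authentic (training) samples.
train_errors = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11, 0.12]
fence = upper_fence(train_errors)

# Hypothetical new samples: the second one reconstructs poorly.
new_errors = [0.11, 0.55]
flags = flag_abnormal(new_errors, fence)
print(flags)
```

The idea is that an autoencoder trained only on authentic samples tends to reconstruct them well, so an unusually large reconstruction error is a hint that a sample does not belong to the target class.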

Citations: 0
Robust Multiplicative Scatter Correction Using Quantile Regression
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-05 DOI: 10.1002/cem.3589
Bahram Hemmateenejad, Nabiollah Mobaraki, Knut Baumann

A robust method for multiplicative scatter correction (MSC) in infrared spectroscopy is presented. Using quantile regression, the outlier wavelengths (concentration-dependent wavelengths) that are irrelevant to the regression are identified and therefore excluded from the regression model. This new MSC method, which can be implemented in its simple or extended form, is much simpler than recently proposed methods and has only one hyperparameter (the quantile value) to be adjusted. A scoring function based on residual analysis can automatically determine the appropriate quantile value. The method is first explained using simulated data sets and then validated by analysing some experimental data sets. It was found that the new method can perform well in the presence of strong outlying variables. On the other hand, when the data sets are not associated with outlying wavelengths, the method behaves similarly to the conventional MSC method.
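
For orientation, here is a minimal sketch of the conventional (least-squares) MSC that the abstract uses as its baseline, assuming the mean spectrum serves as the reference. It is not the quantile-regression variant proposed in the paper; the function name and the example data are hypothetical.

```python
def msc(spectra):
    """Conventional MSC: fit each spectrum as offset + slope * reference
    by ordinary least squares, then return (spectrum - offset) / slope."""
    n_wl = len(spectra[0])
    # Reference spectrum: mean over all spectra at each wavelength.
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_wl)]
    ref_mean = sum(ref) / n_wl
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_wl
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in s])
    return corrected

# Hypothetical spectra: one underlying shape with different additive and
# multiplicative scatter effects.
base = [1.0, 2.0, 4.0, 2.0, 1.0]
spectra = [
    base,
    [0.5 + 2.0 * b for b in base],
    [-0.2 + 0.5 * b for b in base],
]
corrected = msc(spectra)
```

In this idealized example every spectrum is an exact affine function of the reference, so all corrected spectra coincide. Wavelengths that carry concentration information break that assumption, which is the situation a robust regression is meant to handle.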

Citations: 0
Adjusted Pareto Scaling for Multivariate Calibration Models
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-03 DOI: 10.1002/cem.3588
Kurt Varmuza, Peter Filzmoser

The performance of multivariate calibration models ŷ = f(x) for the prediction of a numerical property y from a set of x-variables depends on the type of scaling of the x-variables. Common scaling methods are autoscaling (dividing the centered x by its standard deviation s) and Pareto scaling (dividing the centered x by s^P with P = 0.5). The adjusted Pareto scaling presented here varies the exponent P between 0 (no scaling) and 1 (autoscaling) with the aim of obtaining an optimum prediction performance for ŷ. Related scaling methods based on the variable spread are range scaling and vast scaling, while level scaling is based on the location (central value) of the variable. These scaling methods and robust versions are compared for models created by partial least-squares (PLS) regression. The applied strategy, repeated double cross validation (rdCV), evaluates the model performance for test set objects and considers its variability. Results with three data sets from chemistry show: (a) the efficacy of the different scaling methods depends on the data structure; (b) optimization of the Pareto exponent P is recommended; (c) range scaling or vast scaling may be better than adjusted Pareto scaling; (d) in general a heuristic search for the best scaling method is advisable. Overall, the consideration of different variants of scaling allows for a flexible adjustment of the variable contributions to the calibration model.
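
The scaling family described in the abstract can be sketched for a single x-variable as follows. The function name and the example values are hypothetical, and the use of the sample standard deviation is an assumption; the search for the best exponent P by rdCV is not shown.

```python
import statistics

def pareto_scale(column, p=0.5):
    """Center a variable and divide by s**p, where s is its standard deviation.
    p = 0: centering only; p = 0.5: Pareto scaling; p = 1: autoscaling."""
    mean = statistics.fmean(column)
    s = statistics.stdev(column)  # sample standard deviation (assumed here)
    return [(x - mean) / s ** p for x in column]

# Hypothetical x-variable measured on four objects.
x = [2.0, 4.0, 6.0, 8.0]

centered = pareto_scale(x, p=0.0)    # no scaling, only centering
pareto = pareto_scale(x, p=0.5)      # classical Pareto scaling
autoscaled = pareto_scale(x, p=1.0)  # unit standard deviation
```

After scaling with exponent P, the variable has standard deviation s^(1-P), so intermediate exponents shrink high-variance variables without forcing all variables to equal spread.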

Citations: 0
Investigation of the Physiological and Post-training Effects of Ecdysteroid Supplementation by Multivariate Analysis of the Human Serum Metabolome
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-02 DOI: 10.1002/cem.3594
Patrizia Leogrande, Daniel Jardines, Dayamin Martinez Brito, Xavier de la Torre, Francesco Botrè, Andreas Luch, Patrick Diel, Maria Kristina Parr

This work aims to characterize the serum profile of athletes after the administration of ecdysteroids, natural steroid hormones recently reported to enhance athletic performance. The combination of mass spectrometry and chemometric tools may make it possible to differentiate physiological effects from post-training and intake-driven effects. Serum samples were collected from 46 healthy male volunteers, who were divided into four groups: control (two capsules/day of Peak Ecdysone without training), placebo (two capsules without ecdysteroids with training), Ec1 (two capsules/day Peak Ecdysone with training), and Ec2 (eight capsules/day Peak Ecdysone with training). Metabolic profiling was performed using a SCIEX Triple Quadrupole LC-MS/MS system coupled with the Biocrates AbsoluteIDQ p180 kit, which allows quantitation of a large panel of metabolites that were subjected to multivariate analysis. Unsupervised analysis of the data found no significant differences between the placebo and the ecdysteroid supplementation groups. By merging Ec1 and Ec2 into a single group, coded as treated, a clear discrimination between the control and placebo groups was observed. Phosphatidylcholines were among the most significant features of ecdysteroid administration, showing a dose-dependent effect in the Ec1 and Ec2 groups. As specific metabolic phenotypes can result from years of training, the discrimination of physiological effects from those caused by the administration of banned substances can be a valuable analytical strategy for the interpretation of adverse analytical findings in the anti-doping field.

Citations: 0
The MCR-ALS Trilinearity Constraint for Data With Missing Values
IF 2.3 Zone 4, Chemistry Q1 SOCIAL WORK Pub Date: 2024-08-02 DOI: 10.1002/cem.3584
Adrián Gómez-Sánchez, Raffaele Vitale, Pablo Loza-Alvarez, Cyril Ruckebusch, Anna de Juan

Trilinearity is a property of some chemical data that leads to unique decompositions when curve resolution or multiway decomposition methods are used. Curve resolution algorithms, such as Multivariate Curve Resolution–Alternating Least Squares (MCR-ALS), can provide trilinear models by implementing the trilinearity condition as a constraint. However, some trilinear analytical measurements, such as excitation–emission matrix (EEM) measurements, usually exhibit systematic patterns of missing data due to the nature of the technique, which imply a challenge to the classical implementation of the trilinearity constraint. In this instance, extrapolation or imputation methodologies may not provide optimal results. Recently, a novel algorithmic strategy to constrain trilinearity in MCR-ALS in the presence of missing data was developed. This strategy relies on the sequential imposition of a classical trilinearity restriction on different submatrices of the original investigated dataset, but, although effective, was found to be particularly slow and requires a proper submatrix selection criterion. In this paper, a much simpler implementation of the trilinearity constraint in MCR-ALS capable of handling systematic patterns of missing data and based on the principles of the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm is proposed. This novel approach preserves the trilinearity of the retrieved component profiles without requiring data imputation or subset selection steps and, as with all other constraints designed for MCR-ALS, offers the flexibility to be applied component-wise or data block-wise, providing hybrid bilinear/trilinear models. Furthermore, it can be easily extended to cope with any trilinear or higher-order dataset with whatever pattern of missing values.
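
To make the NIPALS idea concrete, the sketch below fits a rank-one model to a small matrix while skipping missing entries. It assumes the usual way of imposing trilinearity on one component (folding its profile so that each column belongs to one data block and replacing the result by a rank-one fit); that framing, the function name, the `None` marker for missing values, and the data are assumptions for illustration, not the authors' implementation.

```python
def rank_one_nipals(X, n_iter=100):
    """Rank-one fit t * p^T of X by alternating regressions that use only
    the observed entries (missing entries are marked with None).
    Assumes every row and every column has at least one observed entry."""
    n_rows, n_cols = len(X), len(X[0])
    p = [1.0] * n_cols
    t = [0.0] * n_rows
    for _ in range(n_iter):
        # Update each score from the observed entries of its row.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if X[i][j] is not None]
            t[i] = sum(X[i][j] * p[j] for j in obs) / sum(p[j] ** 2 for j in obs)
        # Update each loading from the observed entries of its column.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if X[i][j] is not None]
            p[j] = sum(X[i][j] * t[i] for i in obs) / sum(t[i] ** 2 for i in obs)
    return [[ti * pj for pj in p] for ti in t]

# Hypothetical folded profile: one common shape, scaled differently per block.
shape = [1.0, 3.0, 2.0]
scales = [1.0, 2.0, 0.5]
X = [[s * c for c in scales] for s in shape]
X[0][1] = None  # one missing measurement

fitted = rank_one_nipals(X)
print(round(fitted[0][1], 6))
```

Because the fit never touches the missing cell, no imputation is needed, and the rank-one structure of the observed entries determines the value at the missing position.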

Citations: 0
X-Ray Computed Tomography Meets Robust Chemometric Latent Space Modeling for Lean Meat Percentage Prediction in Pig Carcasses
IF 2.3 Zone 4 Chemistry Q1 SOCIAL WORK Pub Date: 2024-07-31 DOI: 10.1002/cem.3591
Puneet Mishra, Maria Font-i-Furnols

This study presents a case of processing X-ray computed tomography (CT) data for pork scans using chemometric latent space modeling. The distribution of voxel intensities is shown to exemplify a multivariate, multi-collinear signal mixture. While this concept is not novel, it is revisited here from a chemometric perspective. To extract meaningful information from such multivariate signals, latent space modeling based on partial least squares (PLS) is an ideal solution. Furthermore, a robust PLS approach is even more effective for latent space modeling, as it can extract latent spaces unaffected by outliers, thereby enhancing predictive modeling. As an example, lean meat percentage is predicted using X-ray CT data and robust PLS regression. This method is applicable to X-ray CT quantification analysis, particularly in cases where unclear, erroneous, and outlying observations are suspected in the data.
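
The workflow described in the abstract (voxel-intensity distribution as a multivariate signal, followed by a PLS model that resists outliers) can be sketched as follows. The abstract does not say which robust PLS variant was used, so this example swaps in a simple stand-in: PLS1 fitted by NIPALS and iteratively reweighted with Huber-type weights on the residuals. The function names `voxel_histogram`, `pls1_weighted` and `robust_pls1`, the tuning constant `c=2.5`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def voxel_histogram(volume, bins, value_range):
    """Relative frequency of voxel intensities: one multivariate signal per scan."""
    counts, _ = np.histogram(np.ravel(volume), bins=bins, range=value_range)
    return counts / max(counts.sum(), 1)

def pls1_weighted(X, y, w, n_comp):
    """PLS1 (NIPALS) with observation weights w; returns (coef, intercept)."""
    w = w / w.sum()
    x_mean, y_mean = w @ X, w @ y               # weighted centering
    sw = np.sqrt(w)
    Xc = (X - x_mean) * sw[:, None]             # sqrt-weight scaling of rows
    yc = (y - y_mean) * sw
    W, P, q = [], [], []
    for _ in range(n_comp):
        wv = Xc.T @ yc
        wv = wv / np.linalg.norm(wv)            # weight vector of this component
        t = Xc @ wv                             # scores
        tt = t @ t
        p = Xc.T @ t / tt                       # X loadings
        qk = yc @ t / tt                        # y loading
        Xc = Xc - np.outer(t, p)                # deflation
        yc = yc - qk * t
        W.append(wv); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def robust_pls1(X, y, n_comp=2, n_iter=20, c=2.5):
    """Iteratively reweighted PLS1 with Huber weights (illustrative stand-in)."""
    w = np.ones(len(y))
    for _ in range(n_iter):
        coef, b0 = pls1_weighted(X, y, w, n_comp)
        r = y - (X @ coef + b0)
        # Robust residual scale from the median absolute deviation.
        scale = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        u = np.abs(r) / scale
        w = np.where(u <= c, 1.0, c / np.maximum(u, c))  # downweight large residuals
    return coef, b0, w

# Synthetic demonstration: five responses are grossly wrong.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + 3.0 + 0.01 * rng.normal(size=60)
y[:5] += 20.0
coef, intercept, weights = robust_pls1(X, y, n_comp=3)
```

With real scans, each row of `X` would be the output of `voxel_histogram` for one carcass and `y` the reference lean meat percentage. Observations that end up with small weights are candidates for the unclear or erroneous cases the abstract mentions.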

Citations: 0