Pub Date : 2024-07-16 DOI: 10.1016/j.chemolab.2024.105178
Zhenghui Feng , Hanli Jiang , Ruiqi Lin , Wanying Mu
With the advancement of data science and technology, the complexity and diversity of data have increased. Challenges arise when the number of variables exceeds the sample size or when strong correlations among variables cause multicollinearity. In this paper, we propose a moving window sparse partial least squares method that combines the sliding interval technique with sparse partial least squares. The method uses sliding interval partial least squares regression to identify the optimal interval and sparse partial least squares for variable selection, extending traditional partial least squares (PLS) approaches. Monte Carlo simulations demonstrate its performance in variable selection and model prediction. We apply the method to seawater spectral data to predict the concentration of chemical oxygen demand. The results show that the method not only selects reasonable spectral wavelength intervals but also enhances predictive performance.
{"title":"Moving window sparse partial least squares method and its application in spectral data","authors":"Zhenghui Feng , Hanli Jiang , Ruiqi Lin , Wanying Mu","doi":"10.1016/j.chemolab.2024.105178","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252","pages":"Article 105178"},"publicationDate":"2024-07-16","publicationTypes":"Journal Article"}
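The interval-search stage described in the abstract above can be illustrated with a minimal sketch. It assumes a plain NIPALS PLS1 fit and a fixed-width window scored by cross-validated RMSE; the function names, the fold scheme, and the window width are hypothetical choices for illustration, and the sparse PLS step of the paper is not reproduced here.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # weight vector (covariance with y)
        norm = np.linalg.norm(w)
        if norm < 1e-12:                   # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = (yk @ t) / tt                  # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # B = W (P'W)^-1 q
    return x_mean, y_mean, coef

def cv_rmse(X, y, n_components, n_folds=5):
    """Cross-validated RMSE with folds assigned by sample index modulo n_folds."""
    idx = np.arange(len(y))
    sq_err = 0.0
    for f in range(n_folds):
        test = idx % n_folds == f
        x_mean, y_mean, coef = pls1_fit(X[~test], y[~test], n_components)
        pred = (X[test] - x_mean) @ coef + y_mean
        sq_err += np.sum((y[test] - pred) ** 2)
    return np.sqrt(sq_err / len(y))

def moving_window_search(X, y, width, n_components=3):
    """Slide a window over the variables; return (best_start, scores)."""
    n_comp = min(n_components, width)
    scores = [cv_rmse(X[:, s:s + width], y, n_comp)
              for s in range(X.shape[1] - width + 1)]
    return int(np.argmin(scores)), scores
```

In a full pipeline, a sparse variable-selection step would then be run on the columns of the selected window; how the paper couples the two stages is not specified in the abstract.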
Pub Date : 2024-07-15 DOI: 10.1016/j.chemolab.2024.105179
Aline Emmer Ferreira Furman , Alexandre de Fátima Cobre , Dile Pontarolo Stremel , Roberto Pontarolo
Diabetes and dyslipidemia are well-established risk factors for cardiovascular disease, which is the primary cause of death both in Brazil and globally. Fourier-transform mid-infrared spectroscopy (FTIR-MIR) generates spectral fingerprints of biomolecules, allowing for correlation with metabolic changes, while remaining a rapid, non-invasive, and non-destructive method. The study provided a proof of concept for the effectiveness of FTIR-MIR in screening diabetes, pre-diabetes, hypercholesterolemia, hypertriglyceridemia, and mixed dyslipidemia in blood serum. After acquiring mid-infrared spectra of 60 human serum samples, both unsupervised and supervised analysis models were developed. Principal component analysis (PCA) was used for pattern recognition and to determine how closely related the samples were based on their spectral profiles. The supervised models, built by multivariate analysis of the FTIR-MIR spectra, clearly discriminated both diabetic and dyslipidemic samples from healthy subjects. Accuracy rates of more than 90 % were achieved for diabetes and dyslipidemia diagnosis with PLS-DA. Dyslipidemia type discrimination could be attributed mainly to the Amide I region [1720-1600 cm−1, ν(C=O)] and altered lipid concentration in the 3000-2800 cm−1 region, whereas the discrimination of diabetes and prediabetes was primarily due to altered protein conformation in the Amide I [1720-1600 cm−1, ν(C=O)] and Amide II [1570-1480 cm−1, δ(N–H) + ν(CH)] ranges.
{"title":"A new and fast method for diabetes and dyslipidemia diagnosis using FTIR-MIR, spectroscopy and multivariate data analysis: A proof of concept","authors":"Aline Emmer Ferreira Furman , Alexandre de Fátima Cobre , Dile Pontarolo Stremel , Roberto Pontarolo","doi":"10.1016/j.chemolab.2024.105179","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252","pages":"Article 105179"},"publicationDate":"2024-07-15","publicationTypes":"Journal Article"}
Pub Date : 2024-07-14 DOI: 10.1016/j.chemolab.2024.105175
Jun Tian , Ming Li , Zhiyi Tan , Meng Lei , Lin Ke , Liang Zou
The rapid and non-destructive measurement of coal moisture content is essential in the coal industry for production, transportation and utilization purposes. Existing measurement methods still have drawbacks, such as being time-consuming, destroying samples and yielding unstable outcomes. To address these issues, this paper explored the use of the broadband microwave spectrum for intelligent coal moisture measurement. A multi-type outlier detection method based on the Monte-Carlo cross-validation (MCCV) strategy was used to prevent masking effects in the microwave spectra. To effectively extract microwave spectral features and establish correlations with coal moisture, a novel neural network model, UC-PLSR, is proposed by combining U-Net, the Convolutional Block Attention Module (CBAM) and the Partial Least Squares Regression (PLSR) algorithm. Furthermore, a design scheme/case of a microwave measurement device for coal moisture was presented, offering guidance for the development of rapid coal moisture measurement instruments or on-site measurement systems. Experimental results demonstrated that the proposed model outperformed traditional chemometrics methods, achieving superior prediction accuracy and generalization capability with R² = 0.8756, MAE = 1.2523 and RMSE = 1.6560.
{"title":"Intelligent non-destructive measurement of coal moisture via microwave spectroscopy and chemometrics","authors":"Jun Tian , Ming Li , Zhiyi Tan , Meng Lei , Lin Ke , Liang Zou","doi":"10.1016/j.chemolab.2024.105175","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252","pages":"Article 105175"},"publicationDate":"2024-07-14","publicationTypes":"Journal Article"}
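For readers comparing the reported figures, the three metrics quoted in the abstract above (R², MAE, RMSE) can be computed as in the following minimal sketch. It uses the standard textbook definitions; the function name and the toy values are illustrative assumptions, not data from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of numbers."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n          # mean absolute error
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                      # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                        # coefficient of determination
    return r2, mae, rmse

# Toy example with made-up values
r2, mae, rmse = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

MAE and RMSE are in the units of the measured quantity, while R² is dimensionless, so the three are usually reported together.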
Ion mobility spectrometry (IMS) is a promising analytical technique for mass spectrometry (MS)-based compound identification because it provides collision cross-section (CCS) values as an additional dimension with structural information. Here, GraphCCS was proposed to accurately predict CCS values and expand the coverage of CCS libraries. A new adduct encoding method was proposed to encode SMILES strings and adduct types of compounds into adduct graphs. GraphCCS extended its predictive capability to ten different adduct types. A very deep graph convolutional network with up to 40 GCN layers was built to predict CCS values from adduct graphs. A curated dataset with 12,775 experimental CCS values was used to train, validate, and test the GraphCCS model. The resulting CCS predictions achieved a median relative error (MedRE) of 0.94 % and a coefficient of determination (R²) of 0.994 on the test set. Results on external test sets showed that GraphCCS outperformed AllCCS2, CCSbase, SigmaCCS, and DeepCCS. Based on the developed GraphCCS method, a large-scale in-silico database was built, including 2,394,468 CCS values. Those CCS values can be used, as a complement to retention times and tandem mass spectra, to filter false positives. Finally, the effectiveness of GraphCCS in assisting compound identification was tested on a mouse adrenal gland lipid dataset with 1,960 lipids. The results demonstrated that the in-silico CCS values combined with MS spectra and retention times can efficiently filter the false positive candidates.
{"title":"Large-scale prediction of collision cross-section with very deep graph convolutional network for small molecule identification","authors":"Ting Xie, Qiong Yang, Jinyu Sun, Hailiang Zhang, Yue Wang, Zhimin Zhang, Hongmei Lu","doi":"10.1016/j.chemolab.2024.105177","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252","pages":"Article 105177"},"publicationDate":"2024-07-09","publicationTypes":"Journal Article"}
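The median relative error (MedRE) quoted in the abstract above is commonly defined as the median of |predicted − experimental| / experimental, expressed in percent. The sketch below assumes that definition; the function name and the example CCS values are hypothetical.

```python
from statistics import median

def median_relative_error(experimental, predicted):
    """Median of per-compound relative errors, in percent."""
    rel_errors = [abs(p - e) / e * 100.0
                  for e, p in zip(experimental, predicted)]
    return median(rel_errors)

# Made-up CCS values (in Å²) for three compounds
medre = median_relative_error([100.0, 200.0, 400.0], [101.0, 198.0, 400.0])
```

Because it is a median, this statistic is insensitive to a small number of badly predicted compounds, which is why it is often reported alongside R².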
Pub Date : 2024-07-09 DOI: 10.1016/j.chemolab.2024.105176
Ali R. Jalalvand , Sara Chamandoost , Soheila Mohammadi , Cyrus Jalili , Sajad Fakhri
In this work, a novel biosensing platform was fabricated by modifying a rotating glassy carbon electrode (GCE) with a chitosan-ionic liquid (Ch-IL) composite film, electrochemically synthesizing gold-platinum-palladium trimetallic alloy nanoparticles (AuPtPd NPs) on its surface, and electrosynthesizing dual-template molecularly imprinted polymers (MIPs) in which morphine (MO) and codeine (COD) were used as template molecules. The AuPtPd NPs were synthesized under different electrochemical conditions, the electrode surfaces were investigated by digital image processing, and the best electrode was chosen. The effects of experimental variables on the response of the biosensor to MO and COD were optimized by a central composite design (CCD). Under optimized conditions (concentration of the phosphate buffered solution (PBS): 0.09 M, pH of the PBS: 3.21–3.2, time of immersion: 204.8–205 s, and rotation rate: 2993.51–3000 rpm), the biosensor responses to MO and COD were individually calibrated (1–20 pM for MO and 0.5–12 pM for COD), three-way calibrated by PARASIAS, PARAFAC2, and MCR-ALS, and validated in the presence of ascorbic acid and uric acid as uncalibrated interferents. Finally, the performance of the biosensor in the simultaneous determination of MO and COD in human serum samples, in the presence of ascorbic acid and uric acid as uncalibrated interferents, was verified and compared with the results of HPLC-UV as the reference method, which confirmed its reliability.
{"title":"Engagement of computerized and electrochemical methods to develop a novel and intelligent electronic device for detection of heroin abuse","authors":"Ali R. Jalalvand , Sara Chamandoost , Soheila Mohammadi , Cyrus Jalili , Sajad Fakhri","doi":"10.1016/j.chemolab.2024.105176","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"251","pages":"Article 105176"},"publicationDate":"2024-07-09","publicationTypes":"Journal Article"}
Characterizing sample composition and visualizing the distribution of its chemical compounds is a prominent topic in various research and applied fields. Integrating spatial and spectral information, hyperspectral imaging (HSI) plays a pivotal role in this pursuit. While self-modelling curve resolution techniques, like multivariate curve resolution - alternating least squares (MCR-ALS), and clustering methods, such as K-means, are widely used for HSI data analysis, their effectiveness in complex scenarios, where the structure of the data deviates from the models’ assumptions, deserves further investigation. The choice of a data analysis method is most often driven by the research question at hand and prior knowledge of the sample. However, overlooking the structure of the investigated data, i.e. linearity, geometry, homogeneity, might lead to erroneous or biased results. Here, we propose an exploratory data analysis approach, based on the geometry of the data point cloud, to investigate the structure of HSI datasets and extract their main characteristics, providing insight into the results obtained by the above-mentioned methods. We employ the principle of essential information to extract archetype (most linearly dissimilar) spectra and archetype single-wavelength images. These spectra and images are then discussed and contrasted with MCR-ALS and K-means clustering results. Two datasets with varying characteristics and complexities were investigated: a powder mixture analyzed with Raman spectroscopy and a mineral sample analyzed with Laser Induced Breakdown Spectroscopy (LIBS). We show that the proposed approach makes it possible to summarize the main characteristics of hyperspectral imaging data and provides a more accurate understanding of the results obtained by traditional data modelling methods, guiding the choice of the most suitable one.
{"title":"Exploratory analysis of hyperspectral imaging data","authors":"Alessandra Olarini , Marina Cocchi , Vincent Motto-Ros , Ludovic Duponchel , Cyril Ruckebusch","doi":"10.1016/j.chemolab.2024.105174","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"252","pages":"Article 105174"},"publicationDate":"2024-07-09","publicationTypes":"Journal Article","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S016974392400114X/pdfft?md5=fc1e3ebcd612aa27333c2ec8738aca2e&pid=1-s2.0-S016974392400114X-main.pdf"}
Pub Date : 2024-07-03 DOI: 10.1016/j.chemolab.2024.105170
Mia Hubert, Mehdi Hirari
Multiway data extend two-way matrices into higher-dimensional tensors, often explored through dimensional reduction techniques. In this paper, we study the Parallel Factor Analysis (PARAFAC) model for handling multiway data, representing it more compactly through a concise set of loading matrices and scores. We assume that the data may be incomplete and could contain both rowwise and cellwise outliers, signifying cases that deviate from the majority and outlying cells dispersed throughout the data array. To address these challenges, we present a novel algorithm designed to robustly estimate both loadings and scores. Additionally, we introduce an enhanced outlier map to distinguish various patterns of outlying behavior. Through simulations and the analysis of fluorescence Excitation-Emission Matrix (EEM) data, we demonstrate the robustness of our approach. Our results underscore the effectiveness of diagnostic tools in identifying and interpreting unusual patterns within the data.
{"title":"MacroPARAFAC for handling rowwise and cellwise outliers in incomplete multiway data","authors":"Mia Hubert, Mehdi Hirari","doi":"10.1016/j.chemolab.2024.105170","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"251","pages":"Article 105170"},"publicationDate":"2024-07-03","publicationTypes":"Journal Article"}
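As background for the abstract above, the PARAFAC model represents a three-way array by three factor matrices, with each cell given by x_ijk ≈ Σ_f a_if · b_jf · c_kf. The sketch below only shows this trilinear reconstruction and the cellwise residuals that an outlier diagnostic would start from; it is not the robust MacroPARAFAC algorithm, and the function names are hypothetical.

```python
import numpy as np

def parafac_reconstruct(A, B, C):
    """Rebuild an I x J x K array from factor matrices A (I x F), B (J x F), C (K x F)."""
    return np.einsum("if,jf,kf->ijk", A, B, C)

def cell_residuals(X, A, B, C):
    """Cellwise residuals; cells that are NaN in X (missing) stay NaN."""
    return X - parafac_reconstruct(A, B, C)
```

A cell with an unusually large absolute residual is a candidate cellwise outlier, whereas a slice (case) whose residuals are large throughout points to a rowwise outlier. A robust method additionally has to prevent such cells from distorting the estimated loadings in the first place, which a plain least-squares fit does not do.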
Pub Date : 2024-07-01 DOI: 10.1016/j.chemolab.2024.105173
Xueping Yang , Fuyu Yang , Matthieu Lesnoff , Paolo Berzaghi , Alessandro Ferragina
This study aimed to assess the predictive accuracy of Near-Infrared Spectroscopy (NIRS) across a large multi-product library, employing novel local calibration methodologies. Three local strategies were examined: the LOCAL Algorithm, Locally Weighted Regression based on k-nearest neighbor selection (kNN-LWPLSR), and a newly proposed algorithm called Hybrid Local. These strategies were applied to an extensive multi-product dataset. Compared with Global PLS models, all local strategies showed significant reductions in RMSEP values. In particular, kNN-LWPLSR predicted the constituents ADF and DM well. The newly proposed Hybrid Local method performs comparably to the LOCAL Algorithm but halves the prediction time, a significant advance for the practical implementation of NIRS technology in industrial processing scenarios.
{"title":"Diverse local calibration approaches for chemometric predictive analysis of large near-infrared spectroscopy (NIRS) multi-product datasets","authors":"Xueping Yang , Fuyu Yang , Matthieu Lesnoff , Paolo Berzaghi , Alessandro Ferragina","doi":"10.1016/j.chemolab.2024.105173","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"251","pages":"Article 105173"},"publicationDate":"2024-07-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0169743924001138/pdfft?md5=115b1d8cf3d3927fcd4a4da98b29f3e1&pid=1-s2.0-S0169743924001138-main.pdf"}
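The general idea behind local calibration, as discussed in the abstract above, is to fit a small model around each new sample instead of one global model. The sketch below is a simplified stand-in: it selects the k nearest neighbors by Euclidean distance, weights them with a tricube kernel, and fits weighted least squares rather than PLS. The function name, the kernel, and the value of k are assumptions for illustration and do not reproduce kNN-LWPLSR, LOCAL, or Hybrid Local.

```python
import numpy as np

def knn_local_predict(X_train, y_train, x_new, k=20):
    """Predict y for x_new from a weighted linear fit on its k nearest neighbors."""
    dist = np.linalg.norm(X_train - x_new, axis=1)
    nn = np.argsort(dist)[:k]                 # indices of the k nearest samples
    h = dist[nn].max()
    # Tricube weights: close neighbors count more, the farthest gets weight 0
    w = (1.0 - (dist[nn] / h) ** 3) ** 3 if h > 0 else np.ones(len(nn))
    design = np.hstack([np.ones((len(nn), 1)), X_train[nn]])
    sw = np.sqrt(w)
    # Weighted least squares via lstsq on the sqrt-weighted system
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y_train[nn] * sw, rcond=None)
    return float(np.concatenate([[1.0], x_new]) @ beta)
```

With real spectra the number of wavelengths usually exceeds k, which is one reason local PLS is preferred over the plain least-squares fit used here; the neighbor search is also typically the dominant cost at prediction time.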
Pub Date : 2024-06-26 DOI: 10.1016/j.chemolab.2024.105171
C. Ortiz-Abellán , E. Aguado-Sarrió , J.M. Prats-Montalbán , J. Camps-Herrero , A. Ferrer
Currently, magnetic resonance imaging is the most sensitive imaging technique for detecting cancerous processes in early stages. In breast cancer, because the tissue has a tubular structure formed by ducts, anisotropic diffusion should be considered instead of the general isotropic diffusion. Anisotropic diffusion is studied with a technique called Diffusion Tensor Imaging (DTI), in which the diffusion gradient is applied by changing the magnetic field in several spatial directions.
To date, the application of Multivariate Curve Resolution (MCR) models to diffusion sequences has demonstrated its ability to develop cancer biomarkers that are easy to interpret clinically in isotropic tissues, such as the prostate. So far, however, it has never been applied to anisotropic tissues, such as the breast.
Therefore, the main objective of this work is to obtain easy-to-interpret imaging biomarkers useful for early breast cancer diagnosis from diffusion magnetic resonance imaging based on the Diffusion Tensor using multivariate curve resolution (MCR) models. A classification model to identify healthy and tumor-affected pixels is also proposed.
{"title":"New breast cancer biomarkers from diffusion magnetic resonance imaging based on the Diffusion Tensor using multivariate curve resolution (MCR) models","authors":"C. Ortiz-Abellán , E. Aguado-Sarrió , J.M. Prats-Montalbán , J. Camps-Herrero , A. Ferrer","doi":"10.1016/j.chemolab.2024.105171","DOIUrl":"https://doi.org/10.1016/j.chemolab.2024.105171","url":null,"PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"251 ","pages":"Article 105171"},"PeriodicalIF":3.7,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0169743924001114/pdfft?md5=bfa9e402dd60fbdcd42e8d99cb32d250&pid=1-s2.0-S0169743924001114-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141606715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Chemistry","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-06-26 DOI: 10.1016/j.chemolab.2024.105172
Miguel Mengual-Pujante , Antonio J. Perán , Antonio Ortiz , María Dolores Pérez-Cárceles
Blood in the form of stains is one of the most frequently encountered fluids at crime scenes. Estimating the time since deposition (TSD) is of great importance for guiding police investigations and clarifying criminal offences. The time elapsed since deposition is usually estimated by modelling the physicochemical degradation of blood biomolecules over time. This work presents an ATR-FTIR spectroscopy and chemometrics study to estimate the TSD of bloodstains on various surfaces and under different ambient conditions (indoor and outdoor). Over a period from 0 to 212 days, a total of 960 stains were analyzed. Most of the eleven partial least squares regression (PLSR) models obtained showed good predictive capacity, with a Residual Predictive Deviation (RPD) value higher than 3 and R² higher than 0.90. Models for non-rigid supports showed better predictive capacity than those for rigid ones. A combined model covering the various non-rigid surfaces and ambient conditions was developed, which might be the most useful one from the criminalistic point of view. These results show that this technique can be a rapid, robust, and reliable tool for in situ determination of the TSD of bloodstains at crime scenes.
{"title":"Estimation of human bloodstains time since deposition using ATR-FTIR spectroscopy and chemometrics in simulated crime conditions","authors":"Miguel Mengual-Pujante , Antonio J. Perán , Antonio Ortiz , María Dolores Pérez-Cárceles","doi":"10.1016/j.chemolab.2024.105172","DOIUrl":"https://doi.org/10.1016/j.chemolab.2024.105172","url":null,"PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"251 ","pages":"Article 105172"},"PeriodicalIF":3.7,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0169743924001126/pdfft?md5=12868d33bb0a44826b6ab904bb81dcbd&pid=1-s2.0-S0169743924001126-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141487277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Chemistry","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
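The bloodstain abstract above judges its PLSR models by two figures of merit, RPD and R². As a minimal sketch of how these are commonly computed from reference and predicted values, the Python below takes RPD to be the sample standard deviation of the reference values divided by the root mean square error of prediction. This is one common convention and not necessarily the exact formula the authors used. The function names and the example values are hypothetical and are not taken from the paper.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(y_true, y_pred):
    """Residual Predictive Deviation: SD of reference values / RMSEP.

    Assumption: sample standard deviation (n - 1) in the numerator;
    some authors use a bias-corrected SEP in the denominator instead.
    """
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up illustration values (days since deposition), NOT data from the study.
days_true = [0, 30, 60, 90, 120, 150, 180, 212]
days_pred = [5, 26, 66, 85, 124, 144, 186, 205]

if __name__ == "__main__":
    print(f"RMSEP = {rmsep(days_true, days_pred):.2f} days")
    print(f"RPD   = {rpd(days_true, days_pred):.2f}")
    print(f"R^2   = {r_squared(days_true, days_pred):.4f}")
```

Under this definition, an RPD above 3 means the prediction error is less than a third of the spread of the reference values, which is the threshold the abstract quotes.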