Pub Date: 2025-11-30 DOI: 10.1016/j.chemolab.2025.105593
Shijie Zhu, Qi Zhang, Shuai Li, Yang Fu, Dongni Jia, Yigeng Wang
Causality mining plays a crucial role in monitoring complex industrial processes. However, incomplete extraction of quality-related information can reduce the monitoring accuracy for quality-related faults, and uncertain causal relationships during root-variable mining can lead to incorrect fault diagnoses. To address these problems, we decompose the causal relationships between variables into synergistic and unique components and propose an integrated monitoring and diagnosis approach for industrial processes based on this decomposition. First, we use Granger causality to preliminarily identify quality-related features, and we enhance their extraction through the synergistic effect of causal relationships to account for their complex interdependence. Second, because synergistic causality exists among variables across variable groups, their dynamic characteristics must be captured and modeled to ensure monitoring accuracy. We extend fault monitoring from quality variables to process variables and thereby achieve integrated monitoring. Finally, we exploit causal uniqueness to identify the root cause of a fault, which is key to precise and rapid diagnosis in complex and uncertain industrial processes. The feasibility and effectiveness of the proposed method were validated in two scenarios: the benchmark Tennessee Eastman (TE) chemical process and an industrial case study of poor iron ore beneficiation.
Title: Integrated monitoring and diagnosis of industrial processes based on causality synergistic and unique decomposition
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105593
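The Granger-causality screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single lag, no intercept, and ordinary least squares; the function name `granger_lag1` and the toy series are hypothetical and are not taken from the paper, whose actual procedure (lag order, test, thresholds) is not given in the abstract.

```python
def granger_lag1(x, y):
    """Compare two no-intercept lag-1 models for y (illustrative assumption).

    Restricted:   y[t] = a * y[t-1]
    Unrestricted: y[t] = a * y[t-1] + b * x[t-1]
    Returns (rss_restricted, rss_unrestricted, f_stat).
    """
    target = y[1:]
    y_lag = y[:-1]
    x_lag = x[:-1]
    n = len(target)

    # Sums of products needed for the least-squares normal equations.
    syy = sum(v * v for v in y_lag)
    sxx = sum(v * v for v in x_lag)
    sxy = sum(p * q for p, q in zip(y_lag, x_lag))
    syt = sum(p * q for p, q in zip(y_lag, target))
    sxt = sum(p * q for p, q in zip(x_lag, target))

    # Restricted fit: a single coefficient on y[t-1].
    a_r = syt / syy
    rss_r = sum((t - a_r * p) ** 2 for t, p in zip(target, y_lag))

    # Unrestricted fit: solve the 2x2 normal equations with Cramer's rule.
    det = syy * sxx - sxy * sxy
    a_u = (syt * sxx - sxy * sxt) / det
    b_u = (syy * sxt - sxy * syt) / det
    rss_u = sum((t - a_u * p - b_u * q) ** 2
                for t, p, q in zip(target, y_lag, x_lag))

    # F statistic for one added regressor; infinite when the fit is exact.
    dof = n - 2
    f_stat = float("inf") if rss_u == 0 else (rss_r - rss_u) / (rss_u / dof)
    return rss_r, rss_u, f_stat


# Toy data in which y is driven entirely by the previous value of x.
x = [1, 0, 2, 1, 3, 0, 1, 2]
y = [0, 0.5, 0, 1, 0.5, 1.5, 0, 0.5]
rss_r, rss_u, f_stat = granger_lag1(x, y)
```

A large F statistic indicates that past values of `x` improve the prediction of `y` beyond what past values of `y` provide, which is the sense in which `x` is said to Granger-cause `y`.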
Pub Date: 2025-11-30 DOI: 10.1016/j.chemolab.2025.105611
Debanjana Ghosh, Debangana Das, Shreya Nag, Runu Banerjee Roy
Black tea processing changes the phytochemical constituents across multiple stages, and the tea quality index varies with these biomarkers. In this study, modified classifier models were used to monitor two key biomarkers, catechin and epigallocatechin gallate (EGCG), throughout the five distinct stages of black tea processing for tea quality estimation. Catechin- and EGCG-selective molecularly imprinted polymer (MIP) electrodes were prepared to record differential pulse voltammetry (DPV) responses at the five processing stages. The DPV responses were analyzed to discriminate the processing stages based on catechin and EGCG content, using a stacked model incorporating four classification algorithms (random forest, K-nearest neighbors, Gaussian naive Bayes (NB), and gradient boosting) and an artificial neural network (ANN) classifier model. The proposed models performed satisfactorily in classifying five different stages of fermentation for four different tea samples, with accuracies of 98 % for catechin and 95 % for EGCG. Principal component analysis (PCA) plots show that the sensors can identify each stage of tea processing as a distinct cluster. The sensor response also exhibited a consistent pattern of change in catechin and EGCG content across the stages of tea processing.
Title: Online monitoring of phytochemical dynamics in black tea processing using MIP-driven classifier models
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105611
Pub Date: 2025-11-29 DOI: 10.1016/j.chemolab.2025.105603
Jan P.M. Andries, Gerjen H. Tinnevelt, Yvan Vander Heyden
The well-known Uninformative-Variable Elimination for Partial Least Squares, denoted as UVE-PLS, is not reproducible with regard to the selected variables. Additionally, in UVE, variables are selected at the first minimum of the graph of the root mean squared error of cross validation (RMSECV) against the number of retained variables. This mostly results in a rather large number of selected variables. Therefore, there is a need for a new and reproducible UVE method with better selective and preferably also better predictive abilities. Consequently, the Global-Minimum Error Reproducible Uninformative-Variable Elimination method, denoted as GME-RUVE, is proposed and tested.
The GME-RUVE method combines the main characteristics of two existing methods, i.e. Jack-knife-based Partial Least Squares Regression (JK-PLSR) and Global-Minimum Error Uninformative-Variable Elimination (GME-UVE). JK-PLSR can be considered a reproducible version of the original UVE method.
In GME-RUVE, as in the JK-PLSR method, no artificial random variables are added to the X matrix, and firstly the significance of the PLS regression coefficients is determined from jack-knifing. Secondly, as in the GME-UVE method, either the global minimum or the critical RMSECV is used for the selection of the variables. The performance of the new GME-RUVE method is investigated using four datasets with multivariate profiles, i.e. either simulated profiles, NIR spectra or theoretical molecular descriptor profiles, resulting in 12 profile-response (X-y) combinations.
When the global RMSECV minimum is used, the predictive performance of GME-RUVE is significantly better than that of both the JK-PLSR method, which uses the first local RMSECV minimum, and the existing UVE method. When the critical RMSECV is used, both the selective and the predictive performances of GME-RUVE are significantly better than those of these two methods. The selective and predictive performances of the new GME-RUVE method are also much better than those of the existing GME-UVE method. Moreover, the variables selected by GME-RUVE have a chemical meaning.
Title: Improved variable reduction in Partial Least Squares modelling by global-minimum error reproducible Uninformative-Variable Elimination
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105603
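The difference between stopping at the first local RMSECV minimum and at the global RMSECV minimum can be sketched as follows. This is only an illustration: the RMSECV values are made up, the list is assumed to be ordered in the sequence in which the elimination procedure visits candidate variable subsets, and the function names are hypothetical. The critical-RMSECV criterion is not sketched because the abstract does not define it.

```python
def first_local_minimum(rmsecv):
    """Index of the first value that is followed by a larger value.

    Scans in the order given; if the curve never rises, the last index
    is returned.
    """
    for i in range(len(rmsecv) - 1):
        if rmsecv[i] < rmsecv[i + 1]:
            return i
    return len(rmsecv) - 1


def global_minimum(rmsecv):
    """Index of the smallest RMSECV over the whole curve."""
    return min(range(len(rmsecv)), key=rmsecv.__getitem__)


# Hypothetical RMSECV curve, one value per elimination step.
curve = [0.9, 0.7, 0.8, 0.5, 0.6]
stop_first = first_local_minimum(curve)
stop_global = global_minimum(curve)
```

In this toy curve the first local minimum stops the search after one elimination step, whereas the global minimum lies three steps in and has a lower error, which is the kind of case in which the two criteria lead to different variable subsets.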
Pub Date: 2025-11-27 DOI: 10.1016/j.chemolab.2025.105592
Yuqi Ren, He Wang, Chongbo Yin, Hong Men, Yan Shi, Jingjing Liu
Agricultural products of the same variety can differ in quality, appearance, and nutritional value due to variations in climate, soil, and other growth conditions. To support reliable and sustainable origin traceability, we propose a non-destructive framework using hyperspectral data. Spectral information for rice and peanut samples from multiple production regions was acquired using a GaiaSorter hyperspectral imaging system. This method can rapidly detect chemical bonds and functional groups, with differences in these features reflecting the overall microstructural quality of agricultural products from different origins. A novel Quadrangle Attention with Deformation (QAD) module was designed to enhance multi-scale feature learning. The module applies geometric transformations within local windows and incorporates relative positional encoding to capture multi-scale receptive-field information, thereby improving spectral-band relationships. By embedding the QAD module into a separable-convolution backbone, we developed the Quadrangle Attention with Deformation Network (QAD-Net) for precise origin identification. On two benchmark datasets, QAD-Net achieved state-of-the-art accuracy, reaching 99.66 ± 0.57 % on peanuts and 99.57 ± 0.65 % on rice, outperforming existing models. This work demonstrates the potential of QAD-Net as a fast, accurate, and non-destructive tool for hyperspectral origin traceability, with significant implications for on-site quality supervision, authenticity verification, and sustainable market regulation.
Title: A multi-scale approach integrating hyperspectral system for tracing the origin of agricultural products
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105592
Pub Date: 2025-11-26 DOI: 10.1016/j.chemolab.2025.105591
Juehong Dai, Liheng Dong, Jingjing Xu, Lingli Deng, Lei Guo, Jiyang Dong
Reliable evaluation of extracted ion chromatograms (EICs) remains a persistent challenge in LC–MS metabolomics, as inaccuracies in peak identification can profoundly affect subsequent data analysis and interpretation. While recent deep learning approaches show promise, their computational burden, limited generalizability, and lack of interpretability hinder broad adoption in routine analytical workflows. To address these limitations, we introduce EXACT-EIC (EXplainable Assessment of Chromatogram qualiTy for EICs), a lightweight, explainable machine learning framework. EXACT-EIC employs a carefully designed set of 34 handcrafted features to perform two critical tasks: binary classification of EICs (peak vs. noise) and quantitative quality scoring. Benchmarking on curated in-house and public test sets showed that EXACT-EIC achieved 95.2 % accuracy and 98.1 % recall for classification. For quantitative assessment, it attained a mean absolute error of 0.70 on a 1–10 expert-assigned quality scale. These results consistently outperformed state-of-the-art deep learning methods, including PeakOnly and QuanFormer. Furthermore, Shapley Additive exPlanations (SHAP) analysis quantified the contribution of key chromatographic features (e.g., apex-boundary ratio, distribution entropy) to model predictions, offering transparent mechanistic insights absent in "black-box" architectures. By combining robustness, interpretability, and computational efficiency, EXACT-EIC facilitates reliable EIC evaluation across diverse platforms and experimental conditions. It provides a practical, deployable solution for automated quality control and confident metabolite annotation, addressing a critical need in untargeted LC–MS metabolomics workflows.
Title: Explainable machine learning enables robust evaluation of extracted ion chromatograms in LC–MS metabolomics
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105591
Pub Date: 2025-11-24 DOI: 10.1016/j.chemolab.2025.105589
Chaoyuan Hou, Fei Xie, Guohua Wu, Wenting Yu, Houpu Yang, Liu Yang, Xuewen Long, Longfei Yin, Shu Wang
Raman spectroscopy combined with deep learning is now widely used in disease screening. The Transformer is an important deep learning architecture that has excelled in several areas, owing in part to its self-attention mechanism. However, because it was originally designed for natural language processing, the Transformer has drawbacks when processing spectral data, such as high computational complexity and a tendency to overfit small data sets. In this study, we propose a spectral classification model called the Categorical Embedding Transformer (CET) and apply it, combined with Raman spectroscopy, to the screening of breast cancer and ductal carcinoma in situ. The core principle of the CET model is to embed class labels into fixed-dimensional vectors and update them as learnable parameters during training. The CET model also removes the positional encoding of the Transformer encoder and the initial linear layer used for dimensionality reduction or expansion, so that the ability to extract features from spectral data and reduce their dimensionality is retained while the computational complexity is reduced. Finally, the dot product is used to calculate the similarity between each class vector and the dimensionality-reduced spectrum, and the cross-entropy loss function is used during training to maximize the dot-product similarity of the true class. The model achieved 100 % accuracy on the validation set and 98.2 % accuracy on the unknown test set, outperforming the other models compared.
Title: A classification model for early detection of breast cancer by Raman spectroscopy based on categorical embedding transformer
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 268, Article 105589
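The final scoring step described in the abstract (dot-product similarity between class vectors and the reduced spectrum, trained with cross-entropy) can be sketched in a few lines. This is a minimal illustration under assumptions: the encoder that produces the reduced spectrum vector is not modeled, the class vectors are fixed toy values rather than learned parameters, and all names are hypothetical.

```python
import math


def class_scores(spectrum_vec, class_vecs):
    """Dot product between the reduced spectrum and each class vector."""
    return [sum(c * s for c, s in zip(cv, spectrum_vec)) for cv in class_vecs]


def cross_entropy(scores, true_class):
    """Softmax cross-entropy computed with the log-sum-exp trick."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[true_class]


def predict(spectrum_vec, class_vecs):
    """Class whose vector has the largest dot product with the spectrum."""
    scores = class_scores(spectrum_vec, class_vecs)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: two classes embedded in a 2-dimensional space.
class_vecs = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical class embeddings
reduced = [2.0, 0.0]                    # hypothetical encoder output
loss = cross_entropy(class_scores(reduced, class_vecs), true_class=0)
label = predict(reduced, class_vecs)
```

Minimizing this loss raises the dot product of the true class relative to the others, so in training both the encoder output and the class vectors would be adjusted by gradient descent; that part is omitted here.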
Pub Date: 2025-11-24 DOI: 10.1016/j.chemolab.2025.105590
Qasem M. Tawhari, Muhammad Naeem, Saba Maqbool, Syed Muhammad Kashif Raza, Adnan Aslam
This study computes M-polynomial indices for Doxorubicin and Mitoxantrone, two widely used anticancer drugs of the anthracycline and anthracenedione classes, respectively. Doxorubicin, a potent topoisomerase II inhibitor, is commonly employed in treating various cancers, including breast cancer, ovarian cancer, and leukemia. Mitoxantrone, with its unique DNA-intercalating properties, is effective against acute myeloid leukemia, breast cancer, and non-Hodgkin’s lymphoma. We derive the M-polynomial indices by partitioning the graph edges according to vertex degree, using the adjacency matrix. A Python algorithm based on the adjacency matrix computes the indices efficiently, reducing calculation time from days to minutes and eliminating human error. Simple linear regression models in SPSS are used to build quantitative structure-property relationship (QSPR) models and predict the physical attributes of cancer medicines. Our findings show that M-polynomial indices accurately predict physical attributes, providing important insights into the structural requirements for maximum anticancer action. In addition, we propose a model for each physical attribute. This study aids in the development of new cancer therapies and the prediction of physical features for uncharacterized medications.
Title: Modeling and statistical analysis of cancer drugs using M-polynomial indices for their characteristics
Journal: Chemometrics and Intelligent Laboratory Systems, vol. 268, Article 105590
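The edge-partition idea behind the M-polynomial can be sketched from an adjacency matrix as follows. This is not the authors' algorithm, only a minimal illustration assuming a simple undirected graph given as a symmetric 0/1 matrix. The M-polynomial is M(G; x, y) = Σ m_ij x^i y^j, where m_ij counts the edges whose end vertices have degrees i and j (i ≤ j). The first and second Zagreb indices are used as examples of indices derived from it; which indices the paper actually reports is not stated in the abstract, and the function names are hypothetical.

```python
from collections import Counter


def m_polynomial(adj):
    """Return {(i, j): m_ij}, the edge counts by end-vertex degrees (i <= j)."""
    n = len(adj)
    deg = [sum(row) for row in adj]          # vertex degrees from row sums
    coeffs = Counter()
    for u in range(n):
        for v in range(u + 1, n):            # visit each undirected edge once
            if adj[u][v]:
                i, j = sorted((deg[u], deg[v]))
                coeffs[(i, j)] += 1
    return dict(coeffs)


def first_zagreb(coeffs):
    """Sum of (d_u + d_v) over all edges."""
    return sum(m * (i + j) for (i, j), m in coeffs.items())


def second_zagreb(coeffs):
    """Sum of (d_u * d_v) over all edges."""
    return sum(m * i * j for (i, j), m in coeffs.items())


# Toy graph: a path on four vertices (0-1-2-3), not a drug molecule.
path4 = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
coeffs = m_polynomial(path4)   # {(1, 2): 2, (2, 2): 1}
```

For a molecule, the same functions would be applied to the adjacency matrix of its hydrogen-suppressed molecular graph; the resulting index values could then serve as predictors in a regression model.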
Pub Date: 2025-11-20 DOI: 10.1016/j.chemolab.2025.105587
Atifa Rafique, Xue Yu, Kashif Iqbal
In recent years, evolutionary generative adversarial networks (EGANs) have been proposed as an emerging research area that merges generative adversarial networks (GANs), which generate realistic data, with evolutionary computation (EC) techniques, which optimize solutions by drawing inspiration from nature. In this review paper, we examine the synergy between EC and GANs with an emphasis on EGANs, an emerging direction with the potential to enable many practical applications. To this end, we first introduce the key concepts of GANs and EC in detail to illustrate how they work together to model novel data efficiently while remaining consistent with reality. We then describe how EC techniques have been incorporated into GAN architectures to improve both performance and diversity. This paper presents a thorough analysis of EGANs in various domains. EGANs have proven very effective for real-world problems such as data scarcity, mode collapse, and training instability. We also consider the limitations of EGANs and suggest methods for addressing them. Looking ahead, we present new research directions for EGANs and suggest that they could transform artificial intelligence (AI) and advance cutting-edge applications in personalized content generation, virtual reality (VR) experiences, and medical diagnosis. In conclusion, this review provides a solid foundation for EGAN research. EGANs represent a promising trajectory for AI because they combine two powerful paradigms, GANs and EC, and they aim to address challenges whose resolution would open new possibilities in data synthesis and optimization.
{"title":"Recent developments in evolutionary computation for generative adversarial networks: A comprehensive survey","authors":"Atifa Rafique, Xue Yu, Kashif Iqbal","doi":"10.1016/j.chemolab.2025.105587","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"268","pages":"Article 105587"},"PeriodicalIF":3.8,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145615445","RegionNum":2,"RegionCategory":"Chemistry"}
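The variation, evaluation, and selection loop that EC contributes to EGAN-style training can be illustrated with a toy sketch. This is not the method of any surveyed paper: the two-parameter Gaussian `generate` function, the `fitness` definition (a quality term plus a diversity term), and all numeric settings are assumptions made for illustration, and no discriminator network is trained.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=3.0, scale=0.5, size=2000)   # stand-in "real" data (synthetic)
latent = rng.normal(size=1000)                      # fixed latent noise z

def generate(params):
    """Hypothetical two-parameter generator: x = mu + sigma * z."""
    mu, log_sigma = params
    return mu + np.exp(log_sigma) * latent

def fitness(params):
    """Assumed fitness: reward matching the real mean (quality) and spread (diversity)."""
    fake = generate(params)
    quality = -abs(fake.mean() - real.mean())
    diversity = -abs(fake.std() - real.std())
    return quality + diversity

def evolve(parent, generations=200, n_children=8, step=0.1):
    """(1 + lambda) evolution strategy: mutate, evaluate, keep the fittest."""
    for _ in range(generations):
        # Variation: Gaussian mutations of the current parent.
        children = parent + step * rng.normal(size=(n_children, parent.size))
        candidates = np.vstack([parent, children])
        # Evaluation: score the parent and every child.
        scores = np.array([fitness(c) for c in candidates])
        # Selection: the best candidate becomes the next parent.
        parent = candidates[np.argmax(scores)]
    return parent

start = np.array([0.0, 0.0])
best = evolve(start)
print("fitness before:", fitness(start), "after:", fitness(best))
```

Because the parent competes with its own children, the fitness never decreases from one generation to the next. In an actual EGAN the fitness would typically come from a trained discriminator rather than from summary statistics.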
Capparis spinosa L. buds undergo salting and drying to enhance their shelf life and organoleptic properties. This study evaluates the impact of four drying methods, oven drying (OD), vacuum drying (VD), freeze-drying (FD), and microwave drying (MD), on the physicochemical, antioxidant, and microbiological properties of dried caper buds. Salting reduced the initial moisture content from 508.50 % to 168.59 % (db), while drying further decreased it to approximately 9 %. Drying time varied significantly, with MD achieving the shortest duration (0.19–0.75 h) and OD requiring the longest (up to 49.66 h). FD exhibited the highest energy consumption (60.77 kWh/kg), followed by VD, while OD and MD were the least energy-intensive (0.54–3.10 kWh/kg and 1.34–2.18 kWh/kg, respectively). FD preserved the most chlorophyll (193.63 μg/g DW) and the highest total phenolic content (TPC; 28.98 mg GAE/g DW), whereas MD at 200 W resulted in the lowest TPC (9.88 mg GAE/g DW). FD samples also showed superior antioxidant activities in both ABTS and FRAP assays. In contrast, OD and MD increased browning and degraded quality attributes. Multivariate analyses (PCA and clustering) highlighted FD as optimal for preserving quality, while MD was the most detrimental. Microbiological analysis confirmed that dried capers met food safety standards. A predictive model using Decision Tree coupled with Least Squares Boosting (DT_LSBOOST) achieved exceptional accuracy (R = 0.9999, RMSE = 0.0564, ESP = 0.2028, MAE = 0.0305), providing a reliable tool for optimizing drying parameters. Overall, freeze-drying emerged as the best method to retain the nutritional and bioactive properties of capers, and the developed predictive model offers an innovative approach to enhancing caper processing efficiency.
{"title":"Optimization of caper bud drying using the DT_LSBOOST model: A predictive approach to improve quality and efficiency","authors":"Chafika Lakhdari, Hocine Remini, Samia Djellal, Meriem Adouane, Hichem Tahraoui, Abdeltif Amrane, Farid Dahmoune, Merve Yavuz-Düzgün, Elif Feyza Aydar, Evren Demircan, Zehra Mertdinç, Beraat Ozçelik, Nabil Kadri","doi":"10.1016/j.chemolab.2025.105585","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"268","pages":"Article 105585"},"PeriodicalIF":3.8,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145615444","RegionNum":2,"RegionCategory":"Chemistry"}
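Least-squares boosting builds an ensemble by repeatedly fitting a small regression tree to the residuals of the current prediction and adding a shrunken copy of it. The sketch below shows that idea with one-split trees (stumps) and the RMSE, MAE, and correlation metrics quoted in the abstract. It is not the authors' DT_LSBOOST implementation: the synthetic drying curve, the stump learner, the number of rounds, and the learning rate are all assumptions, and R is taken here to be the Pearson correlation coefficient.

```python
import numpy as np

def fit_stump(x, residual):
    """Least-squares regression stump: one threshold, one constant per side."""
    order = np.argsort(x)
    xs, rs = x[order], residual[order]
    best_sse, best = np.inf, (xs[0], rs.mean(), rs.mean())
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold fits between equal inputs
        left, right = rs[:i], rs[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            best = (0.5 * (xs[i - 1] + xs[i]), left.mean(), right.mean())
    return best

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def ls_boost(x, y, n_rounds=200, learning_rate=0.1):
    """Fit each new stump to the residuals of the current ensemble."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - pred)
        pred = pred + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

# Synthetic stand-in for a drying curve (moisture ratio vs. time); not data from the study.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 80)
y = np.exp(-0.4 * t) + rng.normal(scale=0.01, size=t.size)

model = ls_boost(t, y)
fitted = predict(model, t)
baseline = np.full(t.size, y.mean())
r = float(np.corrcoef(y, fitted)[0, 1])
print("RMSE:", rmse(y, fitted), "MAE:", mae(y, fitted), "R:", r)
```

The metrics here are computed on the training data only; assessing a model like the one in the abstract would need held-out samples.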
Pub Date : 2025-11-20 DOI: 10.1016/j.chemolab.2025.105580
Long Liu, Yifan Wang, Bin Wang, Xiaoxuan Xu, Jing Xu
Tea is a widely popular beverage across the globe. However, its medicinal content and value vary from one species to another. As a result, consumers need a quick and efficient method to distinguish between species. This paper introduces a method for species classification using two-dimensional correlation spectroscopy (2DCOS) images combined with deep learning (DL) models. Initially, 345 thin-section samples of five different black teas were prepared, and their near-infrared spectroscopy (NIRS) data were obtained. From these preprocessed one-dimensional NIRS data, 8280 2DCOS contour images and contour fill images were generated. A MobileNet model with various bottleneck residual blocks was constructed and trained using these 2DCOS images as samples, achieving a classification accuracy of 100 %. The model testing results indicated that the optimal NIRS data preprocessing method and 2DCOS image format are the Standard Normal Variate (SNV) transformation and the contour fill image, respectively. Furthermore, the classification results of one-dimensional NIRS data, 2DCOS matrix data, and 2DCOS image data were compared, showing that the 2DCOS images provide higher classification accuracy. Finally, comparative experiments were conducted between the MobileNet model and other deep learning models, demonstrating that the MobileNet model has the advantages of fewer parameters, lower computational load, high accuracy, and fast convergence. Therefore, combining 2DCOS images with the MobileNet model for black tea classification is effective. This paper offers a promising approach for the identification of black tea species, with extensive potential applications in species classification.
{"title":"High precision classification method for black tea: Deep learning combined with two-dimensional correlation spectroscopy","authors":"Long Liu, Yifan Wang, Bin Wang, Xiaoxuan Xu, Jing Xu","doi":"10.1016/j.chemolab.2025.105580","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"268","pages":"Article 105580"},"PeriodicalIF":3.8,"publicationDate":"2025-11-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145569576","RegionNum":2,"RegionCategory":"Chemistry"}
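The preprocessing and correlation steps named in the abstract can be sketched as follows: SNV centres and scales each spectrum by its own mean and standard deviation, and generalized 2DCOS (Noda's formulation) produces synchronous and asynchronous maps from a series of spectra. The abstract does not say how the series of spectra is formed for each sample, so the stack of six synthetic spectra below, the two Gaussian bands, and the function names are all assumptions.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def dynamic_spectra(spectra):
    """Subtract the mean spectrum taken over the series (rows)."""
    spectra = np.asarray(spectra, dtype=float)
    return spectra - spectra.mean(axis=0, keepdims=True)

def synchronous_2dcos(spectra):
    """Synchronous map: Y^T Y / (m - 1), with Y the dynamic spectra."""
    y = dynamic_spectra(spectra)
    return y.T @ y / (y.shape[0] - 1)

def asynchronous_2dcos(spectra):
    """Asynchronous map: Y^T N Y / (m - 1), with N the Hilbert-Noda matrix."""
    y = dynamic_spectra(spectra)
    m = y.shape[0]
    j, k = np.indices((m, m))
    diff = k - j
    safe = np.where(diff == 0, 1, diff)            # avoid dividing by zero on the diagonal
    noda = np.where(diff == 0, 0.0, 1.0 / (np.pi * safe))
    return y.T @ noda @ y / (m - 1)

# Synthetic series of six spectra with two bands changing in opposite directions.
axis = np.linspace(0.0, 1.0, 50)
band_a = np.exp(-((axis - 0.3) / 0.05) ** 2)
band_b = np.exp(-((axis - 0.7) / 0.05) ** 2)
levels = np.linspace(0.2, 1.0, 6)[:, None]
spectra = levels * band_a + (1.2 - levels) * band_b + 0.1

processed = snv(spectra)
sync = synchronous_2dcos(processed)
asyn = asynchronous_2dcos(processed)
print(sync.shape, asyn.shape)
```

Each resulting matrix could then be rendered as a contour or filled contour image (for example with matplotlib's `contourf`) to obtain the kind of image input the abstract describes.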