Pub Date: 2025-11-01 DOI: 10.1016/j.chemolab.2025.105560
Zulqurnain Sabir, Bahaa Basbous, Basma Souayeh, Muhammad Umar, Soheil Salahshour
Purpose
The purpose of this work is to provide a reliable neural network procedure for modeling the spread of a virus in computers with kill signals. The mathematical model comprises susceptible, exposed, and infected classes, together with virus-inactive and kill-signal classes.
Method
A deep neural network (DNN) is designed with two hidden layers, both using radial basis activation functions, with twenty neurons in the first hidden layer and thirty in the second, and is optimized through Bayesian regularization. The stochastic DNN framework is used to solve the model of virus spread in computers with kill signals, with 70 % of the data selected for training and 15 % each for validation and testing.
Results
The accuracy of the scheme is observed through the overlapping of the solutions along with negligible absolute error for solving the model. The consistency of the solver is observed through the process of error histogram, regression, and state transition.
Novelty
The proposed DNN structure with radial basis activation functions has not previously been applied to the spread of a virus in computers with kill signals.
Title: A reliable deep neural network using the radial basis for the spreading virus in computers with kill signals. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105560.
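The Method paragraph of the abstract above fixes three concrete settings: a 70 % / 15 % / 15 % data split, two hidden layers of 20 and 30 neurons, and a radial basis activation. The following is a minimal sketch of those settings only, not the authors' implementation. It assumes the radial basis activation is the Gaussian form exp(-n²), uses random untrained weights, and omits Bayesian regularization entirely; the function names and the single scalar input are hypothetical.

```python
import math
import random


def radial_basis(n):
    """Gaussian radial basis activation, assumed here to be exp(-n^2)."""
    return math.exp(-n * n)


def rbf_layer(inputs, weights, biases):
    """One dense layer whose neurons all apply the radial basis activation."""
    return [
        radial_basis(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_out, n_in, rng):
    """Random (untrained) weights and biases, for illustration only."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def split_indices(n_samples, seed=0):
    """Shuffle sample indices and split them 70 % / 15 % / 15 %."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    n_train = n_samples * 70 // 100
    n_val = n_samples * 15 // 100
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


rng = random.Random(1)
w1, b1 = random_layer(20, 1, rng)    # first hidden layer: 20 neurons
w2, b2 = random_layer(30, 20, rng)   # second hidden layer: 30 neurons

hidden1 = rbf_layer([0.5], w1, b1)   # hypothetical scalar input, e.g. a time point
hidden2 = rbf_layer(hidden1, w2, b2)

train_idx, val_idx, test_idx = split_indices(100)
```

Integer arithmetic is used for the split sizes so that rounding cannot shift a sample between subsets; any remainder falls into the test set.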
An important step in the development of novel cosmetic ingredients is setting up sensory analyses to assess their tactile properties. Recent work yielded 12 novel biobased emollients with interesting physico-chemical properties. Four of the most promising emollients were selected in the present study and their safety was tested to ensure they are suitable for use on human skin. Their tactile properties, along with those of ten commercial emollients, were assessed by 16 expert assessors on three attributes: circular spreading behavior, thickness of residual film, and slippery after-feel. In addition to characterizing a wide range of emollients, the results made it possible to establish three predictive models using Partial Least Squares regression. These original models correspond to various sensory attributes of the emollients, both during and after their application on the skin. All predictive models were then validated by leave-one-out cross-validation. Only three instrumental parameters (viscosity, friction, stickiness) were necessary to build the models and predict the tactile properties. To validate the models, this approach was then applied to the eight other biobased emollients that were not initially used to establish the predictions. Results demonstrate the significant value of such models for developing new ingredients. Ultimately, these predictive models could replace the time-consuming and costly process of safety testing and sensory analyses in the research and development of new emollients for dermocosmetic applications.
Title: Instrumental prediction of in vivo sensory properties of emollients to allow the development of new biobased ingredients. Authors: Floriane Rischard, Amandine Flourat, Ecaterina Gore, Géraldine Savary. Pub Date: 2025-10-30 DOI: 10.1016/j.chemolab.2025.105559. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105559.
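The abstract above validates its models by leave-one-out cross-validation. The sketch below illustrates that validation loop only. As a loudly stated simplification, it swaps the Partial Least Squares regression of the study for a one-variable ordinary least squares line, and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares line through (xs, ys); a stand-in for PLS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        slope = 0.0  # all x identical: fall back to predicting the mean
    else:
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def leave_one_out_rmse(xs, ys, fit=fit_line):
    """Refit the model n times, each time predicting the one held-out sample."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        squared_errors.append((predict(xs[i]) - ys[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Each sample is predicted by a model that never saw it, which is why the resulting error is a fairer estimate for small sample sets than the fit error on the full data.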
Pub Date: 2025-10-30 DOI: 10.1016/j.chemolab.2025.105558
Veena Potdar, Mohan Govindasa Kabadi
Anomaly detection is essential for identifying deviations from normal patterns in data, enabling the detection of security breaches or system faults, particularly in Internet of Things (IoT) networks. However, traditional machine learning (ML) and deep learning (DL) methods often struggle with the dynamic and complex nature of IoT environments, where attack patterns are non-linear, continuously evolving, and context-dependent. These models typically require large labeled datasets and retraining to adapt to new threats, which limits their responsiveness and scalability. Additionally, their high computational demands make real-time deployment on resource-constrained IoT devices challenging. Furthermore, many ML/DL models exhibit poor generalization, performing well in controlled scenarios but failing to maintain accuracy across diverse, real-world IoT settings with varying devices, protocols, and data distributions. To address these issues, this work proposes the Dwarf Mongoose-Chaos Optimized Deep Belief (DCODB) Framework, which combines advanced preprocessing, feature selection (FS), and classification techniques. Initial preprocessing involves Min-Max Normalization and One-Hot Encoding to scale numerical features and transform categorical data for effective model input. FS is optimized by the novel Dwarf Mongoose-Chaos Fusion Optimization (DMCFO), which is a swarm intelligence algorithm that leverages chaotic maps to improve the effectiveness of the Dwarf Mongoose Optimization Algorithm (DMO), reducing dimensionality and improving classification accuracy. The refined features are then classified using a Deep Belief Network (DBN), which processes hierarchical feature representations to differentiate between normal and anomalous behaviors in the NSL-KDD dataset. 
The proposed framework has been thoroughly assessed using diverse metrics, demonstrating its effectiveness in anomaly detection by achieving above 99 % Balanced Accuracy, along with exceptional Precision, Recall, F1 Score, Specificity, and the AUC-ROC curve. These high-performance metrics affirm the model's capability to deliver reliable and scalable anomaly detection in IoT environments, strengthening overall security.
Title: Enhancing IoT anomaly detection with the Dwarf Mongoose-Chaos optimized deep belief framework. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105558.
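The preprocessing step named in the abstract above, Min-Max Normalization for numerical features and One-Hot Encoding for categorical ones, can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up example values; it does not reproduce the DCODB pipeline.

```python
def min_max_scale(values):
    """Rescale numerical values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each categorical label to a binary indicator vector."""
    categories = sorted(set(labels))
    position = {category: i for i, category in enumerate(categories)}
    encoded = [
        [1 if position[label] == j else 0 for j in range(len(categories))]
        for label in labels
    ]
    return categories, encoded


# Made-up example values for one numerical and one categorical feature.
scaled = min_max_scale([0, 5, 10])
categories, encoded = one_hot_encode(["tcp", "udp", "tcp"])
```

In practice the minimum, maximum, and category list would be taken from the training data and reused unchanged on validation and test data.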
Pub Date: 2025-10-30 DOI: 10.1016/j.chemolab.2025.105556
Francisco Arteaga, José Camacho, Alberto Ferrer
Researchers interested in developing new multivariate statistical methods often need to be able to generate multivariate datasets with specific characteristics to test the effectiveness of their data analysis algorithms under specific conditions.
In this paper, we present a family of methods for generating multivariate centred datasets by simultaneously controlling features of the cross-product matrices $X^{\top}X$ and $XX^{\top}$. This provides an interesting trade-off to control for the variance structure in the data, important for the family of algorithms that operate on the data matrix, like, e.g., Principal Component Analysis, and control for the distances among objects, important for algorithms that operate on the distance matrix, like Multidimensional Scaling. The proposed methods form a general framework that can be understood as a jigsaw puzzle, joining pieces obtained from the spectral decomposition of a target covariance matrix and the singular value decomposition of a target data matrix. These methods have in common that they are derived from a two-sided orthogonal Procrustes problem.
Title: Solving the puzzle: Simulation of multivariate data with control over the structure of columns and rows using the two-sided orthogonal procrustes problem. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105556.
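The abstract above links the row cross-product matrix to the distances among objects. A short worked example may make that link concrete: with G = XXᵀ, the squared Euclidean distance between rows i and j is G[i][i] + G[j][j] - 2·G[i][j]. The sketch below checks this standard identity on a small made-up centred matrix; it does not implement the Procrustes-based generators of the paper, and the function names are hypothetical.

```python
def gram_matrix(X):
    """Row cross-product matrix G = X X^T (inner products between objects)."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in X] for row_i in X]


def squared_distances_from_gram(G):
    """Squared Euclidean distances: d_ij^2 = g_ii + g_jj - 2 g_ij."""
    n = len(G)
    return [[G[i][i] + G[j][j] - 2 * G[i][j] for j in range(n)] for i in range(n)]


# Made-up data with three objects and two variables; each column sums to zero.
X = [[1.0, 0.0],
     [-1.0, 2.0],
     [0.0, -2.0]]

G = gram_matrix(X)
D2 = squared_distances_from_gram(G)
```

Because the distances are a function of G alone, fixing features of XXᵀ fixes the input that distance-based methods such as Multidimensional Scaling operate on.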
Pub Date: 2025-10-30 DOI: 10.1016/j.chemolab.2025.105564
Peiliang Wu, Zhiwei Wang, Yuhan Zhao, Deming Kong
Oil spills hidden below the sea surface and in a suspended state are known as submerged oil. Determining the source of an oil spill and evaluating the amount of oil spilled can provide a basis for the effective development of oil spill emergency response strategies and policies. This paper therefore proposes an analytical method for oil spill species identification and concentration quantification that combines time-resolved fluorescence spectroscopy (TRFS) with improved parallel factor framework-clustering analysis (IPFFCA). The IPFFCA model first decomposes the oil TRFS data to extract the loading matrix and reconstructs the landscape map corresponding to each component from that matrix. The non-negative least squares algorithm is then used to fit the component landscape maps to the unfolded measured spectra, thereby estimating the score matrix of the samples. The score matrix then serves as input to the oil species identification model, built with a particle swarm optimization support vector machine (PSO-SVM), and to the concentration quantification model, built with extreme gradient boosting (XGBoost). To verify the effectiveness of the proposed analytical method, six typical submerged oil samples were experimentally prepared, and their TRFS data were collected and analyzed. The experimental results show that the proposed method achieves 92 % accuracy in the oil species identification task; for concentration prediction on the validation set of the six sample types, the average coefficient of determination reaches 0.95 and the root mean square error is 0.08, indicating strong predictive performance.
Title: Time-resolved fluorescence spectroscopy and improved parallel factor framework-clustering analysis for oil spill type identification and concentration quantification. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105564.
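The abstract above estimates sample scores by fitting component maps to unfolded spectra with non-negative least squares. The sketch below shows one simple way to solve such a problem, projected gradient descent, which is an assumption chosen for brevity and not necessarily the solver used in the paper. The columns of `A` play the role of unfolded component maps, `b` is one unfolded spectrum, and all names and numbers are hypothetical.

```python
def nnls_projected_gradient(A, b, iterations=500):
    """Minimise ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    n_rows, n_cols = len(A), len(A[0])
    # Gram matrix A^T A and right-hand side A^T b.
    gram = [[sum(A[r][i] * A[r][j] for r in range(n_rows)) for j in range(n_cols)]
            for i in range(n_cols)]
    rhs = [sum(A[r][i] * b[r] for r in range(n_rows)) for i in range(n_cols)]
    # The trace of A^T A bounds its largest eigenvalue, so 1/trace is a safe step.
    # (Assumes A is not identically zero.)
    step = 1.0 / sum(gram[i][i] for i in range(n_cols))
    x = [0.0] * n_cols
    for _ in range(iterations):
        grad = [sum(gram[i][j] * x[j] for j in range(n_cols)) - rhs[i]
                for i in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[i] - step * grad[i]) for i in range(n_cols)]
    return x


# Made-up example: two component profiles mixed with scores 2 and 3.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [2.0, 3.0, 5.0]
scores = nnls_projected_gradient(A, b)
```

The non-negativity constraint keeps the estimated scores physically interpretable as amounts of each component; the fixed iteration count is a simplification, and a production solver would test for convergence.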
Pub Date: 2025-10-29 DOI: 10.1016/j.chemolab.2025.105561
Chunru Xiong, Jufang Hu, Ken Cai, Fangxiu Meng, Qinyong Lin, Huazhou Chen
This study combines a deep learning algorithm with visible and near-infrared (Vis-NIR) spectroscopy to build a soil nutrient information extraction model. A deep learning framework based on Long Short-Term Memory (LSTM) is proposed to establish an optimal calibration model for the analysis of the full range of Vis-NIR spectral data. Moreover, an influence function is designed to select the informative wavelength variables, an important goal in engineering applications of spectroscopy because it reduces model dimensionality and enhances model robustness. Experiments were performed to predict the nitrogen (N), phosphorus (P), and potassium (K) contents of soil. The modeling results showed that the proposed model improved the modeling efficiency of soil nutrient information extraction and achieved higher accuracy in the modeling and predictive procedures than the conventional model. This provides an effective response to challenges in engineering applications, promotes the use of Vis-NIR spectroscopy for fast detection, and yields robust, high-precision models for soil nutrient information extraction.
Title: Extraction of soil nutrient information from visible and near-infrared signals using deep learning models. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105561.
Pub Date: 2025-10-29 DOI: 10.1016/j.chemolab.2025.105563
G. Jenifa, B.R. Tapas Bapu, Vivekanandan M., J. Senthil Murugan
Brain tumor (BT) detection and segmentation are of vital importance for accurate diagnosis, but remain difficult because of intricate brain anatomy, non-spherical tumor shapes, and the low contrast of MRI images. Conventional manual methods are time-consuming and invasive and suffer from observer variability, whereas traditional machine learning (ML) approaches based on handcrafted features tend to miss subtle patterns in tumor areas. Even deep learning (DL) models such as CNNs, despite their effectiveness, have limitations such as high computational cost, poor generalization to heterogeneous data, and difficulty in accurately delineating subtle tumor boundaries. This manuscript seeks to overcome these drawbacks by proposing an innovative technique for automatic BT detection in MRI samples. First, normalized gamma corrected contrast-limited adaptive histogram equalization (NG-CCLAHE) is introduced to enhance MRI image quality. Then, the Faster 2D-Otsu Thresholding technique is introduced to segment the tumor regions from the MRI samples. Next, the synchroextracting Transform (SET) technique is employed to extract features, which are then optimized with an Improved Ladybug Beetle Optimization Algorithm (ILBOA). The optimized features are fed into the Progressive Wasserstein generative adversarial network (PWGAN), allowing for more accurate and effective tumor detection. Experimental assessment using the Br35H Brain Tumor Detection 2020 dataset reflects high-level performance with 98.6 % accuracy, 92 % DSC, 95 % PDR, 23 % classification error, 37.8 s computation time, and an F1-score of 98.59 %. These results demonstrate the efficiency and competency of the proposed approach in identifying brain tumor patterns from MRI images.
Title: Decoding brain tumor patterns in MRI images: Unleashing optimized insights with Progressive Wasserstein generative adversarial network. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 268, Article 105563.
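The abstract above reports a DSC score for segmentation quality. Assuming DSC denotes the Dice similarity coefficient, as is usual in segmentation work, it is computed as 2·|P ∩ R| / (|P| + |R|) for a predicted mask P and a reference mask R. The sketch below uses tiny made-up binary masks and a hypothetical function name.

```python
def dice_coefficient(predicted_mask, reference_mask):
    """Dice similarity coefficient between two binary masks of equal shape."""
    overlap = 0
    predicted_total = 0
    reference_total = 0
    for predicted_row, reference_row in zip(predicted_mask, reference_mask):
        for p, r in zip(predicted_row, reference_row):
            overlap += 1 if (p and r) else 0
            predicted_total += 1 if p else 0
            reference_total += 1 if r else 0
    if predicted_total + reference_total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / (predicted_total + reference_total)


# Made-up 2 x 3 masks: 1 marks a tumor pixel, 0 marks background.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
dsc = dice_coefficient(predicted, reference)
```

Here two of the three predicted tumor pixels coincide with the reference, giving 2·2 / (3 + 3) = 2/3.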
Pub Date: 2025-10-27 DOI: 10.1016/j.chemolab.2025.105555
Junhua Zheng, Zeyu Yang, Zhiqiang Ge
While deep learning has made great progress in various application domains, its computational expense and reliance on large-scale data make it inefficient or even impossible to use for small-data modeling, particularly under the just-in-time learning framework. An effective combination of deep learning and just-in-time learning may unlock great potential for both learning paradigms, and should thus be attractive and beneficial to the research community. In this paper, an improved form of the lightweight deep partial least squares (PLS) model is developed under the framework of just-in-time learning. Without complicated backpropagation and time-consuming parameter tuning algorithms, deep PLS provides a transparent model structure that also works well for small training data. As a result, the fusion of these two learning strategies makes the proposed method a very promising predictive modeling tool for industrial soft sensor applications; its performance is evaluated and confirmed through a real industrial example.
Title: When just-in-time learning meets deep learning: An industrial quality prediction practice on deep partial least squares model. Journal: Chemometrics and Intelligent Laboratory Systems, Volume 267, Article 105555.
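The just-in-time learning framework mentioned in the abstract above builds a small local model on demand: for each query, the most similar historical samples are selected and a model is fitted to them alone. The sketch below illustrates that idea under strong simplifying assumptions, using a single input variable, absolute distance as the similarity measure, and a local least squares line in place of the deep PLS model of the paper. All names and data are hypothetical.

```python
def jit_predict(xs, ys, query, k=3):
    """Predict y at `query` from a line fitted to its k nearest historical samples."""
    # Select the k historical samples closest to the query.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]
    local_x = [xs[i] for i in nearest]
    local_y = [ys[i] for i in nearest]
    n = len(nearest)
    mean_x = sum(local_x) / n
    mean_y = sum(local_y) / n
    sxx = sum((x - mean_x) ** 2 for x in local_x)
    if sxx == 0:
        return mean_y  # neighbours share one x value: use their mean response
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(local_x, local_y)) / sxx
    # The local model is discarded after answering this single query.
    return mean_y + slope * (query - mean_x)


# Made-up historical process data following y = 2x + 1.
history_x = [0.0, 1.0, 2.0, 3.0, 4.0]
history_y = [1.0, 3.0, 5.0, 7.0, 9.0]
prediction = jit_predict(history_x, history_y, query=2.5)
```

Because every query trains on only a handful of neighbours, the local model has to be cheap to fit and reliable on small data, which is the requirement the abstract gives for pairing just-in-time learning with a lightweight model.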
Pub Date: 2025-10-22 DOI: 10.1016/j.chemolab.2025.105553
Yellam Naidu Kottavalasa, Lauro Snidaro
The chemical industry is the backbone of global manufacturing, driving innovation across multiple sectors. Because chemical processes are complex and dynamic in nature, it remains difficult to maintain efficiency and product consistency and to optimize process parameters. Traditional approaches often fall short in handling these complexities, prompting manufacturers to adopt data-driven methodologies, including statistical models, machine learning techniques, and deep learning architectures. This survey discusses how these models help in fault detection, process optimization, and quality control. We examine the role of statistical models in capturing process variation, machine learning models in detecting patterns and anomalies, and neural networks in predictive maintenance and real-time monitoring. Additionally, we explore fusion-based architectures, including hybrid statistical, machine learning, and deep learning methods, that facilitate better fault detection and parameter estimation. The survey also highlights how data-driven approaches support sustainable chemical manufacturing by enabling real-time decisions, adaptive control, and effective process monitoring.
{"title":"Advancing chemical manufacturing processes through data-driven approaches: A survey","authors":"Yellam Naidu Kottavalasa, Lauro Snidaro","doi":"10.1016/j.chemolab.2025.105553","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"267","pages":"Article 105553"},"publicationDate":"2025-10-22"}
Pub Date: 2025-10-22 DOI: 10.1016/j.chemolab.2025.105554
Yong Pan, Chuandong Li, Jiang Xiong, Ziye Hou, Youbin Yao
With advancements in modern science and technology, electronic noses (ENs) have gained significant attention for their applications in environmental monitoring, food quality inspection, and medical equipment. ENs mimic biological olfactory systems to classify gases using arrays of sensors and pattern recognition models. However, gas sensor drift poses a major challenge, degrading the performance of EN systems. To address this, Domain Adaptation (DA) methods align source domain data with drifted target domain data. Traditional DA methods assume identical class compositions in both domains, which is often unrealistic in practice and leads to suboptimal results. Open Set Domain Adaptation (OSDA) methods handle unknown classes in the target domain, but they often focus too heavily on distinguishing unknown classes and neglect accurate recognition of known classes. To overcome these limitations, we propose Adversarial Domain Adaptation Guided by Farthest Distance (ADA-FDG), comprising two complementary modules: the Farthest Distance Guide (FDG) and the Confidence Normalized Adaptive Factor (CNAF). FDG adaptively builds a guide set that lies farthest from the source distribution in feature space, so that adversarial alignment also learns the distribution of the edge region. CNAF assigns each batch a weight proportional to its classification confidence, preventing unknown-class samples from contaminating the ADA process. By integrating FDG and CNAF in an adversarial training framework, ADA-FDG achieves more precise alignment of source and target distributions while preserving clear separation between known and unknown classes. Extensive experiments on two benchmark datasets demonstrate that ADA-FDG consistently outperforms state-of-the-art closed-set and open-set DA methods, delivering significant improvements in overall, known-class, and unknown-class accuracy.
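The two modules named above can be pictured with a rough sketch. The abstract gives no formulas, so everything concrete here is an assumption: batch confidence is taken as the mean maximum softmax probability, weights are normalized to sum to one across batches, and "farthest from the source distribution" is approximated by Euclidean distance to the source centroid. The function names are hypothetical and this is not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sample's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def batch_confidence(batch_logits):
    """Assumed confidence measure: mean of the top class probability in a batch."""
    return sum(max(softmax(l)) for l in batch_logits) / len(batch_logits)

def confidence_weights(batches):
    """CNAF-style weights (assumption): proportional to confidence, summing to 1."""
    conf = [batch_confidence(b) for b in batches]
    total = sum(conf)
    return [c / total for c in conf]

def farthest_guide_set(source_feats, target_feats, m):
    """FDG-style guide set (assumption): the m target features farthest
    from the centroid of the source features."""
    dim = len(source_feats[0])
    centroid = [sum(f[j] for f in source_feats) / len(source_feats)
                for j in range(dim)]
    ranked = sorted(target_feats,
                    key=lambda f: math.dist(f, centroid),
                    reverse=True)
    return ranked[:m]
```

Under these assumptions, a batch whose predictions are nearly uniform over the known classes (as unknown-class samples tend to be) receives a smaller weight than a confidently classified batch, which matches the stated goal of keeping unknown-class samples from dominating the alignment.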
{"title":"Adversarial Domain Adaptation Guided by Farthest Distance for open set electronic nose drift compensation","authors":"Yong Pan, Chuandong Li, Jiang Xiong, Ziye Hou, Youbin Yao","doi":"10.1016/j.chemolab.2025.105554","journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":"267","pages":"Article 105554"},"publicationDate":"2025-10-22"}