Li Bin, Yang Jin-li, Sun Zhao-xiang, Yang Shi-min, Ouyang Aiguo, Liu Yan-de
Watermelon seed cultivation is often affected by problems such as empty shells and internal defects, resulting in significant losses. To obtain high-quality seeds, terahertz (THz) imaging combined with image smoothing and enhancement algorithms was proposed to reduce the noise and indistinct features introduced during imaging and to achieve non-destructive, efficient, and accurate detection of the internal quality of watermelon seeds. First, a terahertz imaging system with a spatial resolution of 0.4 mm was used to acquire images of watermelon seeds with varying levels of fullness. Next, six denoising techniques (Gaussian filtering, median filtering, bilateral filtering, discrete wavelet transformation denoising, wavelet denoising, and principal component analysis denoising) were each applied to the terahertz spectral images of the seeds in the frequency range of 1–1.5 THz. Image enhancement, using segmented linear gray-level transformation and fractional-order differentiation, was then performed on the denoised terahertz images. The optimal image processing approach was determined from a defect assessment based on threshold segmentation. Finally, validation was conducted at a spatial resolution of 0.2 mm. When the images acquired at 0.4 mm were processed with wavelet denoising and window slicing in segmented linear gray-level transformation (WS-SLT) enhancement, defect accuracy improved relative to untreated THz images as follows. A 7.74% increase in accuracy was observed for empty seeds, along with a 6.29% increase in the defect ratio for defective seeds 1. The defect ratio for intact seeds was 0, and there was no significant difference in defect ratio accuracy for defective seeds 2. At a spatial resolution of 0.2 mm, the average defect ratio error of THz imaging processed with wavelet denoising and WS-SLT was approximately 5.04%. In conclusion, terahertz imaging coupled with wavelet denoising and WS-SLT can enhance the accuracy of internal defect detection in watermelon seeds, and it provides a technical foundation and reference for assessing watermelon seed fullness.
{"title":"Detection the internal quality of watermelon seeds based on terahertz imaging technology combined with image smoothing and enhancement algorithm","authors":"Li Bin, Yang Jin-li, Sun Zhao-xiang, Yang Shi-min, Ouyang Aiguo, Liu Yan-de","doi":"10.1002/cem.3557","journal":{"name":"Journal of Chemometrics","volume":"38 9"},"publicationDate":"2024-05-28","publicationTypes":"Journal Article"}
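The watermelon-seed pipeline above (denoise, enhance, threshold, compute a defect ratio) can be sketched in a few lines. This is only an illustration under stated assumptions: a 3x3 median filter stands in for the wavelet denoising the abstract found best, the gray-level stretch is a simple clipped window rather than the authors' WS-SLT, and the function names, window limits, and threshold are hypothetical.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter with edge replication (stand-in for wavelet denoising)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0)

def piecewise_linear(img, low, high):
    """Stretch gray levels in [low, high] to [0, 1] and clip everything outside."""
    return np.clip((img - low) / (high - low), 0.0, 1.0)

def defect_ratio(img, seed_mask, threshold):
    """Fraction of seed pixels whose intensity falls below the threshold."""
    defect = (img < threshold) & seed_mask
    return defect.sum() / seed_mask.sum()
```

A typical call chain would be `defect_ratio(piecewise_linear(median_filter3(img), low, high), mask, threshold)`, where `mask` marks the pixels that belong to the seed.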
Chi Yao, Cheng-tao Su, Ji-ping Zou, Shang-tao Ou-yang, Jian Wu, Nan Chen, Yan de Liu, Bin Li
To reduce the number of bruised mangoes at source, it is important to determine how long mangoes have been stored after a mild bruise. To address this issue, a method combining hyperspectral imaging with a deep learning model was proposed. First, the average spectrum of the bruised area of each sample was extracted as the spectral features, and six eigenvalues of the most representative PC1 image were then calculated from the gray level co-occurrence matrix as texture features. To find the optimal discriminative model, random forest (RF), partial least squares discriminant analysis (PLS-DA), extreme gradient boosting (XGBoost), and convolutional neural network (CNN) models were built on spectral features, on texture features, and on spectral features combined with texture features (Feature Fusion 1). The best discriminating model was the CNN under Feature Fusion 1, with an overall accuracy of 90.22%. To reduce the redundant information and noise introduced by the full spectrum, the uninformative variable elimination (UVE) and competitive adaptive reweighted sampling (CARS) algorithms were used to filter the spectral features. The screened spectral features were fused with the texture features (Feature Fusion 2) and modeled again with RF, PLS-DA, XGBoost, and CNN. The optimal model for discriminating different storage times of mangoes after bruising was the CNN model based on Feature Fusion 2 (CARS), with an overall accuracy of 93.48%. In summary, this study shows that combining spectral features with texture features effectively improves the discrimination of different storage times of mangoes after a mild bruise. Compared with the other machine learning models, the CNN model in this paper achieves better results. The study provides a theoretical basis for using hyperspectral imaging combined with deep learning to discriminate different storage times of mangoes after a mild bruise.
{"title":"Detection storage time of mangoes after mild bruise based on hyperspectral imaging combined with deep learning","authors":"Chi Yao, Cheng-tao Su, Ji-ping Zou, Shang-tao Ou-yang, Jian Wu, Nan Chen, Yan de Liu, Bin Li","doi":"10.1002/cem.3559","journal":{"name":"Journal of Chemometrics","volume":"38 9"},"publicationDate":"2024-05-21","publicationTypes":"Journal Article"}
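The texture features in the mango study come from a gray level co-occurrence matrix (GLCM). The sketch below shows how such a matrix and a few derived features can be computed. The abstract does not say which six features were used, so contrast, energy, and homogeneity are illustrative choices, and the function names and the horizontal pixel offset are assumptions.

```python
import numpy as np

def glcm(img, levels, dr=0, dc=1):
    """Normalized symmetric co-occurrence matrix for pixel offset (dr, dc).

    img must hold integer gray levels in the range 0..levels-1.
    """
    h, w = img.shape
    m = np.zeros((levels, levels))
    for r in range(max(0, -dr), min(h, h - dr)):
        for c in range(max(0, -dc), min(w, w - dc)):
            i, j = img[r, c], img[r + dr, c + dc]
            m[i, j] += 1
            m[j, i] += 1  # count each pair in both directions (symmetric GLCM)
    return m / m.sum()

def texture_features(p):
    """Three common GLCM statistics (illustrative subset)."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
    }
```

Feature fusion in the sense of the abstract would then amount to concatenating such texture values with the spectral feature vector before modeling.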
Extended similarity indices (i.e., generalization of pairwise similarity) have recently gained importance because of their simplicity, fast computation, and superiority in tasks like diversity picking. However, they operate with several meta parameters that should be optimized. Earlier, we extended the binary similarity indices to “discrete non-binary” and “continuous” data; now we continue with introducing and comparing multiple weighting functions. As a case study, the similarity of CYP enzyme inhibitors (4016 molecules after curation) was characterized by their extended similarities, based on 2D descriptors, MACCS and Morgan fingerprints. A statistical workflow based on sum of ranking differences (SRD) and analysis of variance (ANOVA) was used for finding the optimal weight function(s). Overall, the best weighting function is the fraction (“frac”), which corresponds to the principle of parsimony. Optimal extended similarity indices were also found, and their differences are revealed across different data sets. We intend this work to be a guideline for users of extended similarity indices regarding the various weighting options available. Source code for the calculations is available at https://github.com/mqcomplab/MultipleComparisons.
{"title":"Alternative weighting schemes for fine-tuned extended similarity indices","authors":"Kenneth López Pérez, Anita Rácz, Dávid Bajusz, Camila Gonzalez, Károly Héberger, Ramón Alain Miranda-Quintana","doi":"10.1002/cem.3558","journal":{"name":"Journal of Chemometrics","volume":"38 9"},"publicationDate":"2024-05-11","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3558"}
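The sum of ranking differences (SRD) mentioned in the abstract above compares the ranking produced by each method with a reference ranking. The following is a minimal sketch of the core calculation only, assuming no tied values and using the row average as the default reference. The validation and ANOVA steps of the authors' workflow are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n for a vector without ties (smallest value gets rank 1)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=int)
    r[order] = np.arange(1, len(v) + 1)
    return r

def srd(data, reference=None):
    """SRD of each column: sum of absolute rank differences against the reference."""
    data = np.asarray(data, dtype=float)
    if reference is None:
        reference = data.mean(axis=1)  # row average as a consensus reference
    ref_rank = ranks(reference)
    return np.array([np.abs(ranks(col) - ref_rank).sum() for col in data.T])
```

Smaller SRD values indicate columns (for example, weighting functions) whose ranking is closer to the reference.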
The aim of this paper is twofold. First, it serves as a comprehensive tutorial on the Data-Driven Soft Independent Modelling of Class Analogy (DD-SIMCA) method for one-class classification. It covers all practical aspects of developing, validating, and applying DD-SIMCA models, using a set of simple examples. Second, it introduces a web application that implements the main DD-SIMCA functionality. The application is freely available to everyone and requires neither registration nor installation. All calculations run locally in the browser without sending any information to a server, which removes obstacles to the dissemination of data and models.
{"title":"A comprehensive tutorial on Data-Driven SIMCA: Theory and implementation in web","authors":"Sergey Kucheryavskiy, Oxana Rodionova, Alexey Pomerantsev","doi":"10.1002/cem.3556","journal":{"name":"Journal of Chemometrics","volume":"38 7"},"publicationDate":"2024-05-10","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3556"}
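SIMCA-type one-class models are usually described in terms of two distances from a PCA model of the target class: a score distance within the model plane and an orthogonal distance to it. The sketch below computes both with NumPy. It is an assumption-laden illustration, not the tutorial's or the web application's code: the scaling of the score distance and the function names are choices made here, and the acceptance rule that combines the two distances is not reproduced.

```python
import numpy as np

def fit_pca(X, n_components):
    """Mean, loadings, and score variances of a PCA model fitted by SVD."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    P = Vt[:n_components].T                          # loadings (variables x components)
    variances = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, P, variances

def distances(X, mean, P, variances):
    """Score distance h and orthogonal distance q for each row of X."""
    Xc = X - mean
    T = Xc @ P                                       # scores
    h = np.sum(T ** 2 / variances, axis=1)           # distance inside the model plane
    E = Xc - T @ P.T                                 # residuals
    q = np.sum(E ** 2, axis=1)                       # squared distance to the model plane
    return h, q
```

Samples with large `h` or `q` relative to the training set are candidates for rejection from the target class.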
Just-in-time learning-based partial least squares (JIT-PLS) has been extensively applied to adaptive soft sensor modeling of complex nonlinear processes. However, it still suffers from poor selection of relevant samples and unsatisfactory local modeling. To address these problems, this paper proposes an improved just-in-time learning-based random mapping partial least squares (IJIT-RMPLS) method, which comprises an improved relevant sample selection strategy and a random mapping PLS (RMPLS) model. First, because input variables differ in how strongly they correlate with the output variable, the method uses mutual information to evaluate the importance of each input variable and designs a variable-weighted Euclidean distance to select relevant samples for local modeling. Second, to improve the prediction precision of the local soft sensor models, the method combines the idea of nonlinear random mapping from extreme learning machines with PLS and builds an RMPLS model with multiple activation functions. Applications to a numerical example and a real chemical process show that the proposed IJIT-RMPLS yields a smaller prediction error than traditional JIT-PLS.
{"title":"Adaptive soft sensor modeling of chemical processes based on an improved just-in-time learning and random mapping partial least squares","authors":"Ke Zhang, Xiangrui Zhang","doi":"10.1002/cem.3554","journal":{"name":"Journal of Chemometrics","volume":"38 9"},"publicationDate":"2024-05-01","publicationTypes":"Journal Article"}
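The relevant-sample selection step described above can be illustrated with a variable-weighted Euclidean distance. In this sketch the weights are simply passed in. In the paper they would come from the mutual information between each input variable and the output, which is not computed here. The function name and the normalization of the weights are assumptions.

```python
import numpy as np

def select_relevant(X_hist, x_query, weights, k):
    """Indices of the k historical samples closest to x_query under a weighted metric."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize so the weights sum to 1
    d = np.sqrt(((X_hist - x_query) ** 2 * w).sum(axis=1))
    return np.argsort(d)[:k]                          # nearest samples first
```

A local model would then be fitted on `X_hist[idx]` and the matching outputs for each new query sample, which is the just-in-time part of the scheme.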
Kai Liu, Xiaoqiang Zhao, Yongyong Hui, Hongmei Jiang
Fault prediction ensures safe and stable production and cuts maintenance costs. Because changing operating conditions alter the characteristics of industrial processes, the fault state of batch processes must be monitored in real time and fault trends must be predicted accurately. An adaptive slow feature analysis-neighborhood preserving embedding-improved stochastic configuration network (SFA-NPE-ISCN) algorithm for batch process fault prediction is proposed. First, SFA is used to extract the time-varying features of the process data and to establish the update index of the NPE model. Then, local nearest-neighbor features are extracted and reconstructed by the NPE model with adaptive update capability, and squared prediction error (SPE) statistics are constructed from the reconstruction error as fault state features. Next, the hunter-prey optimization (HPO) algorithm optimizes the weights and biases of the stochastic configuration network (SCN), and singular value decomposition (SVD) and QR decomposition with column pivoting are introduced to solve the ill-posed problem of the SCN, yielding the ISCN prediction model. Finally, the SPE statistics are arranged into a time series, and the ISCN model is used to predict the process state trend. The effectiveness of the proposed algorithm is verified by case studies of an industrial-scale penicillin fermentation process and a hot strip mill process.
{"title":"An adaptive strategy for time-varying batch process fault prediction based on stochastic configuration network","authors":"Kai Liu, Xiaoqiang Zhao, Yongyong Hui, Hongmei Jiang","doi":"10.1002/cem.3555","journal":{"name":"Journal of Chemometrics","volume":"38 9"},"publicationDate":"2024-04-28","publicationTypes":"Journal Article"}
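Two ingredients of the fault-prediction scheme above are easy to illustrate in isolation: the SPE statistic computed from a reconstruction, and an SVD-based least-squares solve that stays stable when the hidden-layer output matrix is ill-conditioned. The sketch below assumes the reconstruction `X_hat` is already available (from NPE in the paper, from any model here) and uses `numpy.linalg.lstsq` as a generic stand-in for the authors' SVD and QR treatment. The function names and the cutoff value are hypothetical.

```python
import numpy as np

def spe_series(X, X_hat):
    """Squared prediction error per time step: SPE_t = ||x_t - x_hat_t||^2."""
    return np.sum((np.asarray(X) - np.asarray(X_hat)) ** 2, axis=1)

def output_weights(H, y, rcond=1e-10):
    """Minimum-norm least-squares weights for H @ beta ~ y.

    Singular values below rcond times the largest one are treated as zero,
    which keeps the solution bounded when H is rank deficient.
    """
    beta, _, rank, _ = np.linalg.lstsq(H, y, rcond=rcond)
    return beta, int(rank)
```

The resulting SPE values, ordered in time, form the series that a predictor would be trained to extrapolate.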
Fan Zhang, Chaoyang Liu, Binjie Wang, Yiru He, Xinhong Zhang
Most current methods for predicting nonclassical secreted proteins involve manual feature selection, such as constructing sample features from the physicochemical properties of proteins and the position-specific scoring matrix (PSSM). However, these tasks require researchers to carry out tedious searches to obtain the physicochemical properties of proteins. This paper proposes an end-to-end nonclassical secreted protein prediction model based on deep learning, named DeepNCSPP, which takes protein sequence information and sequence statistics as input and predicts whether a protein is a nonclassical secreted protein. The sequence information and the sequence statistics are extracted using bidirectional long short-term memory and convolutional neural networks, respectively. On the independent test dataset, DeepNCSPP achieved an accuracy of 88.24%, a Matthews correlation coefficient (MCC) of 77.01%, and an F1-score of 87.50%. Independent testing and 10-fold cross-validation show that DeepNCSPP achieves performance competitive with state-of-the-art methods and can be used as a reliable nonclassical secreted protein prediction model. A web server has been constructed for the convenience of researchers at https://www.deepncspp.top/. The source code of DeepNCSPP is hosted on GitHub and is available online (https://github.com/xiaoliu166370/DEEPNCSPP).
{"title":"A prediction model of nonclassical secreted protein based on deep learning","authors":"Fan Zhang, Chaoyang Liu, Binjie Wang, Yiru He, Xinhong Zhang","doi":"10.1002/cem.3553","journal":{"name":"Journal of Chemometrics","volume":"38 8"},"publicationDate":"2024-04-25","publicationTypes":"Journal Article"}
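The three metrics reported for DeepNCSPP (accuracy, F1-score, and MCC) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard formulas. The function name is hypothetical, and the counts in the usage note are made up for illustration and are not the paper's results.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F1-score, and Matthews correlation coefficient from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f1, mcc
```

With the made-up counts `tp=40, fp=10, tn=40, fn=10`, the function returns an accuracy of 0.8, an F1-score of 0.8, and an MCC of 0.6. MCC is lower than accuracy here because it is a correlation on a scale from -1 to 1, where 0 corresponds to chance-level prediction.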
This paper describes some statistical tests for comparing the predictive performance of two or more prediction rules. It covers the cases of both quantitative and qualitative predictions, that is, both regression and classification problems. Worked examples are included for both cases.
{"title":"Testing differences in predictive ability: A tutorial","authors":"Tom Fearn","doi":"10.1002/cem.3549","journal":{"name":"Journal of Chemometrics","volume":"38 8"},"publicationDate":"2024-04-25","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3549"}
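The abstract above does not name the tests the tutorial covers. As one common illustration for the classification case, the sketch below implements an exact McNemar test, which compares two classifiers on the same test samples using only the samples on which they disagree. Choosing this test and the function name are assumptions made here, not a summary of the tutorial.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: samples classified correctly by rule A only.
    c: samples classified correctly by rule B only.
    Under the null hypothesis, each disagreement favors A or B with probability 1/2.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # one-sided binomial tail
    return min(1.0, 2 * tail)
```

For regression problems, an analogous paired comparison would be made on the per-sample errors of the two rules rather than on correct and incorrect counts.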
Linearly dependent concentration profiles of a chemical reaction can result in a spectral data matrix with a chemical rank smaller than the number of absorbing chemical species. Such a rank deficiency is problematic for a factor analysis as some information on the pure component spectra cannot be recovered from the mixture data. Matrix augmentation can break rank deficiencies and enable successful pure component recovery. In contrast to this, an artificial breakdown of a rank deficiency can be caused by a numerical finite precision simulation of the underlying kinetic model and can fake a successful MCR analysis. This work discusses the problem and points out some remedies.
{"title":"A note on rank deficiency and numerical modeling","authors":"Klaus Neymeyr, Mathias Sawall, Tomass Andersons","doi":"10.1002/cem.3550","journal":{"name":"Journal of Chemometrics","volume":"38 8"},"publicationDate":"2024-04-23","publicationTypes":"Journal Article","openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3550"}
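The effect described above can be reproduced with a toy example. The sketch assumes a hypothetical reaction A -> B + C in which B and C form in lockstep, so three absorbing species give a data matrix of rank 2. A tiny perturbation of one profile, mimicking the finite-precision error of a numerical kinetic simulation, then restores full rank. The reaction, the random spectra, and the perturbation size are all made up for illustration.

```python
import numpy as np

def data_rank(perturbation=0.0):
    """Numerical rank of D = C @ S for the hypothetical reaction A -> B + C."""
    t = np.linspace(0.0, 5.0, 50)
    a = np.exp(-t)                                  # reactant A, unit rate constant
    b = 1.0 - a                                     # product B
    c = 1.0 - a + perturbation * np.sin(t)          # product C, optionally with a small error
    C = np.column_stack([a, b, c])                  # concentration profiles (50 x 3)
    S = np.random.default_rng(0).random((3, 40))    # made-up pure spectra (3 x 40)
    return int(np.linalg.matrix_rank(C @ S))
```

Calling `data_rank()` should report rank 2, while `data_rank(1e-8)` should report rank 3, even though the perturbation is far below any realistic measurement noise. A factorization of the perturbed matrix could therefore appear to recover three components that real data would not support.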
Liwei Feng, Shaofeng Guo, Yifei Wu, Yu Xing, Yuan Li
Multi-stage processes with dynamic and nonlinear behavior are hard to monitor effectively. To address this problem, the time-space neighborhood standardization (TSNS) method is proposed and then combined with partial least squares (PLS) to give the TSNS-PLS method for process fault detection. TSNS can transform multi-stage data into single-stage data that approximately follow a standard normal distribution, remove the temporal correlation between consecutive samples in the process data, and separate online fault samples. TSNS makes the transformed process data satisfy the requirements that the PLS method places on process data and can significantly improve the fault detection rate of PLS. Finally, the performance of TSNS-PLS was examined in fault detection experiments on a numerical simulation and on the penicillin fermentation process.
{"title":"Application of time-space neighborhood standardization technology to complex multi-stage process fault detection","authors":"Liwei Feng, Shaofeng Guo, Yifei Wu, Yu Xing, Yuan Li","doi":"10.1002/cem.3546","journal":{"name":"Journal of Chemometrics","volume":"38 8"},"publicationDate":"2024-04-21","publicationTypes":"Journal Article"}
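The idea of standardizing each sample against its own neighborhood, rather than against global statistics, can be sketched as follows. This illustration covers only a spatial neighborhood taken from the training data. The temporal part of TSNS and the exact definitions used by the authors are not reproduced, and the function name and the choice of `k` are assumptions.

```python
import numpy as np

def neighborhood_standardize(x, X_train, k):
    """Standardize x with the mean and standard deviation of its k nearest training samples."""
    d = np.linalg.norm(X_train - x, axis=1)         # distances to all training samples
    nbrs = X_train[np.argsort(d)[:k]]               # k nearest neighbors
    return (x - nbrs.mean(axis=0)) / nbrs.std(axis=0, ddof=1)
```

Because each sample is centered and scaled by the stage it falls into, normal samples from different operating stages end up on a common scale, which is what allows a single-stage method such as PLS to be applied afterwards.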