
Journal of Chemometrics: Latest Articles

Detection the internal quality of watermelon seeds based on terahertz imaging technology combined with image smoothing and enhancement algorithm
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-05-28 | DOI: 10.1002/cem.3557
Li Bin, Yang Jin-li, Sun Zhao-xiang, Yang Shi-min, Ouyang Aiguo, Liu Yan-de

Watermelon seed cultivation is often affected by problems such as empty shells and defects, resulting in significant losses. To obtain high-quality seeds, terahertz (THz) imaging combined with image smoothing and enhancement algorithms was proposed to reduce the noise and indistinct features introduced during imaging and to achieve non-destructive, efficient, and accurate detection of the internal quality of watermelon seeds. First, a terahertz imaging system with a spatial resolution of 0.4 mm was used to acquire images of watermelon seeds with varying levels of fullness. Next, six denoising techniques (Gaussian filtering, median filtering, bilateral filtering, discrete wavelet transformation denoising, wavelet denoising, and principal component analysis denoising) were each applied to the terahertz spectral images of watermelon seeds in the 1–1.5 THz frequency range. Image enhancement operations, namely segmented linear gray-level transformation and fractional-order differentiation, were then performed on the denoised terahertz images. The optimal image processing approach was determined from defect assessment by threshold segmentation. Finally, validation was conducted at a spatial resolution of 0.2 mm. When the images acquired at 0.4 mm spatial resolution were processed with wavelet denoising and window slicing in segmented linear gray-level transformation (WS-SLT) enhancement, defect accuracy improved relative to untreated THz images as follows: accuracy increased by 7.74% for empty seeds, and the defect ratio increased by 6.29% for defective seeds 1; the defect ratio for intact seeds was 0, and there was no significant difference in defect ratio accuracy for defective seeds 2. At a spatial resolution of 0.2 mm, the average defect ratio error of THz imaging processed by wavelet denoising and WS-SLT was approximately 5.04%. In conclusion, terahertz imaging coupled with wavelet denoising and WS-SLT can enhance the accuracy of internal defect detection in watermelon seeds, providing a technical foundation and reference for assessing watermelon seed fullness.
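
The enhancement and thresholding steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window limits, the threshold, and the definition of defect ratio as the fraction of seed pixels falling below the threshold are all assumptions made for illustration, and the toy image is hypothetical.

```python
def window_slice(image, low, high, out_max=255):
    """Piecewise-linear gray-level transform (window slicing).

    Values at or below `low` map to 0, values at or above `high` map to
    `out_max`, and values in between are stretched linearly.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    scale = out_max / (high - low)
    return [[min(out_max, max(0.0, (p - low) * scale)) for p in row]
            for row in image]


def defect_ratio(image, seed_mask, threshold):
    """Assumed definition: share of seed pixels darker than `threshold`."""
    seed_pixels = [p for row, mask_row in zip(image, seed_mask)
                   for p, inside in zip(row, mask_row) if inside]
    if not seed_pixels:
        raise ValueError("seed mask selects no pixels")
    return sum(1 for p in seed_pixels if p < threshold) / len(seed_pixels)


# Hypothetical 2x3 gray-level image; every pixel belongs to the seed.
image = [[10, 50, 90], [120, 160, 200]]
seed_mask = [[True, True, True], [True, True, True]]

enhanced = window_slice(image, low=50, high=150)
ratio = defect_ratio(enhanced, seed_mask, threshold=128)
print(ratio)  # 0.5: three of the six stretched pixels fall below 128
```

Stretching only the gray-level window of interest before thresholding makes the segmentation less sensitive to low-contrast regions, which is the general motivation for pairing the two steps.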

Citations: 0
Detection storage time of mangoes after mild bruise based on hyperspectral imaging combined with deep learning
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-05-21 | DOI: 10.1002/cem.3559
Chi Yao, Cheng-tao Su, Ji-ping Zou, Shang-tao Ou-yang, Jian Wu, Nan Chen, Yan de Liu, Bin Li

To reduce the number of bruised mangoes at source, it is important to determine the storage time of mangoes after mild bruising. To address this issue, a model combining hyperspectral imaging with deep learning was proposed. First, the average spectrum of the bruised area of each sample was extracted as spectral features; then, six eigenvalues of the most representative PC1 image were calculated from the gray-level co-occurrence matrix as texture features. To find the optimal discriminative model, random forest (RF), partial least squares discriminant analysis (PLS-DA), extreme gradient boosting (XGBoost), and convolutional neural network (CNN) models were built on spectral features, texture features, and spectral features combined with texture features (Feature Fusion 1), respectively. The best discriminating model was the CNN under Feature Fusion 1, with an overall accuracy of 90.22%. To reduce the redundant information and noise introduced by the full spectrum, uninformative variable elimination (UVE) and competitive adaptive reweighted sampling (CARS) algorithms were used to filter the spectral features. The screened spectral features were fused with texture features (Feature Fusion 2) and modeled again with RF, PLS-DA, XGBoost, and CNN. The optimal model for discriminating different storage times of mangoes after bruising was the CNN model based on Feature Fusion 2 (CARS), with an overall accuracy of 93.48%. In summary, this study shows that combining spectral features with texture features effectively improves the model's discrimination of storage time for mangoes after mild bruising. Compared with the other machine learning models, the CNN model in this paper achieves better results. The work provides a theoretical basis for using hyperspectral imaging combined with deep learning to discriminate the storage time of mangoes after mild bruising.

Citations: 0
Alternative weighting schemes for fine-tuned extended similarity indices
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-05-11 | DOI: 10.1002/cem.3558
Kenneth López Pérez, Anita Rácz, Dávid Bajusz, Camila Gonzalez, Károly Héberger, Ramón Alain Miranda-Quintana

Extended similarity indices (i.e., generalization of pairwise similarity) have recently gained importance because of their simplicity, fast computation, and superiority in tasks like diversity picking. However, they operate with several meta parameters that should be optimized. Earlier, we extended the binary similarity indices to “discrete non-binary” and “continuous” data; now we continue with introducing and comparing multiple weighting functions. As a case study, the similarity of CYP enzyme inhibitors (4016 molecules after curation) was characterized by their extended similarities, based on 2D descriptors, MACCS and Morgan fingerprints. A statistical workflow based on sum of ranking differences (SRD) and analysis of variance (ANOVA) was used for finding the optimal weight function(s). Overall, the best weighting function is the fraction (“frac”), which corresponds to the principle of parsimony. Optimal extended similarity indices were also found, and their differences are revealed across different data sets. We intend this work to be a guideline for users of extended similarity indices regarding the various weighting options available. Source code for the calculations is available at https://github.com/mqcomplab/MultipleComparisons.

Citations: 0
A comprehensive tutorial on Data-Driven SIMCA: Theory and implementation in web
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-05-10 | DOI: 10.1002/cem.3556
Sergey Kucheryavskiy, Oxana Rodionova, Alexey Pomerantsev

The aim of this paper is twofold. First, it serves as a comprehensive tutorial on the Data-Driven Soft Independent Modelling of Class Analogy (DD-SIMCA) method for one-class classification. It covers all practical aspects of the development, validation, and application of DD-SIMCA models, using a set of simple examples. Second, it introduces a web application that implements the main DD-SIMCA functionality. This application is freely available to everyone and does not require registration or installation. All calculations run locally in a browser without sending any information to a server, hence removing any obstacles to the dissemination of the data and models.

Citations: 0
Adaptive soft sensor modeling of chemical processes based on an improved just-in-time learning and random mapping partial least squares
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-05-01 | DOI: 10.1002/cem.3554
Ke Zhang, Xiangrui Zhang

Just-in-time learning-based partial least squares (JIT-PLS) has been extensively applied to adaptive soft sensor modeling of complex nonlinear processes. However, it still suffers from unreasonable selection of relevant samples and unsatisfactory local modeling. To address these problems, this paper proposes an improved just-in-time learning-based random mapping partial least squares (IJIT-RMPLS), which includes an improved relevant-sample selection strategy and a random mapping PLS (RMPLS) model. On the one hand, considering the different degrees of correlation between the input variables and the output variable, the method applies mutual information to evaluate the importance of each input variable and designs a variable-weighted Euclidean distance to select relevant samples for local modeling. On the other hand, to improve the prediction precision of local soft sensor models, the method combines the idea of nonlinear random mapping from extreme learning machines with PLS and builds an RMPLS with multiple activation functions. Applications to a numerical example and a real chemical process show that the proposed IJIT-RMPLS has a smaller prediction error than traditional JIT-PLS.
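
The relevant-sample selection idea can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes the variable weights have already been computed (the abstract derives them from mutual information, which is not reproduced here), and the function names, weights, and data are hypothetical.

```python
import math


def weighted_distance(sample, query, weights):
    """Variable-weighted Euclidean distance between two input vectors."""
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, sample, query)))


def select_relevant(samples, query, weights, k):
    """Return the indices of the k historical samples closest to the query."""
    order = sorted(range(len(samples)),
                   key=lambda i: weighted_distance(samples[i], query, weights))
    return order[:k]


# Hypothetical historical inputs with two variables, and a new query point.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
query = [0.0, 0.0]

# Assumed weights: variable 1 is taken to be far more informative about the
# output than variable 2, so deviations in variable 1 are penalised more.
informed = select_relevant(samples, query, weights=[0.9, 0.1], k=2)
uniform = select_relevant(samples, query, weights=[0.5, 0.5], k=2)
print(informed, uniform)  # [0, 2] [0, 1]
```

With informative weights, a sample that differs only in the unimportant variable is preferred over one that differs in the important variable, which is how weighting changes the local training set in a just-in-time scheme.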

Citations: 0
An adaptive strategy for time-varying batch process fault prediction based on stochastic configuration network
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-04-28 | DOI: 10.1002/cem.3555
Kai Liu, Xiaoqiang Zhao, Yongyong Hui, Hongmei Jiang

Fault prediction ensures safe and stable production and cuts maintenance costs. Because changing operating conditions alter the characteristics of industrial processes, the fault state of batch processes needs to be monitored in real time and fault trends need to be predicted accurately. An adaptive slow feature analysis-neighborhood preserving embedding-improved stochastic configuration network (SFA-NPE-ISCN) algorithm for batch process fault prediction is proposed. First, SFA is used to extract the time-varying features of process data and establish the update index of the NPE model. Then, local nearest-neighbor features are extracted and reconstructed by the NPE model with adaptive update capability, and squared prediction error (SPE) statistics are constructed from the reconstruction error as fault state features. Further, the hunter-prey optimization (HPO) algorithm optimizes the weights and biases in the stochastic configuration network (SCN), and singular value decomposition (SVD) and QR decomposition with column pivoting are introduced to solve the ill-posed problem of the SCN and obtain the ISCN prediction model. Finally, the obtained SPE statistics are formed into a time series, and the ISCN model is used to predict the process state trend. The effectiveness of the proposed algorithm is verified by case studies of an industrial-scale penicillin fermentation process and a hot strip mill process.

Citations: 0
A prediction model of nonclassical secreted protein based on deep learning
IF 2.3 | Zone 4, Chemistry | Q1 SOCIAL WORK | Pub Date: 2024-04-25 | DOI: 10.1002/cem.3553
Fan Zhang, Chaoyang Liu, Binjie Wang, Yiru He, Xinhong Zhang

Most current nonclassical secreted protein prediction methods involve manual feature selection, such as constructing sample features from the physicochemical properties of proteins and the position-specific scoring matrix (PSSM). However, these tasks require researchers to perform tedious search work to obtain the physicochemical properties of proteins. This paper proposes an end-to-end nonclassical secreted protein prediction model based on deep learning, named DeepNCSPP, which takes protein sequence information and sequence statistics information as input to predict whether a protein is a nonclassical secreted protein. The protein sequence information and sequence statistics information are extracted using a bidirectional long short-term memory network and a convolutional neural network, respectively. In experiments on the independent test dataset, DeepNCSPP achieved an accuracy of 88.24%, a Matthews correlation coefficient (MCC) of 77.01%, and an F1-score of 87.50%. Independent test dataset testing and 10-fold cross-validation show that DeepNCSPP achieves performance competitive with state-of-the-art methods and can be used as a reliable nonclassical secreted protein prediction model. A web server has been constructed for the convenience of researchers. The web link is https://www.deepncspp.top/. The source code of DeepNCSPP is hosted on GitHub and is available online (https://github.com/xiaoliu166370/DEEPNCSPP).
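
For readers unfamiliar with the three reported metrics, the sketch below shows how accuracy, MCC, and F1-score follow from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the paper's data and do not reproduce its reported values.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, MCC, F1) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # MCC is conventionally set to 0 when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    return accuracy, mcc, f1


# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
accuracy, mcc, f1 = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(f"accuracy={accuracy:.4f} mcc={mcc:.4f} f1={f1:.4f}")
```

MCC uses all four cells of the confusion matrix, so it is usually lower than accuracy and more informative when the classes are imbalanced, which is consistent with the gap between the accuracy and MCC values quoted in the abstract.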

Citations: 0
Testing differences in predictive ability: A tutorial
IF 2.3 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2024-04-25 DOI: 10.1002/cem.3549
Tom Fearn

This paper describes some statistical tests for comparing the predictive performance of two or more prediction rules. It covers the cases of both quantitative and qualitative predictions, that is, both regression and classification problems. Worked examples are included for both cases.
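
For the classification case, one widely used way to compare two prediction rules on the same test samples is McNemar's test on the discordant pairs. The sketch below is only an illustration of that general idea and is not taken from the paper's worked examples; the function name `mcnemar_exact` and the toy labels are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test for two classifiers on the same samples.

    Returns (b, c, p_value), where b counts samples that rule A classifies
    correctly and rule B misclassifies, and c counts the reverse.
    """
    b = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a == y and p != y)
    c = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a != y and p == y)
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under H0 each discordant pair favours A or B with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


# Toy data (hypothetical): rule A is always right, rule B fails on 5 samples.
y_true = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
pred_a = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
pred_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(b, c, p_value)
```

Only the samples on which the two rules disagree carry information about their difference, which is why the test conditions on `b + c` rather than on the full test set size.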

Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3549
Citations: 0
A note on rank deficiency and numerical modeling
IF 2.3 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2024-04-23 DOI: 10.1002/cem.3550
Klaus Neymeyr, Mathias Sawall, Tomass Andersons

Linearly dependent concentration profiles of a chemical reaction can result in a spectral data matrix with a chemical rank smaller than the number of absorbing chemical species. Such a rank deficiency is problematic for a factor analysis as some information on the pure component spectra cannot be recovered from the mixture data. Matrix augmentation can break rank deficiencies and enable successful pure component recovery. In contrast to this, an artificial breakdown of a rank deficiency can be caused by a numerical finite precision simulation of the underlying kinetic model and can fake a successful MCR analysis. This work discusses the problem and points out some remedies.
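
The three effects described in this abstract can be reproduced numerically in a few lines. The following is a minimal sketch under stated assumptions, not the authors' model: the reaction A -> B + 2C, the rate constant, the random "pure spectra", and the 1e-8 perturbation used as a stand-in for finite-precision integration error are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical first-order reaction A -> B + 2C (assumed rate constant k).
t = np.linspace(0.0, 10.0, 50)
k = 0.5
c_a = np.exp(-k * t)
c_b = 1.0 - c_a            # B formed from A
c_c = 2.0 * c_b            # C is always twice B: linearly dependent profiles
C = np.column_stack([c_a, c_b, c_c])      # 50 time points x 3 species

rng = np.random.default_rng(0)
S = rng.random((3, 40))                   # 3 random stand-in spectra, 40 channels
D = C @ S                                 # bilinear mixture data D = C S

# Three absorbing species, but the dependent profiles lower the rank of D.
rank_single = np.linalg.matrix_rank(D)

# Matrix augmentation: a second experiment with nonzero initial B breaks
# the proportionality between the B and C profiles.
c_b2 = 0.5 + (1.0 - c_a)
c_c2 = 2.0 * (1.0 - c_a)
C2 = np.column_stack([c_a, c_b2, c_c2])
D_aug = np.vstack([D, C2 @ S])
rank_augmented = np.linalg.matrix_rank(D_aug)

# Stand-in for numerical integration error in the simulated profiles:
# a tiny perturbation is enough to make the default rank test report 3.
C_noisy = C + 1e-8 * rng.standard_normal(C.shape)
rank_noisy = np.linalg.matrix_rank(C_noisy @ S)

print(rank_single, rank_augmented, rank_noisy)
```

Inspecting the singular values of `C_noisy @ S` directly (for example with `np.linalg.svd`) shows the third one sitting many orders of magnitude below the first two, which is the kind of artificial rank increase the abstract warns can fake a successful analysis.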

Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.3550
Citations: 0
Application of time-space neighborhood standardization technology to complex multi-stage process fault detection
IF 2.3 Zone 4 (Chemistry) Q1 SOCIAL WORK Pub Date: 2024-04-21 DOI: 10.1002/cem.3546
Liwei Feng, Shaofeng Guo, Yifei Wu, Yu Xing, Yuan Li

To address the difficulty of effectively monitoring multi-stage processes that are dynamic and nonlinear, a time-space neighborhood standardization (TSNS) method is proposed and combined with partial least squares (PLS) to give the TSNS-PLS method for process fault detection. TSNS transforms multi-stage data into single-stage data that approximately follow a standard normal distribution, removes the temporal correlation between samples at successive time points in the process data, and separates online fault samples. The transformed process data thus satisfy the requirements that the PLS method places on process data, which can significantly improve the fault detection rate of PLS. Finally, the performance of TSNS-PLS was examined through fault detection experiments designed on a numerical simulation process and the penicillin fermentation process.

Citations: 0