Chemometrics and Intelligent Laboratory Systems: latest articles, page 10

Advanced hyperparameter optimization for lung cancer detection using DenseBeetle network

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-18 DOI: 10.1016/j.chemolab.2025.105584

Jyoti Kumari , Sapna Sinha , Laxman Singh

Lung cancer remains a leading cause of cancer-related mortality, underscoring the urgent need for accurate and early detection to improve patient outcomes. However, current detection systems often struggle with issues like elevated false-positive rates and insufficient feature extraction. These challenges largely stem from the visual resemblance between nodules and nearby tissues, as well as the inability of conventional models to effectively capture the complex features of pulmonary nodules. This research presents a deep learning-based approach for identifying lung nodules in CT images. The framework incorporates advanced preprocessing steps such as Gaussian filtering and Contrast Limited Adaptive Histogram Equalization to enhance image sharpness and overall visual quality. A Residual Pyramid Attention-Enhanced DenseNet201, integrated with SE and CBAM modules, is used for effective feature extraction, while a sigmoid function supports binary classification. Hyperparameter tuning is performed using a novel optimizer based on Latin Hypercube Sampling and Mean Differential Variation. Evaluated on the LUNA16 dataset with 888 CT scans, the model reached 98.7 % accuracy, 99.2 % sensitivity, and a 95.38 % F1-score on the test set. The framework significantly reduces false positives and demonstrates strong generalization for clinical lung cancer identification.
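The abstract names Latin Hypercube Sampling as one ingredient of the optimizer but gives no details. The following is a minimal, standard-library sketch of plain Latin Hypercube Sampling only; it is not the authors' optimizer, and the function name `latin_hypercube`, the two hyperparameters, and their ranges are assumptions chosen for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that, along every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly placed point inside each stratum of the unit interval.
        unit_points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(unit_points)
        columns.append([low + p * (high - low) for p in unit_points])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical search space: (learning rate, dropout rate).
candidates = latin_hypercube(5, [(1e-4, 1e-1), (0.0, 0.5)])
for learning_rate, dropout in candidates:
    print(f"lr={learning_rate:.4f} dropout={dropout:.3f}")
```

Compared with independent uniform draws, this stratification spreads a small initial population more evenly over each hyperparameter range, which is the usual reason for seeding a population-based optimizer this way. A learning rate would often be sampled on a logarithmic scale; the linear scale here is kept only for brevity.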

Volume 268, Article 105584

Citations: 0

Feature extraction using differential amplification singular value decomposition in Vis–NIR spectroscopy: Application to cigarette brand identification

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-18 DOI: 10.1016/j.chemolab.2025.105579

Biao Tang , Chengbo Yang , Jianchun Li , Jingjun Wu

Rapid and accurate identification of cigarette brands is crucial for combating counterfeiting and protecting tax revenue. Vis–NIR spectroscopy combined with machine learning is a promising identification method. Nevertheless, redundant information abounds in high-dimensional spectral data, which affects classification accuracy. To address this challenge, this study proposes a novel feature extraction method, differential amplification singular value decomposition (DA-SVD). This method optimizes the feature projection direction by amplifying both the individual differences among samples and the overall differences between classes, thereby achieving effective dimensionality reduction of spectral data. By applying DA-SVD, the classification accuracy of KNN, SVM, and RF models on the test set significantly increased from 36 %, 34 %, and 30 % (based on the original data) to 98 % for all models, with precision, sensitivity, and F1 score reaching 97.86 %, 98.14 %, and 97.86 %, respectively, and all outperforming conventional feature extraction methods such as LDA, SVD, and PCA. The experimental results further demonstrated that DA-SVD could achieve satisfactory classification performance without additional preprocessing steps (outlier detection and spectral denoising). In addition, the 10-fold cross-validation results confirmed the stability of the DA-SVD method, and validation on public datasets further demonstrated its generalization ability and applicability. Overall, DA-SVD provides an efficient and robust feature extraction strategy that, when combined with machine learning, enables reliable cigarette brand identification and has broad potential for other spectroscopic applications.

Volume 268, Article 105579

Citations: 0

Environment aging analysis of animal bloodstains with ATR-FTIR and CNN

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-14 DOI: 10.1016/j.chemolab.2025.105576

Chun-Ta Wei , Zexin Shen , Wenbin Luo , Jingyi Zhao , Tingting Yin , Kaining Cheng , Miao Zhang

Bloodstain analysis is a critical component of forensic science, particularly for determining the time of deposition and understanding the effects of environmental conditions on evidence. This study presents an innovative bloodstain environmental aging model, which integrates attenuated total reflection Fourier transform infrared spectroscopy (ATR-FTIR) with a convolutional neural network (CNN) optimized using the black-winged kite algorithm. Bloodstains from common sources (pig, cow, and chicken) were analyzed under varying environmental conditions, including temperature fluctuations (0 °C, 40 °C, 100 °C) and simulated sunlight exposure, across multiple aging periods (1, 2, 4, 8 days). Spectral data obtained through ATR-FTIR scanning served as the input for the optimized CNN, enabling precise differentiation and classification of bloodstains based on aging and environmental factors. The model achieved high predictive accuracy, with 97.86 % for pig blood, 95.47 % for cow blood, and 97.15 % for chicken blood under 0 °C conditions, demonstrating its robustness and reliability in forensic applications. Additionally, this research highlights the potential for integrating spectroscopic data with advanced deep learning techniques to enhance forensic methodologies. By improving accuracy, accessibility, and cost-effectiveness, this work represents a significant advancement in bloodstain analysis and forensic science.

Volume 268, Article 105576

Citations: 0

Machine learning and evolutionary computation on e-nose datasets: A preliminary approach to ergot alkaloid detection in wheat

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-13 DOI: 10.1016/j.chemolab.2025.105574

Chiara Giliberti , Giulia Magnani , Monica Mattarozzi , Marco Giannetto , Federica Bianchi , Maria Careri , Stefano Cagnoni

To the best of the authors' knowledge, this is the first time that an approach based on the use of machine learning (ML) algorithms combined with genetic programming (GP) was used to process small-sample-size e-nose data. The approach was proposed to classify the volatile compound information of wheat samples based on the contamination of ergot alkaloids, a class of emerging mycotoxins which pose a severe threat to food safety and consumer health. Unlike previous studies that applied convolutional neural networks to full e-nose response profiles, our approach focused on a small set of features extracted from the steady-state region of each response curve. Despite the low dimensionality, using GP to generate optimal features significantly improved the classification performance of several ML models. Different classifiers, including Decision Tree, Linear Discriminant Analysis, the Mahalanobis Distance Classifier, an artificial neural network-based method and ensemble methods were assessed and applied to a dataset of 21 wheat samples. These samples were classified according to their compliance with the EU maximum limit of 150 μg/kg for ergot alkaloids in wheat. The combined application of GP-based feature transformations, specifically using M3GP, and ML classifiers resulted in significant improvements in accuracy, F1 score, precision and recall compared to models trained on untransformed features. These findings highlight the unexplored potential of GP as a powerful tool for feature construction in sensor-based classification tasks for food safety signal processing.

Volume 268, Article 105574

Citations: 0

Nondestructive detection of total flavonoids content in daylily using Vis-NIR and NIR hyperspectral imaging: data fusion combined with SHAP for model interpretability

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-13 DOI: 10.1016/j.chemolab.2025.105575

Xuexia Ma, Na Li, Ruifeng Wang, Jiaxue Ma, Ninghua Zhu, Tingting Li, Zhongxiong Zhang, Haifeng Li, Songlei Wang, Haihong Zhang

Flavonoids, vital bioactive compounds in daylily (a nutritionally and medicinally valuable food), have antioxidant, anti-inflammatory, antibacterial, and antidepressant properties, which boost its nutritional value, health benefits, and quality. Hyperspectral Imaging (HSI) for detecting trace flavonoids usually uses single systems, failing to leverage multispectral complementarity and restricting detection performance. This study integrates a data fusion strategy with two HSI techniques (visible–near-infrared (Vis-NIR) and near-infrared (NIR)) for the non-destructive detection of total flavonoids content (TFC) in daylily. The investigation employed partial least squares regression (PLSR) and least squares support vector machine (LS-SVM) for spectral data. Additionally, data-level and feature-level fusion strategies are implemented for data fusion modeling, while the SHapley Additive exPlanations (SHAP) methodology is used to comprehensively evaluate spectral feature contribution rates. The findings demonstrate that modeling based on the fusion strategy of LS-SVM yields substantially superior results compared to single-system approaches. Notably, the Mid-level fusion model incorporating competitive adaptive reweighted sampling (CARS) and LS-SVM demonstrates optimal performance. The determination coefficient (R²_P), root mean square prediction error (RMSEP) and residual prediction deviation (RPD) of the prediction set were 0.9332, 0.0186 and 3.3560, respectively. This study confirms the feasibility of HSI technology in non-destructively detecting flavonoids in daylily. Furthermore, the collaborative optimization of multi-spectral HSI systems through a data fusion strategy effectively enhances the accuracy of non-destructive flavonoids detection. 
This study presents innovative technical approaches for non-destructive trace substance detection and agricultural product quality and safety monitoring, thereby providing essential technical support for developing intelligent agricultural product quality and safety monitoring systems.
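The three figures of merit quoted in the abstract (determination coefficient, RMSEP, and RPD) can be computed from reference and predicted values as sketched below. This is a generic illustration under common textbook definitions, not the authors' code: RPD is taken here as the sample standard deviation of the reference values divided by RMSEP, and the example numbers are invented for the demonstration.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination on the prediction set."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Residual prediction deviation: spread of the reference values
    (sample standard deviation) relative to the prediction error."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


# Invented reference and predicted values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(r_squared(reference, predicted), rmsep(reference, predicted), rpd(reference, predicted))
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is often reported alongside RMSEP rather than instead of it.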

Volume 268, Article 105575

Citations: 0

A combination of gas detection system and adaptive deep learning network (GFC-Net) to identify different production batches of beer

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-04 DOI: 10.1016/j.chemolab.2025.105557

Junliang Han , Feifei Tong , Chuansheng Tang , Titi Liu

Even for products of the same brand, the quality of beer may vary across different production batches. Strict quality testing is essential to ensure product consistency, safety, and consumer satisfaction. In this work, an e-nose system, combined with the proposed deep learning algorithm, achieves the qualitative identification of beers from different production batches. First, the e-nose system is applied to acquire the gas information of beers from different production batches. Then, to comprehensively extract features characterizing the gas information, a fusion computational module that integrates local and global features from convolution and self-attention mechanism is proposed, called the Gas Features Calculation Module (GFCM). Finally, a Gas Features Classification Network (GFC-Net) is designed to enable the adaptive identification of beers from different production batches. Through structural optimization, ablation experiments, and comparison with state-of-the-art gas classification methods, GFC-Net achieves an accuracy of 98.50 %, a precision of 98.70 %, and a recall of 98.58 %. The integration of gas information that characterizes the overall chemical quality, along with GFC-Net, enables the qualitative identification of beers from different batches, providing an effective approach for quality monitoring.

Volume 268, Article 105557

Citations: 0

Sampling-based computation of the sets of feasible solutions and feasible bands for noisy data

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-04 DOI: 10.1016/j.chemolab.2025.105565

Mathias Sawall , Tomass Andersons , Chunhong Wei , Christoph Kubis , Klaus Neymeyr

Multivariate curve resolution often suffers from solution ambiguity, with many nonnegative factorizations fitting the data equally well. Building on the algorithm of Laursen and Hobolth (2022), we present an efficient sampling algorithm that can handle noisy data even containing negative entries. The algorithm iteratively updates factor columns via affine combinations within a nested loop structure, effectively approximating the sets of feasible solutions, the feasible bands, as well as the dual profiles. We apply the algorithm to two in situ FTIR spectroscopic data sets tracking the decomposition and activation of rhodium carbonyl complexes for the hydroformylation process. A comparison against established algorithms for these data sets indicates the robustness and computational efficiency of the algorithm.
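The solution ambiguity mentioned in the abstract can be seen in a toy example: two different pairs of nonnegative factors reproduce exactly the same data matrix. The sketch below only illustrates that ambiguity; it is not the sampling algorithm of the paper, and the matrices `C`, `St`, and the transformation `T` are made-up values.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# One nonnegative factorization D = C * St (3 samples, 2 components, 3 channels).
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
St = [[2.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
D = matmul(C, St)

# An invertible transformation T and its inverse.
T = [[1.0, 0.5], [0.0, 1.0]]
T_inv = [[1.0, -0.5], [0.0, 1.0]]

# A second pair of factors: (C * T) and (T_inv * St).
C_alt = matmul(C, T)
St_alt = matmul(T_inv, St)
D_alt = matmul(C_alt, St_alt)

print("same data:", D == D_alt)
print("alternative factors nonnegative:",
      all(v >= 0 for row in C_alt + St_alt for v in row))
```

Every admissible `T` that keeps both factors nonnegative yields another feasible solution; the set of all such solutions is what feasible-band methods aim to approximate.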

Citations: 0

Investigation on different strategies of significance testing in ANOVA-simultaneous component analysis (ASCA)

IF 3.8, Zone 2 (Chemistry), Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-04 DOI: 10.1016/j.chemolab.2025.105573

Faezeh Maddahi , Mahsa Akbari Lakeh , Jamile Mohammad Jafari , Farnoosh Koleini , Siewert Hugelier , Paul J. Gemperline , Hamid Abdollahi

ANOVA Simultaneous Component Analysis (ASCA) integrates analysis of variance with multivariate modelling to quantify how experimental factors and their interactions affect complex multivariate measurements. Statistical significance in ASCA is typically assessed by permutation testing; however, different permutation strategies imply distinct null hypotheses and exchangeability assumptions. In this study, we systematically compare three widely used approaches embedded in popular chemometric software packages where the permutation strategy is often predefined and not always transparent to the user. The restricted permutation method shuffles observations only within experimental strata, preserving the structure of the null hypothesis. The reduced‐model permutation contrasts the full ASCA model with a simplified version in which selected effects are removed. Permutation of marginal design matrices isolates interaction effects by permuting marginal matrices derived from the design matrix. We evaluate these methods on simulated datasets with varying patterns of main effects and interactions, as well as on an experimental study of feral cabbage (Brassica oleracea) under treatment and time factors. Our results show that the restricted permutation method reliably detects main effects, reduced‐model permutation excels at identifying interactions, and permutation of marginal design matrices consistently captures both. By examining the assumptions and performance of each method, we provide practical guidance for selecting the optimal permutation strategy in ASCA-based chemometric analysis, particularly for balanced experimental designs. As a baseline, we additionally assessed unrestricted permutation of the raw data using two test statistics: the sum of squares and the F-ratio. The results demonstrated that when employing the F-ratio, this approach was also capable of accurately detecting statistical significance.
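The restricted permutation idea described in the abstract (shuffling only within experimental strata) can be sketched for a single main effect as follows. This is a simplified illustration, not the procedure of any particular software package: the test statistic is the sum of squares of the level-mean effect matrix, and the function names, the two-factor layout, and the toy data are assumptions.

```python
import random
from collections import defaultdict


def effect_sum_of_squares(X, labels):
    """Sum of squares of the main-effect matrix: for every observation and
    variable, the squared deviation of its level mean from the grand mean."""
    n_vars = len(X[0])
    grand = [sum(row[j] for row in X) / len(X) for j in range(n_vars)]
    groups = defaultdict(list)
    for row, label in zip(X, labels):
        groups[label].append(row)
    ssq = 0.0
    for rows in groups.values():
        for j in range(n_vars):
            level_mean = sum(r[j] for r in rows) / len(rows)
            ssq += len(rows) * (level_mean - grand[j]) ** 2
    return ssq


def restricted_permutation(labels, strata, rng):
    """Shuffle labels only among observations that share a stratum."""
    permuted = list(labels)
    by_stratum = defaultdict(list)
    for index, stratum in enumerate(strata):
        by_stratum[stratum].append(index)
    for indices in by_stratum.values():
        shuffled = [labels[i] for i in indices]
        rng.shuffle(shuffled)
        for i, label in zip(indices, shuffled):
            permuted[i] = label
    return permuted


def permutation_p_value(X, labels, strata, n_perm=999, seed=0):
    """Share of permutations whose statistic reaches the observed one
    (the observed arrangement is counted once in numerator and denominator)."""
    rng = random.Random(seed)
    observed = effect_sum_of_squares(X, labels)
    hits = sum(
        effect_sum_of_squares(X, restricted_permutation(labels, strata, rng)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy data: 2 treatments x 2 time points x 3 replicates, 2 variables.
X = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [5.0, 5.1], [5.1, 5.0], [5.0, 5.0],
     [1.0, 1.1], [1.1, 1.0], [1.0, 1.0], [6.0, 6.1], [6.1, 6.0], [6.0, 6.0]]
treatment = ["ctl"] * 3 + ["trt"] * 3 + ["ctl"] * 3 + ["trt"] * 3
time_point = ["t1"] * 6 + ["t2"] * 6
print(permutation_p_value(X, treatment, time_point, n_perm=499))
```

Shuffling treatment labels within each time point keeps the time effect intact under the null, so the test isolates the treatment effect; an unrestricted shuffle would mix the two sources of variation.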

Volume 268, Article 105573

Citations: 0

Impact of converting graphs into spanning trees on node and graph classification in Graph Neural Network

IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-03 DOI: 10.1016/j.chemolab.2025.105562

Mohammadmahdi Taheri , Mahdi Eftekhari , Gholamreza Aghamollaei

This paper investigates how reducing graphs to their spanning trees affects Graph Neural Network (GNN) performance in node and graph classification across six GNN architectures. The proposed approach uses spanning trees for graph sparsification while preserving critical structural information, achieving performance comparable to or better than full-graph models. A theoretical connection is established between edge sampling via determinantal point processes (DPPs) and the transfer-current matrix, showing that minimum spanning trees effectively approximate DPP-selected subgraphs. Four edge-weighting schemes are analyzed to reveal their sparsification trade-offs. The method consistently reduces memory usage and computation while maintaining accuracy. The findings indicate that spanning tree pruning offers a scalable, theoretically grounded strategy for efficient GNN training without compromising classification accuracy. Experiments on node classification benchmarks (Cora, Citeseer, PubMed, PPI) and biological and chemical graph classification datasets (AIDS, MUTAG, PROTEINS, NCI1, IMDB-BINARY) demonstrate excellent graph classification results, notably 98.27% accuracy on AIDS, with reduced computational overhead.
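The sparsification step can be illustrated with a short sketch. It assumes the graph is given as a weighted edge list and uses Kruskal's algorithm to extract a minimum spanning tree. The abstract does not specify the four edge-weighting schemes, so the weights below are placeholders, and `minimum_spanning_tree` is a hypothetical helper rather than code from the paper.

```python
def minimum_spanning_tree(num_nodes, weighted_edges):
    """Kruskal's algorithm; weighted_edges is a list of (weight, u, v)."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # the edge joins two components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree

# Toy graph with 4 nodes and 5 edges (placeholder weights).
edges = [(1.0, 0, 1), (4.0, 0, 2), (2.0, 1, 2), (5.0, 1, 3), (3.0, 2, 3)]
tree = minimum_spanning_tree(4, edges)
# tree keeps 3 of the 5 edges: (0, 1), (1, 2) and (2, 3)
```

For a connected graph with n nodes the result always has n - 1 edges, which is where the memory and computation savings come from; on a disconnected graph the same routine returns a spanning forest.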

Volume 268, Article 105562

Citations: 0

A reliable deep neural network using the radial basis for the spreading virus in computers with kill signals

IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS

Chemometrics and Intelligent Laboratory Systems

Pub Date : 2025-11-01 DOI: 10.1016/j.chemolab.2025.105560

Zulqurnain Sabir , Bahaa Basbous , Basma Souayeh , Muhammad Umar , Soheil Salahshour

Purpose

The purpose of this work is to provide a reliable neural network process for the model of virus spread in computers with kill signals. The mathematical model comprises susceptible, exposed, infected, virus-inactive, and kill-signal classes.

Method

A deep neural network (DNN) is designed with two hidden layers, both using radial basis activation functions, and is optimized through Bayesian regularization. The first and second hidden layers contain twenty and thirty neurons, respectively. The stochastic DNN framework is applied to the model of virus spread in computers with kill signals, using 70 % of the data for training and 15 % each for validation and testing.
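The described architecture and data split can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the Gaussian form exp(-z^2) for the radial basis activation, the single time input, the five outputs (one per model class), and the random initialization are all assumed, and Bayesian regularization training is not implemented here.

```python
import numpy as np

def radial_basis(z):
    """Gaussian radial basis activation exp(-z^2) (assumed form)."""
    return np.exp(-z ** 2)

def init_params(n_in, n_out, hidden=(20, 30), seed=0):
    """Random weights for two hidden layers of 20 and 30 neurons."""
    rng = np.random.default_rng(seed)
    sizes = (n_in, *hidden, n_out)
    return [(rng.normal(scale=0.5, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, t):
    """Forward pass: radial basis hidden layers, linear output layer."""
    h = t
    for W, b in params[:-1]:
        h = radial_basis(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def split_indices(n, seed=0):
    """Random 70 % / 15 % / 15 % train/validation/test split."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = (70 * n) // 100, (15 * n) // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

t = np.linspace(0.0, 1.0, 100).reshape(-1, 1)   # assumed time grid
params = init_params(n_in=1, n_out=5)           # 5 outputs is an assumption
y = forward(params, t)
train_idx, val_idx, test_idx = split_indices(len(t))
```

In a full pipeline the weights would be fitted on `train_idx` against reference solutions of the model, with `val_idx` and `test_idx` held out as the abstract describes.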

Results

The accuracy of the scheme is demonstrated by the overlapping of the solutions, with negligible absolute error in solving the model. The consistency of the solver is assessed through error histogram, regression, and state transition analyses.

Novelty

The proposed DNN structure with radial basis activation functions has not previously been applied to the model of virus spread in computers with kill signals.

Volume 268, Article 105560

Citations: 0