
Chemometrics and Intelligent Laboratory Systems: Latest Articles

Evaluation of a mathematical approach to detect fraudulent substitution of Darjeeling tea with other types of tea using the elemental profiles obtained by Energy Dispersive X-ray Fluorescence
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-08 | DOI: 10.1016/j.chemolab.2025.105606
Sergej Papoci, Manuel Jiménez, Michele Ghidotti, María Beatriz de la Calle Guntiñas
The willingness of consumers to pay higher prices for high-quality specialties such as Darjeeling tea goes hand in hand with an increase in fraudulent practices in which Darjeeling tea is substituted, totally or partially, by cheaper teas. Currently, to evaluate the percentage of substitution that a method can detect, Darjeeling tea is mixed in different proportions with non-Darjeeling teas, and the mixture is analysed after homogenisation. This time-consuming approach consumes valuable amounts of sample, so an alternative is needed. Here, a method is described to calculate the minimum detectable percentage of substitution of Darjeeling tea by other teas without preparing real mixtures. The approach is based on virtual mixtures built from the results obtained for commercially available Darjeeling and non-Darjeeling teas. The authentication method used the elemental profiles of tea obtained by Energy Dispersive X-ray Fluorescence, combined with chemometrics and modelling by Partial Least Squares-Discriminant Analysis. The percentage of false positives at different substitution levels was evaluated and compared with the results obtained with real mixtures of Darjeeling and non-Darjeeling teas. Comparable results were obtained with both approaches. Twenty percent was the lowest substitution level that could be detected with acceptable sensitivity (94 %) and specificity (86 %). A fast, easy-to-implement approach has thus been developed and validated to calculate the minimum substitution percentage that an analytical authentication method can detect, without the need for additional laboratory experiments.
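The abstract does not say how the virtual mixtures are computed. One plausible reading, shown here only as a sketch, is a mass-weighted average of two measured elemental profiles. The linear mixing rule, the function name `virtual_mixture`, the element list, and every concentration below are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration of a "virtual mixture": the blend profile is
# assumed to be a mass-weighted average of two measured elemental profiles.
# The mixing rule and all numbers are assumptions, not the authors' data.

def virtual_mixture(darjeeling, substitute, fraction):
    """Return the profile of a blend containing `fraction` (0..1) of substitute tea."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if darjeeling.keys() != substitute.keys():
        raise ValueError("both profiles must report the same elements")
    return {
        element: (1.0 - fraction) * darjeeling[element] + fraction * substitute[element]
        for element in darjeeling
    }

# Made-up concentrations in mg/kg, for illustration only.
darjeeling_profile = {"K": 18000.0, "Ca": 4000.0, "Mn": 600.0, "Rb": 60.0}
other_profile = {"K": 20000.0, "Ca": 5000.0, "Mn": 1000.0, "Rb": 100.0}

for percent in (0, 10, 20, 50, 100):
    blend = virtual_mixture(darjeeling_profile, other_profile, percent / 100)
    print(percent, blend)
```

Under this assumption, each simulated blend could be passed to the trained classifier in place of a physically prepared mixture, and the share of blends still accepted as Darjeeling could be tallied per substitution level.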
Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105606 (Journal Article)
Citations: 0
Equivalent and complementary variables screening based on global search mechanism for wavelength optimization in spectral multivariate calibration
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-06 | DOI: 10.1016/j.chemolab.2025.105613
Honghong Wang, Yan Zhang, Anqi Jia, Ting Wu, Yiping Du
Variable selection is an effective way to improve the performance of a multivariate calibration model built on high-dimensional spectral data. A recently proposed strategy that screens equivalent variables (EVs) and complementary variables (CVs) deserves attention. That strategy selects EVs with a local search mechanism whose range is limited to the neighbourhood of the basic variables (BVs) chosen by a variable selection method, so variables far from the BVs are not effectively screened. To overcome this limitation, this study proposes a global search mechanism based on full-spectrum scanning to screen EVs and then investigates CVs derived from those EVs. CVs selected from the EVs found by the global search can provide richer and more accurate feature information and thereby improve model performance. Three variable selection algorithms were used to screen EVs and CVs: stability competitive adaptive reweighted sampling (SCARS), competitive adaptive reweighted sampling (CARS), and Monte Carlo uninformative variable elimination (MC-UVE). The strategy was applied to three datasets (corn NIR, tablet NIR, and UV–visible). For the corn dataset, relative to the model that combined BVs with CVs screened for SCARS by the local search mechanism from the EVs of CARS and MC-UVE, the model built from 30 CVs combined with BVs under the global search mechanism improved significantly: the root mean square errors of calibration (RMSEC) and prediction (RMSEP) decreased from 0.0365 and 0.0590 to 0.0305 and 0.0496, respectively. Similarly, the RMSEP of the models built from the CVs of CARS and MC-UVE combined with BVs obtained by the global search decreased from 0.0625 and 0.0505 to 0.0555 and 0.0403, respectively. Similar results were obtained for the other datasets.
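To put the reported error values in proportion, the short sketch below defines the root mean square error in the usual way and computes the relative reduction for each pair of figures quoted in the abstract. The helper names and the dictionary labels are illustrative choices; only the eight numbers come from the abstract.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def relative_reduction(before, after):
    """Percentage by which an error metric dropped."""
    return 100.0 * (before - after) / before

# (local search, global search) pairs quoted in the abstract for the corn dataset.
# The labels are descriptive only; the abstract does not tabulate them this way.
reported = {
    "RMSEC (first comparison)": (0.0365, 0.0305),
    "RMSEP (first comparison)": (0.0590, 0.0496),
    "RMSEP (CARS)": (0.0625, 0.0555),
    "RMSEP (MC-UVE)": (0.0505, 0.0403),
}

for name, (local_value, global_value) in reported.items():
    print(f"{name}: {relative_reduction(local_value, global_value):.1f} % lower")
```

On these figures the global search lowers the errors by roughly 11 % to 20 %, depending on the comparison.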
Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105613 (Journal Article)
Citations: 0
Enhanced Raman hyperspectral imaging using RS-NMF: a novel Regularized Sparse Non-negative Matrix Factorization for spectral unmixing
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-04 | DOI: 10.1016/j.chemolab.2025.105602
Marc Offroy, Amir Ayadi, Léon Govohetchan, Janette Ayoub, Thomas M. Hancewicz, Ludovic Duponchel, Mario Marchetti
Molecular spectroscopy is a powerful, non-destructive technique for chemical analysis because the sample remains unaltered during measurement. Although essential for obtaining meaningful chemical information, it often suffers from spectral overlap, which makes it difficult to identify individual components within a sample. For over fifty years, many mathematical approaches, such as Blind Source Separation (BSS) and Multivariate Curve Resolution (MCR), have therefore been developed to unmix complex signals and push the detection limits of spectroscopic instruments. Despite these advances, and even as growing data volumes potentially provide more information, these approaches still face inherent limitations (i.e., selectivity problems), particularly with contemporary samples, whose thorough characterization becomes increasingly difficult as prior knowledge diminishes. This article presents a novel signal unmixing method for hyperspectral Raman imaging designed to overcome these limitations. The approach, based on Non-negative Matrix Factorization (NMF), addresses critical challenges such as rotational ambiguity and noise sensitivity, which often prevent accurate unmixing of pure component spectra. We first introduce the methodology and explain how it differs from existing mathematical methods. We then evaluate its performance on “emulsion”, a real-world hyperspectral Raman imaging dataset well known in the chemometrics community. To challenge the method further, we apply it to a complex simulated molecular signal dataset. Finally, we compare the results with those obtained using the standard MCR-ALS approach. Our initial results demonstrate that the RS-NMF approach improves the unmixing of complex signals.
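The abstract gives no algorithmic detail for RS-NMF, so the sketch below is not the authors' method. It shows a generic NMF with multiplicative updates and an L1 (sparsity) penalty on the abundance factor, which is one common way to build a sparse, non-negative unmixing model. The function name `sparse_nmf`, the penalty weight, the normalisation step, and the synthetic two-band data are all assumptions.

```python
import numpy as np

def sparse_nmf(D, n_components, sparsity=0.01, n_iter=300, seed=0):
    """Factor a non-negative matrix D (pixels x channels) as C @ S.

    Generic multiplicative-update NMF with an L1 penalty on C.
    This is NOT the paper's RS-NMF; it only illustrates the family of methods.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    n_pixels, n_channels = D.shape
    C = rng.random((n_pixels, n_components)) + eps    # abundance maps
    S = rng.random((n_components, n_channels)) + eps  # component spectra
    for _ in range(n_iter):
        # Each update multiplies by a ratio of non-negative terms,
        # so both factors stay non-negative throughout.
        S *= (C.T @ D) / (C.T @ C @ S + eps)
        C *= (D @ S.T) / (C @ S @ S.T + sparsity + eps)
        # Remove the scale ambiguity: unit-sum spectra, scale moved into C.
        scale = S.sum(axis=1, keepdims=True) + eps
        S /= scale
        C *= scale.T
    return C, S

# Synthetic example: two overlapping Gaussian "bands" mixed in 50 pixels.
axis = np.linspace(0.0, 1.0, 120)
true_S = np.vstack([
    np.exp(-((axis - 0.40) / 0.08) ** 2),
    np.exp(-((axis - 0.55) / 0.08) ** 2),
])
true_C = np.random.default_rng(1).random((50, 2))
D = true_C @ true_S

C, S = sparse_nmf(D, n_components=2)
rel_err = np.linalg.norm(D - C @ S) / np.linalg.norm(D)
print(f"relative reconstruction error: {rel_err:.3f}")
```

Plain NMF of this kind still suffers from the rotational ambiguity mentioned in the abstract: different factor pairs can reproduce the data equally well, which is why additional regularisation or constraints are of interest.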
Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105602 (Journal Article)
Citations: 0
Chemometric modeling of physicochemical properties using Lanzhou and Ad-Hoc Lanzhou indices: A multi-scale approach for drug design and material informatics
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-03 | DOI: 10.1016/j.chemolab.2025.105607
Song Tingting, Sadia Noureen, Saliha Kamran, Sobhy M. Ibrahim, Adnan Aslam
Chemical graph theory serves as a foundational framework in chemical informatics, offering molecular descriptors that enable the prediction of critical physicochemical properties. This study investigates the utility of two recently proposed topological indices, the Lanzhou index and its derivative, the Ad-hoc Lanzhou index, by computing them for four structurally diverse systems: Bismuth(III) Iodide (a layered inorganic compound), Nanostar Dendrimer (a hyperbranched polymer), and the two-dimensional Triangular Oxide and Triangular Silicate Networks. To assess the indices' predictive power, we established linear regression models correlating them with five experimentally relevant properties of 21 phenethylamine derivatives: molar refractivity (MR), octanol-water partition coefficient (LOG P), calculated Log P (CLog P), critical volume (CV), and boiling point. Statistical robustness was evaluated using the coefficient of determination (R²), the F-statistic, and the significance level (P-value). The models for boiling point, CV, and MR exhibited strong significance (R² > 0, P = 0), while LOG P and CLog P also showed statistically valid correlations (P = 0), though with slightly lower R² values. Notably, the Lanzhou index demonstrated marginally superior performance in predicting partition coefficients, suggesting its sensitivity to hydrophobic interactions. These results underscore the efficacy of Lanzhou-based indices as reliable tools for quantifying structure–property relationships, particularly in drug design applications where rapid estimation of solubility, volatility, and bioavailability is critical. Our findings advocate for the broader integration of these indices into cheminformatics pipelines to augment molecular screening and optimization processes.
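The abstract names the two indices without defining them. The sketch below assumes the definitions commonly given in the graph-theory literature: the Lanzhou index sums, over all vertices, the degree in the complement graph times the squared degree, and the Ad-hoc Lanzhou index sums the degree times the squared complement degree. Both formulas should be checked against the paper before use; the function names and the example graph are illustrative.

```python
def degrees(n_vertices, edges):
    """Vertex degrees of a simple undirected graph given as an edge list."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def lanzhou_index(n_vertices, edges):
    # Assumed definition: sum over vertices of (complement degree) * degree^2,
    # where the complement degree is n - 1 - degree.
    return sum((n_vertices - 1 - d) * d ** 2 for d in degrees(n_vertices, edges))

def adhoc_lanzhou_index(n_vertices, edges):
    # Assumed definition: sum over vertices of degree * (complement degree)^2.
    return sum(d * (n_vertices - 1 - d) ** 2 for d in degrees(n_vertices, edges))

# Star graph on four vertices (the carbon skeleton of isobutane):
# vertex 0 is the centre, vertices 1..3 are the leaves.
star = [(0, 1), (0, 2), (0, 3)]
print(lanzhou_index(4, star))        # 6
print(adhoc_lanzhou_index(4, star))  # 12
```

With one index value per molecule, a property such as boiling point could then be regressed on the index by ordinary least squares, which is the kind of model the abstract describes.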
Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105607 (Journal Article)
Citations: 0
Comparative evaluation of lightweight convolutional neural network and vision transformer models for multi-class brain tumor classification using merged large MRI datasets
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-03 | DOI: 10.1016/j.chemolab.2025.105609
Omneya Attallah, Ishak Pacal
The accurate classification of brain tumors from MRI scans is important for timely diagnosis and treatment planning; however, previous state-of-the-art automatic image classification methods frequently struggle to balance performance with computational cost in clinical applications. In this study, we evaluated twenty lightweight Convolutional Neural Network (CNN) models and eighteen Vision Transformer (ViT) models for multi-class brain tumor classification using a merged dataset of 17,933 MRI images from four categories (glioma, meningioma, pituitary tumor, and healthy brain). Both groups of architectures achieved state-of-the-art performance, with EfficientNet-b0 (98.36 % accuracy, 4.01 M parameters) and Tiny-ViT-5M (98.41 % accuracy, 5.07 M parameters) ranking as the top-performing model in their respective categories. The systematic comparison showed that the lighter models perform as well as or better than established lightweight frameworks while offering computational advantages; MobileViT-xxSmall, for example, achieved 98.16 % accuracy with fewer than 1 M parameters. Benchmarking against fourteen existing frameworks for brain tumor classification showed that the top-performing lightweight models of this study maintain stable performance across all evaluation metrics (including precision, recall, and F1 score) and aim to mitigate key weaknesses of prior work, including dataset diversity and model complexity. The findings show highly competitive performance in brain tumor classification, highlighting the promise of lightweight architectures for accurate and efficient diagnostic support in potential clinical deployment, particularly in low-resource healthcare environments where such efficiency is vital. Moreover, this work provides useful knowledge that may assist in developing deployable artificial intelligence solutions in neuro-oncology settings.
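The evaluation metrics named in the abstract (accuracy, precision, recall, and F1 score) can all be derived from a multi-class confusion matrix. The sketch below shows the standard calculations; the confusion matrix is invented for illustration and does not reproduce any result from the study, and the function name `per_class_metrics` is an arbitrary choice.

```python
def per_class_metrics(confusion):
    """Return overall accuracy and (precision, recall, F1) per class.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    results = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return correct / total, results

# Invented confusion matrix (100 test images per class), for illustration only.
labels = ["glioma", "meningioma", "pituitary", "healthy"]
confusion = [
    [95, 3, 1, 1],
    [4, 92, 2, 2],
    [1, 2, 97, 0],
    [0, 1, 0, 99],
]

accuracy, metrics = per_class_metrics(confusion)
print(f"accuracy = {accuracy:.4f}")
for label, (precision, recall, f1) in zip(labels, metrics):
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)
print(f"macro F1 = {macro_f1:.3f}")
```

Reporting per-class values alongside accuracy matters for a task like this one because a model can reach high overall accuracy while still confusing two tumor types with each other.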
Chemometrics and Intelligent Laboratory Systems, vol. 269, Article 105609 (Journal Article)
Citations: 0
Scientific trends in spectroscopy and regression chemometric modelling for the estimation of whole fruit quality: A systematic review
IF 3.8 | Zone 2 (Chemistry) | Q2 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-03 | DOI: 10.1016/j.chemolab.2025.105612
Vicente Amirpasha Tirado-Kulieva, Fidel A. Torres-Guevara, Jhony Alberto Gonzales-Malca, Wilson Castro, Lucía Seguí
Spectroscopic techniques, supported by chemometrics, provide rapid, non-destructive, and sustainable solutions for assessing the quality of whole fruits. This study systematically reviews and analyzes research from 1997 to 2025 on spectroscopy and regression chemometric modelling for estimating whole-fruit quality. A total of 389 English-language articles were retrieved from Scopus using a hybrid strategy that combined an initial search with snowballing. Geographical analysis identified China as the leading country in scientific output, followed by Spain, Italy, and the United States, while Africa and Oceania showed limited participation. Apples, grapes, pears, and mangoes were the most frequently studied fruits, and commonly modeled attributes included soluble solids content (SSC), physicochemical properties, and bioactive compounds. Near-infrared (NIR) and visible-near-infrared (Vis-NIR) spectroscopy were the predominant techniques, complemented by emerging methods such as hyperspectral imaging (HSI) and Raman spectroscopy. Among chemometric approaches, preprocessing relied mainly on hybrid strategies, followed by standard normal variate (SNV) and derivatives. For dimensionality reduction, competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and hybrid methods were the most relevant. Partial least squares regression (PLSR) remained dominant for modeling, although advanced algorithms such as support vector machine regression (SVMR) and deep neural networks were increasingly used. The review also examined current trends and future directions, highlighting progress in robust modeling algorithms, portable and online detection systems, and multimodal spectroscopy with data fusion. Key priorities include methodological harmonization, open data practices, and large-scale field validation. Overall, the findings highlight a transition toward more precise and adaptable systems while underscoring persistent challenges in standardization, real-world validation, and equitable access. This review provides a strategic foundation for advancing non-destructive technologies across the fruit value chain.
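Of the preprocessing methods the review mentions, standard normal variate is simple enough to show in a few lines: each spectrum is centred on its own mean and divided by its own standard deviation. The sketch below uses only the Python standard library; the absorbance values and the scatter parameters are made up for illustration.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own standard deviation."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if std == 0:
        raise ValueError("a flat spectrum cannot be scaled")
    return [(value - mean) / std for value in spectrum]

# Made-up absorbance values for one fruit, and the same spectrum distorted by
# a multiplicative scatter factor (1.5) and a baseline offset (0.2).
base = [0.10, 0.12, 0.30, 0.25, 0.11]
distorted = [1.5 * value + 0.2 for value in base]

print([round(value, 4) for value in snv(base)])
print([round(value, 4) for value in snv(distorted)])
```

The two printed lists agree, which is the point of the transform: a constant offset and a positive multiplicative factor cancel out, so spectra that differ only by such scatter effects become comparable before regression.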
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105612 (Journal Article, published 2025-12-03).
Citations: 0
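The review above names SNV (standard normal variate) among the most common preprocessing steps. As a rough illustration only, and not code taken from any of the reviewed studies, SNV can be sketched in plain Python: each spectrum is centred on its own mean and divided by its own standard deviation. The function name `snv`, the choice of the sample standard deviation, and the absorbance values are assumptions made for this sketch.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit variance."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("SNV needs at least two points per spectrum")
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 denominator); this choice is an assumption.
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if std == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mean) / std for x in spectrum]


# Hypothetical absorbance values, for illustration only.
# The second spectrum is the first one with a different offset and scale.
spectra = [[0.10, 0.20, 0.30], [1.10, 1.30, 1.50]]
corrected = [snv(s) for s in spectra]
```

Because the two hypothetical spectra differ only by an additive offset and a multiplicative scale, SNV maps them to approximately the same corrected values, which is the kind of scatter-related variation the transform is usually applied to reduce.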
Bridging lab-to-clinic: microbiological screening via Swin-Ultra Transformer with transfer learning
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-02 DOI: 10.1016/j.chemolab.2025.105605
Yunxin Wang, Wenjing Zhang, Hongguo Wei, Yuetian Ren, Haosong Du, Wenbin Xu, Ailing Tan, Shuo Chen
Bacterial infections are a critical global health issue, and effective infection control requires rapid and precise pathogen identification. Traditional methods, such as culture and nucleic acid amplification, are often slow and lack sensitivity. Raman spectroscopy combined with deep learning has become a powerful technique for microbial identification. However, factors such as bacterial physiological state, genetic variation, interference from biological materials, and differences in laboratory conditions still make its practical application challenging. This study introduces a feature-enhanced, dual-attention-pathway Shifted Window-Ultra (Swin-Ultra) Transformer architecture, integrated with deep transfer learning, to address challenges such as bacterial physiological states, genetic variation, and discrepancies in laboratory conditions. A Bacterial Pre-trained Transformer (BPT) was developed using the Bacteria-ID database and achieved a classification accuracy of 98.26 %. Fine-tuning on clinical datasets yielded accuracies of 99.80 % for bacterial pathogens and 98.53 % for Cryptococcus genotypes. This approach bridges laboratory models and clinical applications, enhancing unknown pathogen identification, infection control, and public health surveillance, with significant potential to improve patient outcomes.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105605 (Journal Article, published 2025-12-02).
Citations: 0
Research on quality evaluation system and grade classification of Angelica dahurica based on artificial intelligence and multispectral technology
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-02 DOI: 10.1016/j.chemolab.2025.105610
Wei Nie, Xulong Huang, Jin Pei, Chaoxiang Ren, Tao Zhou, Jinyu Du, Huajuan Jiang, HanYi Zhang, Xin Li, Juan Li, Yuhang Li, Yueying Hu, Zhiyu Hao
Angelica dahurica (AD) is both a widely used spice and a precious traditional Chinese medicine. Currently, its quality evaluation predominantly depends on traditional identification methods and physicochemical assessments, which are often subjective or time-consuming, thus limiting their suitability for rapid, non-destructive, and accurate quality evaluation. Therefore, this study constructed a quality evaluation system based on key criteria: shape, color, odour and texture. Experienced traditional medicine experts scored 611 samples according to this system, categorizing them into three quality grades. Imperatorin and isoimperatorin in 30 randomly selected batches were quantified by HPLC, revealing a positive correlation with quality grade and confirming the system's accuracy and reliability. Moreover, quality grading models were established by integrating multispectral imaging technology with artificial intelligence technologies such as CNN and Transformer. The Transformer model achieved the highest accuracy of 88.71 %. Overall, this study improves the objectivity and reproducibility of traditional identification methods. It also demonstrates that integrating artificial intelligence with multispectral imaging enables non-destructive, rapid, and precise classification of AD, offering a novel approach for quality control of medicinal materials.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105610 (Journal Article, published 2025-12-02).
Citations: 0
Multimodal class-aware molecule language model for drug response prediction
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-02 DOI: 10.1016/j.chemolab.2025.105604
Yunfei Xia, Hui Yu, Xiaobo Zhou, Lichuan Gu, Qingyong Wang
Drug response prediction driven by artificial intelligence (AI) offers an efficient solution for accelerating the development of precision medicine and personalized treatment. However, existing AI methods are typically limited by high noise levels, heterogeneity, and limited modal data. These limitations decrease model performance and hinder the identification of critical biomarkers. Therefore, we propose a Multimodal Class-Aware Molecular Language (MCML) model for accurate drug response prediction. Specifically, MCML systematically integrates multimodal features of drugs and cell lines and establishes a cross-modal modeling mechanism to achieve deep fusion of multimodal information. Meanwhile, the model dynamically adjusts the contribution weights based on the class features and importance of the samples, effectively alleviating the noise interference inherent in multimodal data. Furthermore, MCML employs self-supervised learning for pre-training to capture potential molecular interaction patterns, enhancing its ability to adapt to data heterogeneity. Experiments performed on cross-scale multi-omics datasets and single-cell transcriptomic data indicate that MCML significantly outperforms existing state-of-the-art models in RMSE and MAE scores. Case studies further demonstrate that the MCML model can effectively identify tumor microenvironment characteristics associated with drug resistance, showing its ability to discover relevant biomarkers. Additionally, we performed an interpretability analysis of the model to investigate the impact of key features on the prediction results. This research establishes a new methodological paradigm for multimodal tumor data-driven drug response prediction and offers reliable computational tools for personalized cancer treatment decision-making.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105604 (Journal Article, published 2025-12-02).
Citations: 0
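The abstract above compares models by their RMSE and MAE scores. For readers less familiar with the two error measures, a minimal sketch follows; the function names and the response values are hypothetical and have no connection to the MCML experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more strongly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the errors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured and predicted drug responses, for illustration only.
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmse(measured, predicted), mae(measured, predicted))
```

In this toy case a single large error dominates, so RMSE comes out larger than MAE; reporting both, as the abstract does, gives a view of typical error and of sensitivity to outliers.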
LRADA: An adaptive global-local fault diagnosis via low-rank subspace representation with prior-constrained discriminative framework
IF 3.8 Zone 2 (Chemistry) Q2 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-01 DOI: 10.1016/j.chemolab.2025.105608
Yi Luo, Jian Cheng, Si-Yu Chen, Lin-Hao Nie, Qian Cheng, Cheng-Shu Ye, Yuan Xu, Yang Zhao
Fault diagnosis of industrial processes is of great significance for reducing the risk of sensor damage and improving the safety and smooth operation of plants. Data-driven fault diagnosis methods show promise in eliminating redundant signals captured by sensors, enabling superior anomaly detection performance. However, current data-driven approaches face many challenges in signal processing and feature extraction for industrial plants, such as insufficient robustness, poor interpretability of the projection space, and inadequate representation of nonlinear local structures. To address these challenges, we propose a novel diagnostic framework, the low-rank approximation fusion discriminant analysis (LRADA) model, to enhance fault diagnosis for industrial processes. In the LRADA method, prior information about the low-rank subspace is used to learn the global low-rank attributes of the original space. These attributes are then embedded into an improved local linear discriminant analysis framework to enhance the discrimination between different classes. In addition, specific norm constraints are imposed on the projection space to facilitate sparse feature extraction and suppress noise interference. Finally, the effectiveness of the proposed diagnosis method is verified by experimental analysis on three benchmark data sets of industrial processes. The experimental results show the superiority of the proposed method in addressing the above challenges and improving fault diagnosis accuracy in industrial environments. LRADA is available at https://github.com/gitcodelist/LRADA.
Chemometrics and Intelligent Laboratory Systems, Volume 269, Article 105608 (Journal Article, published 2025-12-01).
Citations: 0
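The LRADA abstract above builds on low-rank structure in process data. The abstract does not give the formulation, so the following is not the LRADA algorithm; it is only a generic sketch of what a low-rank (here rank-1) approximation of a data matrix means, computed by power iteration in plain Python. The function name, the iteration count, and the example matrix are all assumptions made for illustration.

```python
import math


def rank1_approximation(matrix, iterations=100):
    """Best rank-1 approximation A v v^T, with v the leading right singular vector.

    v is estimated by power iteration on A^T A. The all-ones start vector is an
    assumption of this sketch and fails if it is orthogonal to v.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T u = (A^T A) v
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("power iteration collapsed; try another start vector")
        v = [x / norm for x in w]
    # A v equals the leading singular value times the left singular vector.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


# Hypothetical sensor matrix whose second row is half the first (exactly rank 1).
data = [[2.0, 4.0], [1.0, 2.0]]
approx = rank1_approximation(data)
```

For the exactly rank-1 example the approximation reproduces the input up to rounding; for noisy data, the discarded part would hold the variation not explained by the dominant direction.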