纸质比色传感器阵列的数据预处理

IF 3.7 2区 化学 Q2 AUTOMATION & CONTROL SYSTEMS Chemometrics and Intelligent Laboratory Systems Pub Date : 2024-09-21 DOI:10.1016/j.chemolab.2024.105237
{"title":"纸质比色传感器阵列的数据预处理","authors":"","doi":"10.1016/j.chemolab.2024.105237","DOIUrl":null,"url":null,"abstract":"<div><div>The responses of the paper-based colorimetric sensor arrays are typically recorded by an imaging device. The color values of the images are subjected to chemometrics data analysis, with a view to extract the relevant information. As is the case with data extracted from other analytical instruments, these data must undergo pre-processing prior to undergoing further analysis. This study represents the first comprehensive and systematic investigation into the impact of data pre-processing techniques on the quality of subsequent data analysis methods applied to imaging data collected from paper-based colorimetric sensor arrays. The use of color difference data (calculated by subtracting the images of the sensors before exposure from those after exposure) revealed that pre-treatment of the data was not a critical factor, although it could reduce the complexity of the model. For example, the number of principal components in the principal component-linear discriminant analysis model was reduced from eight (for data that had not been pre-processed) to three (for pre-processed data) to achieve the same level of accuracy (92 %). Nevertheless, the pivotal role of data pre-processing was elucidated through the examination of data sets collected immediately following exposure to the samples’ vapor. It was demonstrated that the use of an appropriate pre-processing method allows for the elimination or significant reduction of between-sensor variations, obviating the necessity for the inclusion of data from images taken prior to exposure. With regard to the objective of classification, the object pre-processing methods that demonstrated particular promise were mean (or median) centering, Pareto scaling and standard normal variate. To illustrate, in the analysis of volatile organic compounds by an array of metallic nanoparticles, the cross-validation classification accuracy of the unprocessed data, which was 70 %, increased to 95 % when unit variance scaling and range scaling were applied to objects and variables, respectively. In the calibration phase, the majority of pre-processing methods enhanced the quality of the regression models. Using suitable pre-processing methods for both objects and variables, eliminated the need for using the before exposing image of the CSAs.</div></div>","PeriodicalId":9774,"journal":{"name":"Chemometrics and Intelligent Laboratory Systems","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2024-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Data pre-processing for paper-based colorimetric sensor arrays\",\"authors\":\"\",\"doi\":\"10.1016/j.chemolab.2024.105237\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The responses of the paper-based colorimetric sensor arrays are typically recorded by an imaging device. The color values of the images are subjected to chemometrics data analysis, with a view to extract the relevant information. As is the case with data extracted from other analytical instruments, these data must undergo pre-processing prior to undergoing further analysis. This study represents the first comprehensive and systematic investigation into the impact of data pre-processing techniques on the quality of subsequent data analysis methods applied to imaging data collected from paper-based colorimetric sensor arrays. The use of color difference data (calculated by subtracting the images of the sensors before exposure from those after exposure) revealed that pre-treatment of the data was not a critical factor, although it could reduce the complexity of the model. For example, the number of principal components in the principal component-linear discriminant analysis model was reduced from eight (for data that had not been pre-processed) to three (for pre-processed data) to achieve the same level of accuracy (92 %). Nevertheless, the pivotal role of data pre-processing was elucidated through the examination of data sets collected immediately following exposure to the samples’ vapor. It was demonstrated that the use of an appropriate pre-processing method allows for the elimination or significant reduction of between-sensor variations, obviating the necessity for the inclusion of data from images taken prior to exposure. With regard to the objective of classification, the object pre-processing methods that demonstrated particular promise were mean (or median) centering, Pareto scaling and standard normal variate. To illustrate, in the analysis of volatile organic compounds by an array of metallic nanoparticles, the cross-validation classification accuracy of the unprocessed data, which was 70 %, increased to 95 % when unit variance scaling and range scaling were applied to objects and variables, respectively. In the calibration phase, the majority of pre-processing methods enhanced the quality of the regression models. Using suitable pre-processing methods for both objects and variables, eliminated the need for using the before exposing image of the CSAs.</div></div>\",\"PeriodicalId\":9774,\"journal\":{\"name\":\"Chemometrics and Intelligent Laboratory Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2024-09-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Chemometrics and Intelligent Laboratory Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0169743924001771\",\"RegionNum\":2,\"RegionCategory\":\"化学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Chemometrics and Intelligent Laboratory Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169743924001771","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

纸质比色传感器阵列的响应通常由成像设备记录。对图像的颜色值进行化学计量学数据分析,以提取相关信息。与其他分析仪器提取的数据一样,这些数据在进行进一步分析之前必须经过预处理。本研究首次全面系统地探讨了数据预处理技术对后续数据分析方法质量的影响,这些方法适用于从纸质比色传感器阵列采集的成像数据。通过使用色差数据(将曝光前的传感器图像与曝光后的图像相减计算得出)发现,虽然数据预处理可以降低模型的复杂性,但并不是关键因素。例如,主成分-线性判别分析模型中的主成分数量从 8 个(未经过预处理的数据)减少到 3 个(经过预处理的数据),才能达到相同的准确率水平(92%)。尽管如此,通过对暴露于样品蒸汽后立即收集的数据集进行检验,还是阐明了数据预处理的关键作用。结果表明,使用适当的预处理方法可以消除或显著减少传感器之间的差异,从而无需纳入暴露前拍摄的图像数据。在分类目标方面,平均值(或中位数)居中、帕累托缩放和标准正态变量等物体预处理方法显示出了特别的前景。例如,在分析金属纳米粒子阵列的挥发性有机化合物时,如果对对象和变量分别采用单位方差缩放和范围缩放,未经处理数据的交叉验证分类准确率从 70% 提高到 95%。在校准阶段,大多数预处理方法都提高了回归模型的质量。对对象和变量采用适当的预处理方法,就无需使用 CSA 曝光前的图像。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Data pre-processing for paper-based colorimetric sensor arrays
The responses of the paper-based colorimetric sensor arrays are typically recorded by an imaging device. The color values of the images are subjected to chemometrics data analysis, with a view to extract the relevant information. As is the case with data extracted from other analytical instruments, these data must undergo pre-processing prior to undergoing further analysis. This study represents the first comprehensive and systematic investigation into the impact of data pre-processing techniques on the quality of subsequent data analysis methods applied to imaging data collected from paper-based colorimetric sensor arrays. The use of color difference data (calculated by subtracting the images of the sensors before exposure from those after exposure) revealed that pre-treatment of the data was not a critical factor, although it could reduce the complexity of the model. For example, the number of principal components in the principal component-linear discriminant analysis model was reduced from eight (for data that had not been pre-processed) to three (for pre-processed data) to achieve the same level of accuracy (92 %). Nevertheless, the pivotal role of data pre-processing was elucidated through the examination of data sets collected immediately following exposure to the samples’ vapor. It was demonstrated that the use of an appropriate pre-processing method allows for the elimination or significant reduction of between-sensor variations, obviating the necessity for the inclusion of data from images taken prior to exposure. With regard to the objective of classification, the object pre-processing methods that demonstrated particular promise were mean (or median) centering, Pareto scaling and standard normal variate. To illustrate, in the analysis of volatile organic compounds by an array of metallic nanoparticles, the cross-validation classification accuracy of the unprocessed data, which was 70 %, increased to 95 % when unit variance scaling and range scaling were applied to objects and variables, respectively. In the calibration phase, the majority of pre-processing methods enhanced the quality of the regression models. Using suitable pre-processing methods for both objects and variables, eliminated the need for using the before exposing image of the CSAs.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
7.50
自引率
7.70%
发文量
169
审稿时长
3.4 months
期刊介绍: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics: 1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.) 2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration etc.) Routine applications of established chemometrical techniques will not be considered. 3) Development of new software that provides novel tools or truly advances the use of chemometrical methods. 4) Well characterized data sets to test performance for the new methods and software. The journal complies with International Committee of Medical Journal Editors'' Uniform requirements for manuscripts.
期刊最新文献
LTFM: Long-tail few-shot module with loose coupling strategy for mineral spectral identification Recent applications of analytical quality-by-design methodology for chromatographic analysis: A review Layer-wise-residual-driven approach for soft sensing in composite dynamic system based on slow and fast time-varying latent variables Applicability domain of a calibration model based on neural networks and infrared spectroscopy Machine learning based modeling for estimation of drug solubility in supercritical fluid by adjusting important parameters
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1