Essential Number of Principal Components and Nearly Training-Free Model for Spectral Analysis.

Yifeng Bie, Shuai You, Xinrui Li, Xuekui Zhang, Tao Lu
IEEE Transactions on Pattern Analysis and Machine Intelligence. Published 2024-08-02. DOI: 10.1109/TPAMI.2024.3436860 (https://doi.org/10.1109/TPAMI.2024.3436860)

Abstract

Learning-enabled spectroscopic analysis, which promises automated real-time analysis of chemicals, faces several challenges. First, a typical machine learning model requires a large number of training samples that physical systems cannot provide. Second, it requires the testing samples to fall within the range of the training samples, which is often not the case in the real world. Further, a spectroscopy device is limited by its memory size, computing power, and battery capacity, so on-site analysis requires highly efficient learning models. In this paper, by analyzing multi-gas mixtures and multi-molecule suspensions, we first show that the data dimension can be reduced by orders of magnitude, because the number of principal components that need to be retained equals the number of independent constituents in the mixture. From this principle, we design highly compact models in which the essential principal components can be extracted directly from the interrelations between the individual chemical properties and the principal components, and only a few training samples are required. Our model can predict constituent concentrations that were not seen in the training dataset and can provide estimates of the measurement noise. This approach can be extended as an effectively standardized method for principal component extraction.
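
The central claim of the abstract, that the number of principal components worth keeping equals the number of independent constituents, can be illustrated with a small simulation. The sketch below is not the authors' model. It assumes linearly mixed spectra with additive white noise, hypothetical Gaussian-shaped pure spectra, a hand-picked eight-sample training design, and a relative singular-value threshold (`rel_tol`) for counting components; all function and variable names are made up for this illustration.

```python
import numpy as np

def make_pure_spectra(n_constituents, n_channels, width=0.05):
    """Hypothetical pure-constituent spectra: one Gaussian line per constituent."""
    axis = np.linspace(0.0, 1.0, n_channels)
    centers = np.linspace(0.2, 0.8, n_constituents)
    return np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / width) ** 2)

def fit_pca_model(train_spectra, train_conc, rel_tol=0.1):
    """PCA by SVD, then a linear map from retained scores to concentrations."""
    mean_spectrum = train_spectra.mean(axis=0)
    centered = train_spectra - mean_spectrum
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Assumed criterion: keep singular values above rel_tol times the largest.
    n_pc = int(np.sum(s > rel_tol * s[0]))
    components = vt[:n_pc]                      # shape (n_pc, n_channels)
    scores = centered @ components.T            # shape (n_samples, n_pc)
    design = np.column_stack([scores, np.ones(len(scores))])
    coef, *_ = np.linalg.lstsq(design, train_conc, rcond=None)
    # Whatever the retained components do not explain is treated as noise.
    residual = centered - scores @ components
    dof = (len(train_spectra) - 1 - n_pc) * (train_spectra.shape[1] - n_pc)
    noise_std = float(np.sqrt(np.sum(residual ** 2) / dof))
    return {"mean_spectrum": mean_spectrum, "components": components,
            "coef": coef, "n_pc": n_pc, "noise_std": noise_std}

def predict_concentrations(model, spectra):
    """Project new spectra onto the retained components and apply the linear map."""
    scores = (spectra - model["mean_spectrum"]) @ model["components"].T
    design = np.column_stack([scores, np.ones(len(scores))])
    return design @ model["coef"]

rng = np.random.default_rng(0)
pure = make_pure_spectra(n_constituents=3, n_channels=1000)

# Eight training mixtures, all concentrations within [0, 1] (arbitrary units).
train_conc = np.array([
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [0.5, 0.5, 0.5], [0.0, 0.0, 0.0],
])
train_spectra = train_conc @ pure + 0.01 * rng.normal(size=(8, 1000))
model = fit_pca_model(train_spectra, train_conc)

# A test mixture whose concentrations lie outside the training range.
test_conc = np.array([[2.0, 1.5, 3.0]])
test_spectra = test_conc @ pure + 0.01 * rng.normal(size=(1, 1000))
predicted = predict_concentrations(model, test_spectra)

print("retained components:", model["n_pc"])
print("estimated noise std:", model["noise_std"])
print("predicted concentrations:", predicted)
```

Under these assumptions the 1000-channel spectra collapse to as many scores as there are simulated constituents, and the fitted map extrapolates to the out-of-range test mixture only because the simulated mixing is linear. Real spectra with nonlinear responses or baseline drift would need a more careful treatment than this sketch provides.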
