基于自动相关性确定的高斯过程嵌入式特征选择方法

IF 3.9 2区 工程技术 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Computers & Chemical Engineering Pub Date : 2024-08-23 DOI:10.1016/j.compchemeng.2024.108852
Yushi Deng, Mario Eden, Selen Cremaschi
{"title":"基于自动相关性确定的高斯过程嵌入式特征选择方法","authors":"Yushi Deng,&nbsp;Mario Eden,&nbsp;Selen Cremaschi","doi":"10.1016/j.compchemeng.2024.108852","DOIUrl":null,"url":null,"abstract":"<div><p>In Gaussian Process, feature importance is inversely proportional to the corresponding length scale when applying the Automatic Relevance Determination (ARD) structured kernel function. Features can be selected by ranking them according to their importance. Among the ARD-based feature selection methods, no uniform score exists for quantifying the output variation explained by feature subsets. This study proposes two feature selection approaches using two cumulative feature importance scores, one titled derivative decomposition ratio and the other normalized sensitivity, to determine the optimal feature subset. The performance of the approaches is assessed to test if irrelevant features are accurately identified and if the feature rankings are correct. The approaches are applied to identify relevant dimensionless inputs for a hybrid model estimating liquid entrainment fraction in two-phase flow. The results reveal that the proposed methods can identify the optimal feature subset for the hybrid model without significantly worsening its Root Mean Squared Error.</p></div>","PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"191 ","pages":"Article 108852"},"PeriodicalIF":3.9000,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Gaussian process embedded feature selection method based on automatic relevance determination\",\"authors\":\"Yushi Deng,&nbsp;Mario Eden,&nbsp;Selen Cremaschi\",\"doi\":\"10.1016/j.compchemeng.2024.108852\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In Gaussian Process, feature importance is inversely proportional to the corresponding length scale when applying the Automatic Relevance Determination (ARD) structured kernel function. Features can be selected by ranking them according to their importance. Among the ARD-based feature selection methods, no uniform score exists for quantifying the output variation explained by feature subsets. This study proposes two feature selection approaches using two cumulative feature importance scores, one titled derivative decomposition ratio and the other normalized sensitivity, to determine the optimal feature subset. The performance of the approaches is assessed to test if irrelevant features are accurately identified and if the feature rankings are correct. The approaches are applied to identify relevant dimensionless inputs for a hybrid model estimating liquid entrainment fraction in two-phase flow. The results reveal that the proposed methods can identify the optimal feature subset for the hybrid model without significantly worsening its Root Mean Squared Error.</p></div>\",\"PeriodicalId\":286,\"journal\":{\"name\":\"Computers & Chemical Engineering\",\"volume\":\"191 \",\"pages\":\"Article 108852\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2024-08-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Chemical Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0098135424002709\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Chemical Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0098135424002709","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

摘要

在高斯过程中,当应用自动相关性判定(ARD)结构核函数时,特征的重要性与相应的长度标度成反比。可以根据重要程度对特征进行排序来选择特征。在基于 ARD 的特征选择方法中,没有一种统一的分数可以量化特征子集所解释的输出变化。本研究提出了两种特征选择方法,使用两个累积特征重要性分数(一个是标题导数分解率,另一个是归一化灵敏度)来确定最佳特征子集。对这两种方法的性能进行了评估,以检验是否能准确识别无关特征以及特征排序是否正确。这些方法被应用于识别一个估算两相流中液体夹带分数的混合模型的相关无量纲输入。结果表明,所提出的方法可以为混合模型识别出最佳特征子集,而不会显著恶化其均方根误差。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A Gaussian process embedded feature selection method based on automatic relevance determination

In Gaussian Process, feature importance is inversely proportional to the corresponding length scale when applying the Automatic Relevance Determination (ARD) structured kernel function. Features can be selected by ranking them according to their importance. Among the ARD-based feature selection methods, no uniform score exists for quantifying the output variation explained by feature subsets. This study proposes two feature selection approaches using two cumulative feature importance scores, one titled derivative decomposition ratio and the other normalized sensitivity, to determine the optimal feature subset. The performance of the approaches is assessed to test if irrelevant features are accurately identified and if the feature rankings are correct. The approaches are applied to identify relevant dimensionless inputs for a hybrid model estimating liquid entrainment fraction in two-phase flow. The results reveal that the proposed methods can identify the optimal feature subset for the hybrid model without significantly worsening its Root Mean Squared Error.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Computers & Chemical Engineering
Computers & Chemical Engineering 工程技术-工程:化工
CiteScore
8.70
自引率
14.00%
发文量
374
审稿时长
70 days
期刊介绍: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.
期刊最新文献
The bullwhip effect, market competition and standard deviation ratio in two parallel supply chains CADET-Julia: Efficient and versatile, open-source simulator for batch chromatography in Julia Computer aided formulation design based on molecular dynamics simulation: Detergents with fragrance Model-based real-time optimization in continuous pharmaceutical manufacturing Risk-averse supply chain management via robust reinforcement learning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1