Implications of confounding from unmodeled interactions between explanatory variables when using latent variable regression models to make inferences

Journal of Chemometrics (IF 2.3) | Pub Date: 2024-01-12 | DOI: 10.1002/cem.3531
Olav M. Kvalheim, Warren S. Vidar, Tim U. H. Baumeister, Roger G. Linington, Nadja B. Cech

Abstract

With linear dependency between the explanatory variables, partial least squares (PLS) regression is commonly used for regression analysis. If the response variable correlates to a high degree with the explanatory variables, a model with excellent predictive ability can usually be obtained. Ranking of variable importance is commonly used to interpret the model and sometimes this interpretation guides further experimentation. For instance, when analyzing natural product extracts for bioactivity, an underlying assumption is that the highest ranked compounds represent the best candidates for isolation and further testing. A problem with this approach is that in most cases, the number of compounds is larger than the number of samples (and usually much larger) and that the concentrations of the compounds correlate. Furthermore, compounds may interact as synergists or as antagonists. If the modeling process does not account for this possibility, the interpretation can be thoroughly wrong because unmodeled variables that strongly influence the response will give rise to confounding of a first-order PLS model and send the experimenter on a wrong track. We show the consequences of this by a practical example from natural product research. Furthermore, we show that by including the possibility of interactions between explanatory variables, visualization using a selectivity ratio plot may provide model interpretation that can be used to make inferences.
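
The argument above can be illustrated with a small simulation. The sketch below is not the authors' data or code: it is a toy example, assuming five independent, autoscaled "compounds" whose response is driven purely by a synergy between the first two, a plain NumPy NIPALS implementation of PLS1, and the selectivity ratio computed by target projection (explained over residual variance per variable). All names (`pls1_coefficients`, `selectivity_ratio`, `x0*x1`, the sample sizes, and the noise level) are hypothetical choices for illustration. Unlike the setting described in the abstract, the simulated concentrations are uncorrelated and there are more samples than variables, so the sketch shows only the effect of an unmodeled interaction, not the full confounding problem.

```python
import itertools
import numpy as np


def autoscale(A):
    """Center each column and scale it to unit standard deviation."""
    return (A - A.mean(axis=0)) / A.std(axis=0)


def pls1_coefficients(X, y, n_components):
    """PLS1 regression vector via NIPALS; X and y must already be centered."""
    Xr, yr = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                # weight: covariance of each variable with y
        w /= np.linalg.norm(w)
        t = Xr @ w                   # scores
        tt = t @ t
        p = Xr.T @ t / tt            # X loadings
        c = (yr @ t) / tt            # y loading
        Xr = Xr - np.outer(t, p)     # deflate X and y
        yr = yr - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def selectivity_ratio(X, b):
    """Explained / residual variance per variable on the target-projected component."""
    w_tp = b / np.linalg.norm(b)
    t_tp = X @ w_tp
    p_tp = X.T @ t_tp / (t_tp @ t_tp)
    X_tp = np.outer(t_tp, p_tp)
    return (X_tp ** 2).sum(axis=0) / ((X - X_tp) ** 2).sum(axis=0)


def r_squared(X, y, b):
    """In-sample fraction of the (centered) response variance explained."""
    resid = y - X @ b
    return 1.0 - (resid @ resid) / (y @ y)


# Simulated data (assumption): the response depends only on the product x0*x1.
rng = np.random.default_rng(0)
n_samples, n_compounds = 200, 5
C = autoscale(rng.normal(size=(n_samples, n_compounds)))
y = C[:, 0] * C[:, 1] + 0.1 * rng.normal(size=n_samples)
y = y - y.mean()

# First-order model: the compounds only.
X_first = C
# Augmented model: the compounds plus all pairwise interaction terms.
pairs = list(itertools.combinations(range(n_compounds), 2))
inter = np.column_stack([C[:, i] * C[:, j] for i, j in pairs])
X_aug = autoscale(np.hstack([C, inter]))
names = [f"x{j}" for j in range(n_compounds)] + [f"x{i}*x{j}" for i, j in pairs]

b_first = pls1_coefficients(X_first, y, n_components=2)
b_aug = pls1_coefficients(X_aug, y, n_components=2)

r2_first = r_squared(X_first, y, b_first)
r2_aug = r_squared(X_aug, y, b_aug)
sr_aug = selectivity_ratio(X_aug, b_aug)
top = names[int(np.argmax(sr_aug))]

print(f"R2, first-order model: {r2_first:.3f}")
print(f"R2, model with interactions: {r2_aug:.3f}")
print(f"Highest selectivity ratio: {top}")
```

Sorting `sr_aug` and plotting it as a bar chart against `names` would give a simple selectivity ratio plot for the augmented model. In a real application the number of interaction columns grows quadratically with the number of compounds, so the choice of which interactions to include, and the validation of the number of components, would need more care than this sketch takes.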
