Comparison of dimension reduction methods for the identification of heart-healthy dietary patterns

Natalie C. Gasca, R. McClelland
Journal: Observational Studies (journal article)
Published: 2023-03-01
DOI: 10.1353/obs.2023.0020 (https://doi.org/10.1353/obs.2023.0020)
Citations: 0

Abstract

Most nutritional epidemiology studies investigating diet-disease trends use unsupervised dimension reduction methods, such as principal component regression (PCR) and sparse PCR (SPCR), to create dietary patterns. Supervised methods, such as partial least squares (PLS), sparse PLS (SPLS), and Lasso, offer the possibility of more concisely summarizing the foods most related to a disease. In this study we evaluate these five methods for interpretable reduction of food frequency questionnaire (FFQ) data when analyzing a univariate continuous cardiac-related outcome, using a simulation study and a data application. We also demonstrate that, when using PLS, different scientific premises require different approaches to covariate adjustment.

To emulate food groups, we generated blocks of normally distributed predictors with varying intra-block covariances; only nine of 24 predictors contributed to the normally distributed response. When block covariances were informed by FFQ data, the only methods that performed variable selection were Lasso and SPLS, which selected two and four irrelevant variables, respectively. SPLS had the lowest prediction error, and both PLS-based methods constructed four patterns, while PCR and SPCR created 24 patterns.

These methods were applied to 120 FFQ variables and baseline body mass index (BMI) from the Multi-Ethnic Study of Atherosclerosis, which includes 6814 participants aged 45-84, and we adjusted for age, gender, race/ethnicity, exercise, and total energy intake. From 120 variables, PCR created 17 BMI-related patterns and PLS selected one pattern; SPLS used only five variables to create two patterns. All methods exhibited similar predictive performance. Specifically, SPLS's first pattern highlighted hamburger and diet soda intake (positive associations with BMI), reflecting a fast food diet. By selecting fewer patterns and foods, SPLS can create interpretable dietary patterns while maintaining predictive ability.
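To make the simulation design concrete, the following is a minimal sketch, not the authors' code. It draws 24 block-correlated normal predictors, lets nine of them drive a normal response, and compares holdout error for PCR and a single-response PLS fit. The abstract does not give the number of blocks, the covariance values, the coefficients, or which nine predictors are active, so every such setting below is an assumption chosen only for illustration. The helper names (`simulate_blocks`, `pcr_coef`, `pls_coef`, `holdout_mse`) are hypothetical, and the sparse variants (SPCR, SPLS) and Lasso are omitted.

```python
import numpy as np

def simulate_blocks(n, rhos=(0.2, 0.4, 0.6, 0.8), block_size=6, n_active=9, seed=0):
    """Draw block-correlated normal predictors and a normal response.

    All numeric settings are illustrative assumptions, not the paper's values.
    """
    rng = np.random.default_rng(seed)
    p = len(rhos) * block_size              # 4 blocks x 6 = 24 predictors
    cov = np.zeros((p, p))
    for b, rho in enumerate(rhos):
        block = slice(b * block_size, (b + 1) * block_size)
        cov[block, block] = rho             # constant covariance within a block
    np.fill_diagonal(cov, 1.0)              # unit variances
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    beta = np.zeros(p)
    beta[:n_active] = 1.0                   # assumed: first nine predictors are relevant
    y = X @ beta + rng.normal(size=n)
    return X, y, beta

def pcr_coef(X, y, k):
    """Principal component regression: regress y on the first k PC scores."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                            # loadings of the first k components
    gamma, *_ = np.linalg.lstsq(Xc @ V, yc, rcond=None)
    return V @ gamma                        # coefficients on the original predictors

def pls_coef(X, y, k):
    """Single-response PLS with k components, fitted by NIPALS deflation."""
    Xk, yk = X - X.mean(axis=0), y - y.mean()
    W, P, q = [], [], []
    for _ in range(k):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)              # weight vector: direction of covariance with y
        t = Xk @ w                          # component scores
        p = Xk.T @ t / (t @ t)              # predictor loadings
        c = (yk @ t) / (t @ t)              # response loading
        Xk = Xk - np.outer(t, p)            # deflate predictors
        yk = yk - c * t                     # deflate response
        W.append(w); P.append(p); q.append(c)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def holdout_mse(coef, X_tr, y_tr, X_te, y_te):
    """Mean squared prediction error on held-out data, centering with training means."""
    pred = y_tr.mean() + (X_te - X_tr.mean(axis=0)) @ coef
    return float(np.mean((y_te - pred) ** 2))

X_tr, y_tr, beta = simulate_blocks(n=500, seed=0)
X_te, y_te, _ = simulate_blocks(n=500, seed=1)
for k in (1, 2, 4):
    mse_pcr = holdout_mse(pcr_coef(X_tr, y_tr, k), X_tr, y_tr, X_te, y_te)
    mse_pls = holdout_mse(pls_coef(X_tr, y_tr, k), X_tr, y_tr, X_te, y_te)
    print(f"k={k}: PCR holdout MSE={mse_pcr:.2f}, PLS holdout MSE={mse_pls:.2f}")
```

In this assumed setup the most strongly correlated blocks contain no relevant predictors, so the leading principal components track variance that is unrelated to the response, whereas PLS weights are built from the covariance with the response. This illustrates why a supervised method can need fewer patterns, but it says nothing about the specific numbers reported in the study.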