Quantitative Models for Causal Analysis in the Era of Genome Wide Association Studies.

Steven S Coughlin
Journal: The Open Health Services and Policy Journal, volume 4, pages 118-122
Publication date: 2011-01-01
Publication type: Journal Article
DOI: https://doi.org/10.2174/1874924001003010118
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3150533/pdf/nihms-275621.pdf
Citations: 1

Abstract

Causal inference in health research is a complex endeavor, partly because the biomedical enterprise involves researchers from many disciplines, including clinical medicine, epidemiology, genetics, basic sciences such as pathology and cell biology, and the behavioral sciences. A multidisciplinary approach is often needed to study health concerns and interpret findings, drawing upon expertise from epidemiologists, statisticians, physicians, nurses, geneticists, psychologists, and other practicing clinicians and researchers. In addition to the diversity of scientific disciplines and professions represented in many study groups, the range of health topics that can be studied is large. It includes physical injuries such as traumatic brain injury; pain syndromes and other neurological conditions; chronic health conditions such as obesity, cancer, respiratory illnesses, and cardiovascular disease; gastrointestinal illnesses such as irritable bowel syndrome; infectious diseases such as H1N1 influenza and hepatitis C; psychiatric conditions such as post-traumatic stress syndrome, depression, and suicide; adverse reproductive outcomes; and other health problems and concerns. Another feature of health research is that researchers employ a range of study designs, including surveillance systems, observational studies with a case-control or cohort design, cross-sectional surveys, and randomized controlled trials. In recent years, observational studies have come to include the large platforms of cases and controls that are identified for genome-wide association studies [1, 2]. In addition to statistical geneticists, the researchers who analyze data from genome-wide association studies and proteomics research often include persons with expertise in bioinformatics or machine learning techniques.

These three features of health research (diversity of scientific disciplines, wide variety of health topics of interest, and alternative study designs) create both challenges and opportunities for researchers attempting to identify causal associations with possible etiologic agents and new therapeutic targets, so that research findings can be translated into targeted clinical interventions and evidence-based therapies. For example, in studies with an observational design, where assignment of exposures is not under the control of the investigators, assessments of causality can be more challenging than in randomized trials [3, 4].

Investigations into the distribution and determinants of health conditions attempt to gain new knowledge through observation and inductive logic. Causal criteria commonly cited in epidemiology include temporal order of exposure and disease, biologic gradient or dose-response curve, biologic plausibility, biologic coherence, and consistency of findings, although some authors have recommended subsets of the criteria or refined definitions [5-7]. The strength of the observed association is also important in some assessments of causality. Causal criteria are widely used as a heuristic aid for assessing whether associations observed in epidemiologic research are causal, although criteria-based methods provide only general guidelines for assessing the causality of associations rather than a strict checklist for identifying a causal relationship [3, 4]. The model of sufficient component causes [8] is widely used in epidemiology as a framework for teaching and understanding multicausality. A sufficient cause is made up of a number of component causes, no one of which is sufficient for the disease or adverse health condition on its own [4, 8]. Diseases and adverse health conditions can be caused by more than one causal mechanism, and each causal mechanism involves the combined action of several component causes. For example, both genetic factors and environmental exposures may have a role in the development of neurologic conditions such as amyotrophic lateral sclerosis. Other examples of diseases caused by interactions between genes and environment include complex, common diseases such as cancer, coronary heart disease, and diabetes [1].

A large and growing literature has dealt with statistical modeling approaches for estimating causal parameters or identifying causal associations using data from observational studies [9-13]. However, much of this important literature has not dealt directly with the special challenges that arise in causal assessments of data from genome-wide association studies that include information about environmental exposures. Recent advances in genetics have challenged traditional frameworks for causal inference in observational research [14].

The goal of this article is to consider challenges that arise in causal assessments of data from genome-wide association studies, which use high-throughput genotyping technologies to analyze biological specimens collected from large numbers of cases and controls for up to one million single nucleotide polymorphisms (SNPs) [1, 2]. Before considering those challenges, I briefly discuss key developments in quantitative models for causal analysis: counterfactual analysis, graphical causal models, and structural equations modeling. I then provide a summary of quantitative techniques for analyzing data from genome-wide association studies and related gene expression and proteomic data, and offer some recommendations for causal assessments of results from such studies.
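The sufficient component cause model described in the abstract can be illustrated with a small sketch. This is only a toy illustration, not a method from the article: the component names (`genetic_variant`, `environmental_exposure`, `other_factor`) and the two causal mechanisms are hypothetical, chosen to show that disease can arise through more than one mechanism while no single component is sufficient on its own.

```python
# Toy illustration of the sufficient component cause model.
# All component names and mechanisms below are hypothetical assumptions.

# Each sufficient cause is a set of component causes; the mechanism is
# complete only when every component in the set is present.
SUFFICIENT_CAUSES = [
    frozenset({"genetic_variant", "environmental_exposure"}),
    frozenset({"environmental_exposure", "other_factor"}),
]

# Every component that appears in at least one sufficient cause.
ALL_COMPONENTS = sorted(set().union(*SUFFICIENT_CAUSES))


def disease_occurs(present):
    """Return True if at least one sufficient cause is fully present."""
    present = set(present)
    return any(cause <= present for cause in SUFFICIENT_CAUSES)


def is_sufficient_alone(component):
    """Return True if the component by itself completes a mechanism."""
    return disease_occurs({component})


def is_necessary(component):
    """Return True if the component belongs to every sufficient cause."""
    return all(component in cause for cause in SUFFICIENT_CAUSES)


if __name__ == "__main__":
    for component in ALL_COMPONENTS:
        print(
            component,
            "sufficient alone:", is_sufficient_alone(component),
            "necessary:", is_necessary(component),
        )
```

In this toy setup, a gene-environment interaction corresponds to a mechanism that needs both the genetic and the environmental component before it is complete. The set-based representation is a design choice for readability; it says nothing about how such mechanisms would be estimated from data.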