A Generalized Integration Approach to Association Analysis with Multi-category Outcome: An Application to a Tumor Sequencing Study of Colorectal Cancer and Smoking.

Journal of the American Statistical Association (IF 3.0; CAS Zone 1, Mathematics; JCR Q1, Statistics & Probability). Pub Date: 2023-01-01. Epub Date: 2022-09-20. DOI: 10.1080/01621459.2022.2105703
Jiayin Zheng, Xinyuan Dong, Christina C Newton, Li Hsu
Journal of the American Statistical Association, vol. 118, no. 541, pp. 29-42. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168026/pdf/

Abstract

Cancer is a heterogeneous disease, and rapid progress in sequencing and -omics technologies has enabled researchers to characterize tumors comprehensively. This has stimulated intense interest in studying how risk factors are associated with heterogeneous tumor features. The Cancer Prevention Study-II (CPS-II) cohort is one of the largest prospective studies and is particularly valuable for elucidating associations between cancer and risk factors. In this paper, we investigate the association of smoking with novel colorectal tumor markers obtained from targeted sequencing. However, due to cost and logistical difficulties, only a limited number of tumors can be assayed, which limits our ability to study these associations. Meanwhile, there are extensive studies assessing the association of smoking with overall cancer risk and with established colorectal tumor markers. Importantly, such summary information is readily available from the literature. By linking this summary information to the parameters of interest through proper constraints, we develop a generalized integration approach for the polytomous logistic regression model with outcomes characterized by tumor features. The proposed approach gains efficiency by maximizing the joint likelihood of individual-level tumor data and external summary information under constraints that narrow the parameter search space. We apply the proposed method to the CPS-II data and identify associations of smoking with colorectal cancer risk that differ by the mutational status of the APC and RNF43 genes, neither of which is identified by the conventional analysis of CPS-II individual data only. These results help to better understand the role of smoking in the etiology of colorectal cancer.
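
For orientation, the polytomous logistic regression model named in the abstract can be written in its standard textbook form, as sketched below. The notation is an assumption made here for illustration and is not taken from the paper: Y = 0 denotes cancer-free, Y = k (k = 1, ..., K) denotes a tumor subtype defined by marker status, and x collects covariates such as smoking. The second display is only a generic statement of constrained joint-likelihood maximization; the external-information term and the constraint function g are placeholders, since the abstract does not give their exact form.

```latex
% Standard polytomous (multinomial) logistic model; notation assumed, not the paper's.
% Y = 0: cancer-free; Y = k (k = 1, ..., K): tumor subtype k; x: covariates (e.g., smoking).
\[
  \Pr(Y = k \mid x)
    = \frac{\exp(\alpha_k + \beta_k^{\top} x)}
           {1 + \sum_{j=1}^{K} \exp(\alpha_j + \beta_j^{\top} x)},
  \qquad
  \Pr(Y = 0 \mid x)
    = \frac{1}{1 + \sum_{j=1}^{K} \exp(\alpha_j + \beta_j^{\top} x)} .
\]

% Generic constrained joint-likelihood estimator (placeholder form).
% \ell_ind: log-likelihood of the individual-level tumor data;
% \ell_ext: log-likelihood contribution of the external summary estimates;
% g: constraints linking the subtype-specific parameters to the summary information.
\[
  \hat{\theta}
    = \arg\max_{\theta} \bigl\{ \ell_{\mathrm{ind}}(\theta) + \ell_{\mathrm{ext}}(\theta) \bigr\}
  \quad \text{subject to} \quad g(\theta) = 0,
  \qquad
  \theta = (\alpha_1, \beta_1, \ldots, \alpha_K, \beta_K).
\]
```

Under this reading, each subtype k has its own smoking coefficient in \beta_k, and the constraint restricts those coefficients to values compatible with the published summary estimates, which is how the search space is narrowed.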

Source journal: Journal of the American Statistical Association
CiteScore: 7.50
Self-citation rate: 8.10%
Articles published: 168
Review time: 12 months
About the journal: Established in 1888 and published quarterly in March, June, September, and December, the Journal of the American Statistical Association (JASA) has long been considered the premier journal of statistical science. Articles focus on statistical applications, theory, and methods in economic, social, physical, engineering, and health sciences. Important books contributing to statistical advancement are reviewed in JASA. JASA is indexed in Current Index to Statistics and MathSci Online and reviewed in Mathematical Reviews. JASA is abstracted by Access Company and is indexed and abstracted in the SRM Database of Social Research Methodology.