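Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers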

Ryan J. Kramer, Kristen E. Rhodin, Aaron Therien, Vignesh Raman, Austin Eckhoff, Camryn Thompson, Betty C. Tong, Dan G. Blazer III, Michael E. Lidsky, Thomas D’Amico, Daniel P. Nussbaum
Surgical Oncology Insight. Published 2024-02-03. DOI: 10.1016/j.soi.2024.100009. Source: https://www.sciencedirect.com/science/article/pii/S2950247024000057
Citations: 0

Abstract

Objective

Patients with gastrointestinal malignancies represent a heterogeneous population, even among those with similar stage and treatment pathways. Here, we used dimensionality reduction in the National Cancer Database (NCDB) to inform unsupervised clustering of patients with three gastrointestinal malignancies and examined outcomes among these computationally derived groups.

Methods

The NCDB was queried for three cohorts of patients receiving multimodal therapy: stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. Multiple correspondence analysis (MCA), a dimensionality reduction technique well suited to categorical variables such as the demographic data in the NCDB, was performed on each cohort using demographic and tumor characteristics as input variables. Principal components were analyzed to derive clusters. Outcomes for each cluster were compared using Kaplan-Meier survival methods.
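As a rough illustration of the MCA step described above, the sketch below runs correspondence analysis on a one-hot indicator matrix with NumPy. The abstract does not state which software or settings the authors used, so this is one standard formulation and not their implementation. The toy records and the function name `mca_row_coordinates` are hypothetical.

```python
import numpy as np

def mca_row_coordinates(categories, n_components=2):
    """MCA as correspondence analysis of the one-hot indicator matrix.

    categories: 2-D integer array, one row per patient and one column per
    categorical variable (the codes are arbitrary labels).
    Returns (row principal coordinates, principal inertias).
    """
    categories = np.asarray(categories)
    # Build the indicator matrix Z: one block of 0/1 columns per variable.
    blocks = []
    for j in range(categories.shape[1]):
        levels = np.unique(categories[:, j])
        blocks.append((categories[:, [j]] == levels[None, :]).astype(float))
    Z = np.hstack(blocks)

    P = Z / Z.sum()        # correspondence matrix
    r = P.sum(axis=1)      # row masses
    c = P.sum(axis=0)      # column masses
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    # Row principal coordinates: D_r^{-1/2} U diag(singular values).
    coords = (U * sing) / np.sqrt(r)[:, None]
    return coords[:, :n_components], sing[:n_components] ** 2

# Hypothetical toy data: 6 patients, 4 binary-coded categorical variables
# (stand-ins for, e.g., income, education, age group, insurance type).
toy = np.array([
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])
coords, inertias = mca_row_coordinates(toy)
```

The resulting low-dimensional coordinates could then be passed to a clustering step; the abstract does not specify which clustering procedure was applied, so none is shown here.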

Results

For esophageal (n = 11,399), gastric (n = 2,033), and colon (n = 72,057) cancer, the same four variables were identified as highly representative: income quartile, education quartile, age quartile, and insurance type. Survival analysis demonstrated significant differences in overall survival between clusters in esophageal (p < 0.0001) and colon (p < 0.0001) cancer, but not gastric cancer (p = 0.56). Clusters defined by high income, high education, younger age, and private insurance fared better.
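The cluster-level survival comparison reported above can be sketched in plain Python as a Kaplan-Meier estimator plus a two-group log-rank test. The abstract names Kaplan-Meier methods but not the test behind the p-values, so the log-rank test is an assumption here, and the two-group form is a simplification of a multi-cluster comparison. The follow-up times, event flags, and function names are hypothetical.

```python
import math
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value).

    Assumes at least one event time at which both groups are still at risk,
    so that the variance term is positive.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical follow-up data (months) for two clusters.
cluster_a_times, cluster_a_events = [5, 8, 12, 20, 24, 30], [1, 1, 1, 0, 1, 0]
cluster_b_times, cluster_b_events = [2, 3, 4, 6, 7, 9], [1, 1, 1, 1, 0, 1]

curve_a = kaplan_meier(cluster_a_times, cluster_a_events)
chi2, p_value = logrank_two_groups(
    cluster_a_times, cluster_a_events, cluster_b_times, cluster_b_events
)
```

For real registry data, an established survival-analysis package would be a safer choice than hand-rolled code; the sketch is only meant to make the comparison concrete.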

Conclusions

Using MCA, we identified combinations of four demographic variables in the NCDB that group patients with stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. These groupings had significantly different survival outcomes in colon and esophageal cancer. This work serves as proof of concept for the utility of unsupervised clustering in outcomes research for surgical malignancies and identifies at-risk populations.
