Efficient risk-based collection of biospecimens in cohort studies: designing a prospective study of diagnostic performance for multicancer detection tests.
Mark Louie F Ramos, Anil K Chaturvedi, Barry I Graubard, Hormuzd A Katki
American Journal of Epidemiology, pages 243-253. Published 2025-01-08. Journal Article. DOI: 10.1093/aje/kwae139 (https://doi.org/10.1093/aje/kwae139)
Abstract
In cohort studies, it can be infeasible to collect specimens from an entire cohort. For example, to estimate the sensitivity of multiple multi-cancer detection (MCD) assays, we would like to collect an extra 80 mL of blood for cell-free DNA (cfDNA), but this much extra blood is too expensive to collect from everyone. We propose a novel epidemiologic study design that efficiently oversamples those at highest baseline disease risk for specimen collection, to increase the number of future cases with cfDNA blood collected. The variance reduction ratio of our risk-based subsample versus a simple random (sub)sample (SRS) depends primarily on the ratio of risk model sensitivity to the fraction of the cohort selected for specimen collection, subject to a constraint on the risk model specificity. In a simulation in which we chose the 34% of the Prostate, Lung, Colorectal, and Ovarian Screening Trial cohort at highest risk of lung cancer for cfDNA blood collection, we could enrich the number of lung cancers 2.42-fold. The standard deviation of lung-cancer MCD sensitivity was reduced by 31%-33% versus SRS. Risk-based collection of specimens on a subsample of the cohort could be a feasible and efficient approach to collecting extra specimens for molecular epidemiology.
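The ratio of risk model sensitivity to the fraction selected can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the authors' estimator; it assumes that the top-risk subsample captures a share of future cases equal to the risk model sensitivity, that an SRS of the same size captures a share equal to the fraction selected, and that the standard deviation of an unweighted binomial sensitivity estimate scales as one over the square root of the number of cases. The function names are hypothetical. Under these simplifying assumptions the 2.42-fold enrichment at 34% selection implies a risk model sensitivity of about 0.82 and an SD reduction of roughly 36%, somewhat larger than the 31%-33% reported in the abstract, which presumably reflects features of the actual design that this naive approximation ignores.

```python
import math


def enrichment_fold(risk_model_sensitivity: float, fraction_selected: float) -> float:
    """Fold-increase in captured cases versus an SRS of the same size.

    Assumption (illustrative only): the risk-based subsample captures a share
    of future cases equal to the risk model sensitivity, while an SRS captures
    a share equal to the fraction of the cohort selected.
    """
    if not 0.0 < fraction_selected <= 1.0:
        raise ValueError("fraction_selected must be in (0, 1]")
    if not 0.0 <= risk_model_sensitivity <= 1.0:
        raise ValueError("risk_model_sensitivity must be in [0, 1]")
    return risk_model_sensitivity / fraction_selected


def naive_sd_ratio(fold: float) -> float:
    """SD of an unweighted binomial proportion relative to SRS.

    Assumption: SD is proportional to 1 / sqrt(number of cases), so having
    `fold` times as many cases multiplies the SD by 1 / sqrt(fold).
    """
    if fold <= 0.0:
        raise ValueError("fold must be positive")
    return 1.0 / math.sqrt(fold)


# Figures quoted in the abstract.
fraction_selected = 0.34
reported_fold = 2.42

# Risk model sensitivity implied by those two figures under the assumption above.
implied_sensitivity = reported_fold * fraction_selected

fold = enrichment_fold(implied_sensitivity, fraction_selected)
sd_ratio = naive_sd_ratio(fold)

print(f"implied risk model sensitivity: {implied_sensitivity:.3f}")
print(f"enrichment fold: {fold:.2f}")
print(f"naive SD reduction versus SRS: {1.0 - sd_ratio:.1%}")
```

Changing `fraction_selected` while holding the sensitivity fixed shows the trade-off the abstract describes: a smaller selected fraction raises the enrichment only as long as the risk model still captures most future cases.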
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.