Bayesian item response theory to estimate power in clinical trials with patient-reported outcomes as endpoints.
Authors: Xiaohang Mei, Joseph C Cappelleri, Jinxiang Hu
Journal: Quality of Life Research (journal article, published 2025-01-08)
DOI: 10.1007/s11136-024-03874-y (https://doi.org/10.1007/s11136-024-03874-y)
Purpose: Patient-Reported Outcomes (PROs) are widely used in clinical trials, epidemiological research, quality of life (QOL) studies, routine clinical care, and medical surveillance. The Patient Reported Outcomes Measurement Information System (PROMIS) is a system of reliable and standardized measures of PROs developed with Item Response Theory (IRT) using latent scores. Power estimation is critical to clinical trials and research designs. However, in clinical trials with PROs as endpoints, observed scores are often used to calculate power rather than latent scores.
Methods: In this paper, we conducted a series of simulations to compare the power obtained with IRT latent scores (Bayesian IRT and frequentist IRT) against the power obtained with observed scores, focusing on the small sample sizes common in pilot studies and Phase I/II trials. Taking the PROMIS depression measures as an example, we simulated data and estimated power for two-armed clinical trials, manipulating the following factors: sample size, effect size, and number of items. We also examined how misspecification of effect size affected power estimation.
Results: Our results showed that the Bayesian IRT, which incorporated prior information into latent score estimation, yielded the highest power, especially when sample size was small. The effect of misspecification diminished as sample size increased.
Conclusion: For power estimation in two-armed clinical trials with standardized PRO endpoints, if a medium effect size or larger is expected, we recommend Bayesian IRT (BIRT) simulation with well-grounded informative priors and a total sample size of at least 40.
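The simulation approach described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the graded-response item parameters are hypothetical rather than the calibrated PROMIS depression values, all items share the same parameters, latent scores are obtained by expected a posteriori (EAP) scoring under a standard normal prior as a simple stand-in for full Bayesian IRT estimation with informative priors, and the two arms are compared with a Welch statistic referred to a normal distribution, which is slightly anti-conservative at small sample sizes. The function names are illustrative only.

```python
import math
import random
from statistics import NormalDist, mean, variance

# Hypothetical graded-response item parameters (NOT calibrated PROMIS values).
DISCRIMINATION = 2.0
THRESHOLDS = [-1.0, 0.0, 1.0, 2.0]  # 5 ordered categories, scored 0..4


def category_probs(theta, a=DISCRIMINATION, b=THRESHOLDS):
    """Graded response model: P(Y = k | theta) for k = 0..len(b)."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(b) + 1)]


def simulate_responses(theta, n_items, rng):
    """Draw one response per item for a person with latent trait theta."""
    probs = category_probs(theta)
    return rng.choices(range(len(probs)), weights=probs, k=n_items)


# Grid approximation used for EAP scoring with a standard normal prior.
GRID = [-4.0 + 0.2 * i for i in range(41)]
PRIOR = [NormalDist().pdf(t) for t in GRID]
GRID_LOGP = [[math.log(p) for p in category_probs(t)] for t in GRID]


def eap_score(responses):
    """Posterior mean of theta given the responses (assumed N(0, 1) prior)."""
    weights = [
        prior * math.exp(sum(logp[y] for y in responses))
        for prior, logp in zip(PRIOR, GRID_LOGP)
    ]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def rejects_null(x, y, alpha=0.05):
    """Two-sided Welch statistic with a normal approximation to its reference."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0.0:
        return False
    z = (mean(x) - mean(y)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def estimate_power(n_per_arm, effect_size, n_items, n_sims=500, seed=0):
    """Monte Carlo power for observed sum scores and for EAP latent scores.

    effect_size is the shift of the treated arm's latent mean, in SD units.
    """
    rng = random.Random(seed)
    hits = {"observed": 0, "eap": 0}
    for _ in range(n_sims):
        scores = {"observed": ([], []), "eap": ([], [])}
        for arm, shift in enumerate((0.0, effect_size)):
            for _ in range(n_per_arm):
                y = simulate_responses(rng.gauss(shift, 1.0), n_items, rng)
                scores["observed"][arm].append(sum(y))
                scores["eap"][arm].append(eap_score(y))
        for name, (control, treated) in scores.items():
            hits[name] += rejects_null(control, treated)
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    # Total sample size of 40 (20 per arm) with a medium effect size of 0.5.
    print(estimate_power(n_per_arm=20, effect_size=0.5, n_items=8))
```

Varying `n_per_arm`, `effect_size`, and `n_items` mirrors the three factors manipulated in the study. Setting `effect_size=0.0` gives a quick check of the empirical type I error rate of the test used in the sketch.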
About the journal:
Quality of Life Research is an international, multidisciplinary journal devoted to the rapid communication of original research, theoretical articles and methodological reports related to the field of quality of life, in all the health sciences. The journal also offers editorials, literature, book and software reviews, correspondence and abstracts of conferences.
Quality of life has become a prominent issue in biometry, philosophy, social science, clinical medicine, health services and outcomes research. The journal's scope reflects the wide application of quality of life assessment and research in the biological and social sciences. All original work is subject to peer review for originality, scientific quality and relevance to a broad readership.
This is an official journal of the International Society of Quality of Life Research.