Validation of algorithms in studies based on routinely collected health data: general principles.
Vera Ehrenstein, Maja Hellfritzsch, Johnny Kahlert, Sinéad M Langan, Hisashi Urushihara, Danica Marinac-Dabic, Jennifer L Lund, Henrik Toft Sørensen, Eric I Benchimol
American Journal of Epidemiology, pp. 1612-1624. Published 2024-11-04. DOI: 10.1093/aje/kwae071 (https://doi.org/10.1093/aje/kwae071)
Abstract
Clinicians, researchers, regulators, and other decision-makers increasingly rely on evidence from real-world data (RWD), including data routinely accumulating in health and administrative databases. RWD studies often rely on algorithms to operationalize variable definitions. An algorithm is a combination of codes or concepts used to identify persons with a specific health condition or characteristic. Establishing the validity of algorithms is a prerequisite for generating valid study findings that can ultimately inform evidence-based health care. In this paper, we aim to systematize terminology, methods, and practical considerations relevant to the conduct of validation studies of RWD-based algorithms. We discuss measures of algorithm accuracy, gold/reference standards, study size, prioritization of accuracy measures, algorithm portability, and implications for interpretation. Information bias is common in epidemiologic studies, underscoring the importance of transparency in decisions about choosing and prioritizing measures of algorithm validity. The validity of an algorithm should be judged in the context of a data source, and one size does not fit all. Prioritizing validity measures within a given data source depends on the role of a given variable in the analysis (eligibility criterion, exposure, outcome, or covariate). Validation work should be part of routine maintenance of RWD sources. This article is part of a Special Collection on Pharmacoepidemiology.
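The abstract describes an algorithm as a combination of codes and refers to measures of algorithm accuracy without listing them. The sketch below is an illustration only, not the authors' method: it assumes the usual four measures (sensitivity, specificity, positive predictive value, negative predictive value) computed against a reference standard. The code list, the names `algorithm_positive` and `accuracy_measures`, and the ten patient records are all hypothetical and made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical code list (ICD-10 three-character categories); not from the paper.
ALGORITHM_CODES = {"K50", "K51"}


def algorithm_positive(record_codes: Iterable[str]) -> bool:
    """Flag a person if any recorded code falls in the algorithm's code list."""
    return any(code[:3] in ALGORITHM_CODES for code in record_codes)


@dataclass
class Accuracy:
    sensitivity: Optional[float]
    specificity: Optional[float]
    ppv: Optional[float]
    npv: Optional[float]


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None when a measure is undefined (empty denominator).
    return numerator / denominator if denominator else None


def accuracy_measures(algorithm_flags, reference_flags) -> Accuracy:
    """Cross-classify algorithm results against the reference standard."""
    tp = fp = fn = tn = 0
    for flagged, truly_has in zip(algorithm_flags, reference_flags):
        if flagged and truly_has:
            tp += 1
        elif flagged and not truly_has:
            fp += 1
        elif not flagged and truly_has:
            fn += 1
        else:
            tn += 1
    return Accuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Made-up records: (codes in the database, condition present per reference standard).
records = [
    (["K50.1"], True),
    (["K51.0", "I10"], True),
    (["I10"], True),
    (["K50.9"], False),
    (["E11.9"], False),
    (["J45"], False),
    ([], False),
    (["K51.9"], True),
    (["I10"], False),
    (["K50.0"], False),
]

flags = [algorithm_positive(codes) for codes, _ in records]
reference = [truth for _, truth in records]
result = accuracy_measures(flags, reference)
print(result)
```

Which of these measures matters most depends, as the abstract notes, on the role the variable plays in the analysis, so a sketch like this would typically report all four rather than a single summary number.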
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.