<p>We commend Bezdicek and colleagues for a rigorous, consensus-based evaluation of neuropsychological tests for memory, language, visuospatial function, and premorbid intelligence in Parkinson's disease (PD).<span><sup>1</sup></span> Their tiered recommendations provide practical guidance for level-II assessment across Parkinson's disease mild cognitive impairment (PD-MCI) and PD dementia (PDD) and for outcome selection in trials.</p><p>The panel's core messages are clear. Word-list and story-recall measures show acceptable validity, reliability, and sensitivity to change in PD. The Boston Naming Test is a practical reference for language, matrix reasoning is the current visuospatial choice, and the National Adult Reading Test (NART) or its North American version (NAART) improves adjudication through premorbid-ability estimates. These points, including the stated limitations for early-stage sensitivity and motor confounds, provide a strong foundation for standardized workflows.</p><p>We suggest three additions that support the authors' framework.</p><p>First, reliability should also be treated as a design constraint on subsequent analyses. The maximum observable association between two variables is bounded by the square root of the product of their reliabilities. When studies compare or pool correlations across outcomes with unequal reliability, attenuation biases estimates toward zero and can be mistaken for weak construct associations.<span><sup>2</sup></span> Explicit reporting of reliability for each selected measure, and correction for attenuation in secondary analyses where justified, would reduce interpretive variability and improve cross-cohort comparability.</p><p>Second, sensitivity to change should use complementary thresholds. 
Minimal detectable change (MDC) quantifies the smallest difference that exceeds measurement error, and minimal clinically important difference (MCID) reflects patient-perceived benefit.<span><sup>3</sup></span> The latter is therefore often used when assessing clinical-trial outcomes.<span><sup>4</sup></span> If responder classification is applied, both MDC and MCID should inform the cutoff. When MCID is smaller than MDC, individual change scores that exceed MCID but fall below MDC remain indeterminate and should not be interpreted as true change. This distinction is especially relevant given the substantial day-to-day variability of PD symptoms, which may itself surpass MCID values.<span><sup>5</sup></span></p><p>Third, measurement invariance should be established before comparing groups, subtypes, therapy states, or time points.<span><sup>6</sup></span> Without item-level invariance, observed score differences may be driven by a subset of items that function differentially, which decouples scores from the intended construct. Given the paper's critique of uneven domain sensitivity and executive contamination of visuospatial measures, routine checks for differential item functioning and factorial stability would strengthen in
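<p>The first two psychometric safeguards above can be sketched numerically. The following minimal Python sketch applies Spearman's correction for attenuation and the standard MDC formula (1.96 × √2 × SEM, with SEM = SD × √(1 − reliability)); all numeric values (reliabilities, SD, MCID) are hypothetical and purely illustrative, not drawn from any PD dataset.</p>

```python
import math

def disattenuated_r(r_obs: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation: observed correlation divided by
    the square root of the product of the two measures' reliabilities."""
    return r_obs / math.sqrt(rel_x * rel_y)

def mdc95(sd: float, reliability: float) -> float:
    """Minimal detectable change at 95% confidence for a test-retest
    difference: 1.96 * sqrt(2) * SEM, with SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return 1.96 * math.sqrt(2.0) * sem

def classify_change(delta: float, mcid: float, mdc: float) -> str:
    """Responder classification using both thresholds: a change exceeding
    MCID but not MDC cannot be distinguished from measurement error."""
    if abs(delta) < mcid:
        return "no clinically important change"
    if abs(delta) < mdc:
        return "indeterminate (within measurement error)"
    return "detectable and clinically important change"

# Hypothetical example: an observed r = 0.40 between two outcomes with
# reliabilities of 0.70 each understates the construct-level association.
print(round(disattenuated_r(0.40, 0.70, 0.70), 2))  # ~0.57

# Hypothetical recall score with SD = 3.0 points and reliability = 0.70:
# MDC95 ~ 4.55 points, so a 3-point change above an assumed MCID of 2.0
# is still indeterminate.
print(classify_change(3.0, mcid=2.0, mdc=mdc95(3.0, 0.70)))
```

<p>Note how, under these illustrative values, MCID (2.0) is smaller than MDC (≈4.55), producing exactly the indeterminate band discussed above.</p>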
<p>Joshua P. Woller MSc, Alireza Gharabaghi MD. Comment on "Neuropsychological Tests of Memory, Visuospatial, and Language Function in Parkinson's Disease: Review, Critique, and Recommendations". <i>Movement Disorders</i>. 2025;41(1):281-282. doi:10.1002/mds.70148.</p>