A comparison of scoring algorithms for the NIH Toolbox executive function tasks in a U.S. norming sample.

IF 3.3 | Zone 2, Psychology | JCR Q1, PSYCHOLOGY, CLINICAL | Psychological Assessment | Pub Date: 2024-12-01 | DOI: 10.1037/pas0001350
Yusuke Shono, Berivan Ece, Emily H Ho, Aaron J Kaat, Erica M LaForte, Ezgi Ayturk, Richard Gershon
Psychological Assessment, 36(12), 760-771. Journal Article.

Abstract

Executive function (EF) has been extensively linked to various behavioral, clinical, and educational outcomes. There have been, however, few systematic investigations into how best to score EF tasks using speed and accuracy performance, particularly how to generate a summary, norm-referenced score. Using data from an updated norming study for the NIH Toolbox Version 3 (NIHTB V3) with the general U.S. population aged between 3 and 85 (N = 3,794; 52.3% female; M_age = 25.06, SD_age = 22.92), we empirically evaluated and compared several scoring algorithms for two EF tests: the Dimensional Change Card Sort Test (a test of cognitive flexibility) and the Flanker Test (a test of inhibitory control). Results showed that joint scoring algorithms integrating speed and accuracy into single scores (namely, the rate-correct score, the linear integrated speed-accuracy score, and the speed-accuracy additive score) provided more robust psychometric evidence for the EF tests than single-index scores of accuracy and speed. These integrated speed-accuracy scores were consistent and stable within and across tasks and time; were similar to scores on another well-validated EF measure but, as predicted, unrelated to scores on a crystallized intelligence measure; and increased rapidly from early childhood through late adolescence/early adulthood before declining toward late adulthood. The rate-correct score was particularly free from ceiling effects and sensitive to age-related changes and variability in EF performance. Among the scoring algorithms compared, we recommend the rate-correct score, which served as the basis for generating new NIHTB V3 norm-referenced scores, with good test-retest reliability (Dimensional Change Card Sort = .77, Flanker = .81) and acceptable convergent and discriminant validity. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
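For readers unfamiliar with the integrated scores named in the abstract, the following is a minimal sketch of two of them, using the definitions commonly given in the speed-accuracy literature: the rate-correct score as the number of correct responses divided by the total response time, and the linear integrated speed-accuracy score (LISAS) as mean correct response time plus the error proportion weighted by the ratio of the response-time and error standard deviations. The abstract does not give the formulas or the preprocessing the authors used, so the function names, the single-condition simplification of LISAS, and the example trial data below are all assumptions for illustration, not the NIHTB V3 implementation. The speed-accuracy additive score is not sketched because its definition is not stated here.

```python
from statistics import mean, pstdev


def rate_correct_score(rts_sec, correct):
    """Correct responses per second: n_correct / sum of all response times.

    rts_sec: response time of every trial, in seconds (hypothetical input format).
    correct: 1 for a correct trial, 0 for an error, same order as rts_sec.
    """
    total_time = sum(rts_sec)
    if total_time <= 0:
        raise ValueError("total response time must be positive")
    return sum(correct) / total_time


def lisas(rts_sec, correct):
    """Single-condition LISAS: mean correct RT + (S_RT / S_PE) * PE.

    Assumption: with one condition, S_PE is taken as the standard deviation
    of the 0/1 error indicator, sqrt(PE * (1 - PE)).
    """
    correct_rts = [rt for rt, ok in zip(rts_sec, correct) if ok]
    if not correct_rts:
        raise ValueError("need at least one correct trial")
    pe = 1 - sum(correct) / len(correct)   # proportion of errors
    s_pe = (pe * (1 - pe)) ** 0.5          # SD of the error indicator
    if s_pe == 0:
        # No errors: the accuracy term vanishes and only speed remains.
        return mean(correct_rts)
    s_rt = pstdev(correct_rts)             # SD of correct response times
    return mean(correct_rts) + (s_rt / s_pe) * pe


# Made-up trials for illustration only: three correct, one error.
example_rts = [0.5, 0.5, 1.0, 2.0]
example_correct = [1, 1, 1, 0]
print(rate_correct_score(example_rts, example_correct))
print(lisas(example_rts, example_correct))
```

Under these definitions a higher rate-correct score means better performance, whereas a higher LISAS means slower or more error-prone performance, so the two are not on the same scale or direction.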

Source journal: Psychological Assessment (PSYCHOLOGY, CLINICAL)
CiteScore: 5.70
Self-citation rate: 5.60%
Articles published: 167
Journal description: Psychological Assessment is concerned mainly with empirical research on measurement and evaluation relevant to the broad field of clinical psychology. Submissions are welcome in the areas of assessment processes and methods, including clinical judgment and the application of decision-making models; paradigms derived from basic psychological research in cognition, personality-social psychology, and biological psychology; and the development, validation, and application of assessment instruments, observational methods, and interviews.