Hannah M Heister,Casper J Albers,Marie Wiberg,Marieke E Timmerman
{"title":"Item response theory-based continuous test norming.","authors":"Hannah M Heister,Casper J Albers,Marie Wiberg,Marieke E Timmerman","doi":"10.1037/met0000686","DOIUrl":null,"url":null,"abstract":"In norm-referenced psychological testing, an individual's performance is expressed in relation to a reference population using a standardized score, like an intelligence quotient score. The reference population can depend on a continuous variable, like age. Current continuous norming methods transform the raw score into an age-dependent standardized score. Such methods have the shortcoming to solely rely on the raw test scores, ignoring valuable information from individual item responses. Instead of modeling the raw test scores, we propose modeling the item scores with a Bayesian two-parameter logistic (2PL) item response theory model with age-dependent mean and variance of the latent trait distribution, 2PL-norm for short. Norms are then derived using the estimated latent trait score and the age-dependent distribution parameters. Simulations show that 2PL-norms are overall more accurate than those from the most popular raw score-based norming methods cNORM and generalized additive models for location, scale, and shape (GAMLSS). Furthermore, the credible intervals of 2PL-norm exhibit clearly superior coverage over the confidence intervals of the raw score-based methods. The only issue of 2PL-norm is its slightly lower performance at the tails of the norms. Among the raw score-based norming methods, GAMLSS outperforms cNORM. For empirical practice this suggests the use of 2PL-norm, if the model assumptions hold. If not, or the interest is solely in the point estimates of the extreme trait positions, GAMLSS-based norming is a better alternative. The use of the 2PL-norm is illustrated and compared with GAMLSS and cNORM using empirical data, and code is provided, so that users can readily apply 2PL-norm to their normative data. (PsycInfo Database Record (c) 2024 APA, all rights reserved).","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":"10 1","pages":""},"PeriodicalIF":7.6000,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Psychological methods","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1037/met0000686","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
In norm-referenced psychological testing, an individual's performance is expressed in relation to a reference population using a standardized score, like an intelligence quotient score. The reference population can depend on a continuous variable, like age. Current continuous norming methods transform the raw score into an age-dependent standardized score. Such methods have the shortcoming to solely rely on the raw test scores, ignoring valuable information from individual item responses. Instead of modeling the raw test scores, we propose modeling the item scores with a Bayesian two-parameter logistic (2PL) item response theory model with age-dependent mean and variance of the latent trait distribution, 2PL-norm for short. Norms are then derived using the estimated latent trait score and the age-dependent distribution parameters. Simulations show that 2PL-norms are overall more accurate than those from the most popular raw score-based norming methods cNORM and generalized additive models for location, scale, and shape (GAMLSS). Furthermore, the credible intervals of 2PL-norm exhibit clearly superior coverage over the confidence intervals of the raw score-based methods. The only issue of 2PL-norm is its slightly lower performance at the tails of the norms. Among the raw score-based norming methods, GAMLSS outperforms cNORM. For empirical practice this suggests the use of 2PL-norm, if the model assumptions hold. If not, or the interest is solely in the point estimates of the extreme trait positions, GAMLSS-based norming is a better alternative. The use of the 2PL-norm is illustrated and compared with GAMLSS and cNORM using empirical data, and code is provided, so that users can readily apply 2PL-norm to their normative data. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
期刊介绍:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.