Ordinal responses commonly occur in psychology, e.g., through school grades or rating scales. Where traditionally parametric statistical models like the proportional odds model have been used, machine learning (ML) methods such as random forest (RF) are increasingly employed for ordinal prediction. With new developments in assessment and new data sources yielding increasing quantities of data in the psychological sciences, such ML approaches promise high predictive performance. As RF does not inherently account for ordinality, several extensions have been proposed. A promising approach lies in assigning optimized numeric scores to the ordinal response categories and using regression RF. However, these optimization procedures are computationally expensive and have been shown to yield only situational benefit. In this work, I propose Frequency-Adjusted Borders Ordinal Forest (fabOF), a novel tree ensemble method for ordinal prediction forgoing extensive optimization while offering improved predictive performance in simulation and an illustrative example of student performance. To aid interpretation, I additionally introduce a permutation variable importance measure for fabOF tailored towards ordinal prediction. When applied to the illustrative example, an interest in higher education, mother's education, and study time are identified as important predictors of student performance. The presented methodology is made available through an accompanying R package.
Philip Buczak. Frequency-adjusted borders ordinal forest: A novel tree ensemble method for ordinal prediction. British Journal of Mathematical & Statistical Psychology, 78(2), 594-616. Published 2024-12-08. DOI: 10.1111/bmsp.12375
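As a rough illustration of the score-then-threshold idea in the fabOF abstract above, the sketch below integer-codes the ordinal categories and places category borders at quantiles of numeric predictions, with quantile levels given by the cumulative relative frequencies of the categories in the training data. The abstract does not spell out the border rule, so this rule, the function names, and the toy numbers are assumptions for illustration, not the accompanying R package's API. The numeric predictions would come from a regression forest (for example, its out-of-bag predictions) and are simply passed in here.

```python
from bisect import bisect_left
from collections import Counter


def frequency_adjusted_borders(train_labels, train_scores, n_categories):
    """Return n_categories - 1 borders on the numeric prediction scale.

    train_labels: ordinal labels coded as integers 1..n_categories.
    train_scores: numeric predictions for the same observations
                  (assumed to come from a regression forest fitted to
                  the integer-coded labels; hypothetical input here).
    """
    n = len(train_labels)
    counts = Counter(train_labels)
    sorted_scores = sorted(train_scores)
    borders = []
    cumulative = 0
    for category in range(1, n_categories):
        cumulative += counts.get(category, 0)
        # Empirical quantile at the cumulative relative frequency:
        # the border is the cumulative-th smallest prediction.
        index = min(max(cumulative, 1), n) - 1
        borders.append(sorted_scores[index])
    return borders


def predict_category(score, borders):
    """Map a numeric prediction to a category via the borders.

    A score at or below the c-th border falls into category <= c.
    """
    return bisect_left(borders, score) + 1


# Toy data (made up): three categories with frequencies 2, 3, 1.
labels = [1, 1, 2, 2, 2, 3]
scores = [1.1, 1.3, 1.9, 2.0, 2.2, 2.8]
borders = frequency_adjusted_borders(labels, scores, n_categories=3)
predicted = [predict_category(s, borders) for s in [1.2, 2.0, 2.5]]
```

Because the borders follow the observed category frequencies, no separate optimization of category scores is needed in this sketch. Whether that matches the published procedure in every detail should be checked against the paper.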
Leonie V D E Vogelsmeier, Irina Uglanova, Manuel T Rein, Esther Ulitzsch
In ecological momentary assessment (EMA), respondents answer brief questionnaires about their current behaviours or experiences several times per day across multiple days. The frequent measurement enables a thorough grasp of the dynamics inherent in psychological constructs, but it also increases respondent burden. To lower this burden, respondents may engage in careless and insufficient effort responding (C/IER), leaving data contaminated with responses that do not reflect what researchers want to measure. We introduce a novel approach to investigating C/IER in EMA data. Our approach combines a confirmatory mixture item response theory model separating C/IER from attentive behaviour with latent Markov factor analysis. This enables gauging the occurrence of C/IER and studying transitions among states of different response behaviours including their contextual correlates. The approach can be implemented using R packages. An empirical application showcases the approach's efficacy in pinpointing C/IER instances and gaining insights into their underlying causes. We showcase that the approach identifies various C/IER response patterns but requires heterogeneous and negatively worded items to detect straightlining. In a simulation investigating robustness against unaccounted for changes in measurement models underlying attentive responses, the approach proved robust against heterogeneity in loading patterns but not against heterogeneity in factor structures. Extensions to accommodate the latter are discussed.
Leonie V D E Vogelsmeier, Irina Uglanova, Manuel T Rein, Esther Ulitzsch. Investigating dynamics in attentive and inattentive responding together with their contextual correlates using a novel mixture IRT model for intensive longitudinal data. British Journal of Mathematical & Statistical Psychology. Published 2024-12-07. DOI: 10.1111/bmsp.12373
The statistical foundations of person parameter estimation for the multivariate Thurstonian item response theory (TIRT) model of pairwise comparison and forced-choice (FC) ranking data are elaborated, and several misconceptions in IRT and TIRT are addressed. It is shown that directional information (i.e. multivariate information as defined by Reckase & McKinley, 1991; Applied Psychological Measurement, 15, 361) is not suited to quantify the precision of the estimates unless the Fisher information matrix is diagonal. The asymptotic covariance can be quantified by the inverse Fisher information matrix if the genuine likelihood is used and by the inverse Godambe information for independence likelihood estimation that results from ignoring within-block dependencies of pairwise comparisons. Analytical expressions are provided for the genuine likelihood and the Fisher information matrix for a generalized TIRT model that comprises binary pairwise comparison and ranking designs, which enables maximum likelihood estimation (MLE) and Bayesian estimation (maximum a posteriori probability with normal and Jeffreys prior) of person parameters. The bias of the MLE is quantified, and methods of bias prevention and bias correction are introduced. The correct marginal likelihood of graded pairwise comparisons is provided that might be used for person parameter estimation based on the independence likelihood.
Safir Yousfi. Statistical foundations of person parameter estimation in the Thurstonian IRT model for forced-choice and pairwise comparison designs. British Journal of Mathematical & Statistical Psychology, 78(2), 555-593. Published 2024-11-27. DOI: 10.1111/bmsp.12364
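For orientation, the two asymptotic covariance statements in the TIRT abstract above can be written in the standard notation of composite likelihood theory. The symbols below are assumptions chosen for illustration and are not taken from the paper: $\ell$ is the genuine log-likelihood of the person parameter $\theta$, $\ell_{\mathrm{IL}}$ the independence log-likelihood, $H$ the sensitivity matrix and $J$ the variability matrix.

```latex
% Genuine likelihood: inverse Fisher information
\operatorname{Cov}\bigl(\hat{\theta}_{\mathrm{ML}}\bigr) \approx \mathcal{I}(\theta)^{-1},
\qquad
\mathcal{I}(\theta) = -\,\mathbb{E}\bigl[\nabla^{2}_{\theta}\,\ell(\theta)\bigr]

% Independence likelihood: inverse Godambe (sandwich) information
\operatorname{Cov}\bigl(\hat{\theta}_{\mathrm{IL}}\bigr) \approx G(\theta)^{-1}
  = H(\theta)^{-1}\, J(\theta)\, H(\theta)^{-1},
\qquad
G(\theta) = H(\theta)\, J(\theta)^{-1}\, H(\theta)

% with
H(\theta) = -\,\mathbb{E}\bigl[\nabla^{2}_{\theta}\,\ell_{\mathrm{IL}}(\theta)\bigr],
\qquad
J(\theta) = \operatorname{Cov}\bigl[\nabla_{\theta}\,\ell_{\mathrm{IL}}(\theta)\bigr]
```

When the likelihood is correctly specified, $H = J = \mathcal{I}$ and the sandwich form reduces to the inverse Fisher information. Ignoring within-block dependencies generally breaks this equality, which is why the Godambe form is needed for independence likelihood estimation.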
The Q-matrix is a crucial component of cognitive diagnostic theory and an important basis for the research and practical application of cognitive diagnosis. In practice, the Q-matrix is typically developed by domain experts and may contain some misspecifications, so it needs to be refined using Q-matrix validation methods. Based on signal detection theory, this paper puts forward a new Q-matrix validation method (i.e.,