Understanding linear interaction analysis with causal graphs
Yongnam Kim, Geryong Jung
British Journal of Mathematical & Statistical Psychology, 78(2), 486-499. Published 2024-11-12. doi:10.1111/bmsp.12369

Interaction analysis using linear regression is widely employed in psychology and related fields, yet it often induces confusion among applied researchers and students. This paper aims to address this confusion by developing intuitive visual explanations based on causal graphs. By leveraging causal graphs with distinct interaction nodes, we provide clear insights into interpreting main effects in the presence of interaction, the rationale behind centering to reduce multicollinearity, and other pertinent topics. The proposed graphical approach could serve as a useful complement to existing algebraic explanations, fostering a more comprehensive understanding of the mechanics of linear interaction analysis.
Regularized Bayesian algorithms for Q-matrix inference based on saturated cognitive diagnosis modelling
Yi Jin, Jinsong Chen
British Journal of Mathematical & Statistical Psychology, 78(2), 459-485. Published 2024-11-09. doi:10.1111/bmsp.12368

Q-matrices are crucial components of cognitive diagnosis models (CDMs), which are used to provide diagnostic information and classify examinees according to their attribute profiles. The absence of an appropriate Q-matrix that correctly reflects item-attribute relationships often limits the widespread use of CDMs. Rather than relying on expert judgment for specification and post-hoc methods for validation, there has been a notable shift towards Q-matrix estimation by adopting Bayesian methods. Nevertheless, their reliance on Markov chain Monte Carlo (MCMC) estimation imposes a substantial computational burden, and their exploratory nature does not scale to large settings. As a scalable and efficient alternative, this study introduces the partially confirmatory framework within a saturated CDM, where the Q-matrix can be partially defined by experts and partially inferred from data. To address the dual needs of accuracy and efficiency, the proposed framework accommodates two estimation algorithms: an MCMC algorithm and a variational Bayesian expectation maximization (VBEM) algorithm. This dual-channel approach extends the model's applicability across a variety of settings. Based on simulated and real data, the proposed framework demonstrated its robustness in Q-matrix inference.
Identifiability analysis of the fixed-effects one-parameter logistic positive exponent model
Jorge González, Jorge Bazán, Mariana Curi
British Journal of Mathematical & Statistical Psychology, 78(2), 440-458. Published 2024-11-09. doi:10.1111/bmsp.12366

In addition to the usual slope and location parameters included in a regular two-parameter logistic model (2PL), the logistic positive exponent (LPE) model incorporates an item parameter that leads to asymmetric item characteristic curves, which have recently been shown to be useful in some contexts. Although this model has been used in some empirical studies, an identifiability analysis (i.e., checking the (un)identified status of a model and searching for identifiability restrictions to make an unidentified model identified) has not yet been conducted. In this paper, we formalize the unidentified status of a large class of fixed-effects item response theory models that includes the LPE model and related versions of it. In addition, we conduct an identifiability analysis of a particular version of the LPE model that is based on the fixed-effects one-parameter logistic model (1PL), which we call the 1PL-LPE model. The main result indicates that the 1PL-LPE model is not identifiable. Ways to make the 1PL-LPE useful in practice and how different strategies for identifiability analyses may affect other versions of the model are also discussed.
Investigating heterogeneity in IRTree models for multiple response processes with score-based partitioning
Rudolf Debelak, Thorsten Meiser, Alicia Gernand
British Journal of Mathematical & Statistical Psychology, 78(2), 420-439. Published 2024-11-04. Open access. doi:10.1111/bmsp.12367

Item response tree (IRTree) models form a family of psychometric models that allow researchers to control for multiple response processes, such as different sorts of response styles, in the measurement of latent traits. While IRTree models can capture quantitative individual differences in both the latent traits of interest and the use of response categories, they maintain the basic assumption that the nature and weighting of latent response processes are homogeneous across the entire population of respondents. In the present research, we therefore propose a novel approach for detecting heterogeneity in the parameters of IRTree models across subgroups that engage in different response behavior. The approach uses score-based tests to reveal parameter heterogeneity along extraneous person covariates, and it can be employed as a model-based partitioning algorithm to identify sources of differences in the strength of trait-based responding or other response processes. Simulation studies demonstrate generally accurate Type I error rates and sufficient power for metric, ordinal, and categorical person covariates and for different types of test statistics, with the potential to differentiate between different types of parameter heterogeneity. An empirical application illustrates the use of score-based partitioning in the analysis of latent response processes with real data.
A convexity-constrained parameterization of the random effects generalized partial credit model
David J. Hessen
British Journal of Mathematical & Statistical Psychology, 78(2), 401-419. Published 2024-10-27. Open access. doi:10.1111/bmsp.12365

An alternative closed-form expression for the marginal joint probability distribution of item scores under the random effects generalized partial credit model is presented. The closed-form expression involves a cumulant generating function and is therefore subject to convexity constraints. As a consequence, complicated moment inequalities are taken into account in maximum likelihood estimation of the parameters of the model, so that the estimation solution is always proper. Another important favorable consequence is that the likelihood function has a single local extreme point, the global maximum. Furthermore, attention is paid to expected a posteriori person parameter estimation, generalizations of the model, and testing the goodness-of-fit of the model. The proposed procedures are demonstrated in an illustrative example.
Handling missing data in variational autoencoder based item response theory
Karel Veldkamp, Raoul Grasman, Dylan Molenaar
British Journal of Mathematical & Statistical Psychology, 78(1), 378-397. Published 2024-10-26. Open access. doi:10.1111/bmsp.12363

Recently, variational autoencoders (VAEs) have been proposed as a method to estimate high-dimensional item response theory (IRT) models on large datasets. Although they drastically improve estimation efficiency compared to traditional methods, they have no natural way to deal with missing values. In this paper, we adapt three existing methods from the VAE literature to the IRT setting and propose one new method. We compare the performance of the different VAE-based methods to each other and to marginal maximum likelihood estimation for increasing levels of missing data in a simulation study for both three- and ten-dimensional IRT models. Additionally, we demonstrate the use of the VAE-based models on an existing algebra test dataset. Results confirm that VAE-based methods are a time-efficient alternative to marginal maximum likelihood, but that a larger number of importance-weighted samples are needed when the proportion of missing values is large.
We consider the problem of determining the maximum value of the point-polyserial correlation between a random variable with an assigned continuous distribution and an ordinal random variable with […]