Forced-choice measures are an alternative to rating-scale surveys designed to reduce response bias, particularly socially desirable responding, by requiring respondents to make rank-order comparisons among two or more statements at a time. Although forced-choice instruments have been used in psychological testing since at least the 1940s, recent methodological advances in item response theory modeling have enabled the estimation of normative scores from the raw ipsative data these assessments produce. These new scoring methods, by making full cross-person comparisons possible, have spurred renewed use of forced-choice tests. This paper chronicles the historical development of forced-choice instruments up to the pivotal introduction of item response models for scoring and uses that foundation to review contemporary methods for their construction and analysis. Our review of modern methods begins by examining approaches to constructing forced-choice blocks, including the use of mean indices, interitem agreement coefficients, and factor loadings. We then discuss the ideal-point and dominance-based item response models used to evaluate the internal structure of forced-choice assessments and compute scores, as well as methods for assessing differential item functioning. Throughout the review, we also synthesize literature on evaluating response processes, reliability, and other considerations in test construction. Finally, we discuss ongoing debates regarding the extent to which forced-choice measures effectively limit response bias, particularly when negatively keyed items are included in blocks, and conclude by outlining directions for future research. To support engagement with the historical literature, we provide an annotated bibliography spanning more than eight decades of forced-choice research. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
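As a minimal, purely illustrative sketch of one block-construction idea mentioned above (matching statements on social desirability means so that no statement in a block is obviously preferable), the following Python snippet greedily pairs items from different traits with similar desirability ratings. All item names, trait labels, and desirability values are hypothetical, and the sketch ignores the agreement coefficients and factor loadings also discussed in the review.

```python
import numpy as np
import pandas as pd

# Hypothetical item pool: each item measures one trait and has a mean
# social desirability rating (names and values are illustrative only).
items = pd.DataFrame({
    "item": [f"item_{i}" for i in range(8)],
    "trait": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "desirability": [4.1, 3.2, 4.0, 3.3, 4.2, 3.1, 3.9, 3.4],
})

# Greedy pairing: combine items from different traits whose desirability
# means are as close as possible.
pool = items.sort_values("desirability").reset_index(drop=True)
blocks, used = [], set()
for i, row in pool.iterrows():
    if i in used:
        continue
    candidates = pool[(~pool.index.isin(used)) & (pool.index != i)
                      & (pool["trait"] != row["trait"])]
    if candidates.empty:
        continue
    j = (candidates["desirability"] - row["desirability"]).abs().idxmin()
    blocks.append((row["item"], pool.loc[j, "item"]))
    used.update({i, j})

print(blocks)  # list of (item, item) pairs forming forced-choice blocks
```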
In psychological research, the common factor model is the most popular measurement model for scale items. However, there is increasing awareness that alternative measurement models, such as formative models, may make more theoretical sense for many kinds of psychological data. We demonstrate the nesting structure of three models specified in a structural equation modeling framework: a reflective confirmatory factor analysis (CFA), a formative Henseler-Ogasawara confirmatory composite analysis, and a formative pseudo-indicator model. Unlike the CFA, the Henseler-Ogasawara confirmatory composite analysis and the pseudo-indicator model allow for the specification of composites in the structural equation modeling framework. In this article, we establish both theoretically and empirically that these three models are nested within one another, as long as the structural part of each model is saturated. As such, the three models can be compared via a chi-square difference test and other fit indices developed for nested models. We report the results of a small simulation evaluating whether the chi-square difference test and the root-mean-square error of approximation (RMSEA) based on it (RMSEA_D) can reliably discern whether data were sampled from a CFA or a formative measurement model, varying sample size, indicator weights, and the strength of the correlation with another concept. In two empirical examples, we illustrate how tools for nested model comparison can be used to distinguish between reflective and formative measurement models. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
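For readers unfamiliar with the mechanics of the comparison described above, here is a minimal sketch of a chi-square difference test and the RMSEA_D computed from it. The fit statistics, sample size, and model labels are hypothetical (not taken from the article), and the RMSEA_D formula shown is the conventional one based on the difference in chi-square values and degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical fit statistics for two nested measurement models
# (the more restrictive model has the larger degrees of freedom).
chisq_restricted, df_restricted = 58.4, 24
chisq_free, df_free = 31.7, 19
n = 500  # assumed sample size

# Chi-square difference test for nested models.
d_chisq = chisq_restricted - chisq_free
d_df = df_restricted - df_free
p_value = chi2.sf(d_chisq, d_df)

# RMSEA based on the difference test (RMSEA_D), conventional formula.
rmsea_d = np.sqrt(max(d_chisq - d_df, 0) / (d_df * (n - 1)))

print(f"chi-square difference = {d_chisq:.2f}, df = {d_df}, p = {p_value:.4f}")
print(f"RMSEA_D = {rmsea_d:.3f}")
```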
The effectiveness of a just-in-time adaptive intervention relies on accurate algorithms (i.e., decision rules) that determine when and how interventions should be administered. Yet empirical investigations evaluating the performance of such decision rules remain scarce. Simulation can be a useful tool for evaluating and refining a range of decision rules prior to implementing just-in-time adaptive interventions in real-world settings. In this study, we evaluate the performance of various decision rules using both an existing data set and a simulated data set that includes measures of craving and alcohol consumption. The tested decision rules include adaptive algorithms, such as previous-day mean craving and online logistic regression, as well as fixed thresholds (e.g., a craving score larger than 1 on a 7-point Likert scale). For each decision rule, we generated confusion matrices and compared performance metrics, including accuracy, specificity, and sensitivity, as well as the number of interventions sent prior to drinking. To assess the robustness of our findings, we simulated a range of data sets with varying underlying distributions and tested decision rule performance across these conditions. In addition, we conducted a multilevel logistic regression to identify the time lag at which the association between the predictor and the outcome variable was strongest. The presented method illustrates an approach to testing and refining one's decision rules prior to launching a time-intensive, smartphone-based real-time intervention. A tutorial for conducting such simulations, along with analysis code, is provided online and in the supplementary materials. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
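As a concrete illustration of the kind of evaluation described above, the sketch below scores a fixed-threshold decision rule against observed drinking with a confusion matrix and the usual performance metrics. The data, column names, and threshold are assumptions for illustration only and are not the study's data or code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row per participant-day, with a craving rating
# (1-7 Likert scale) and whether drinking occurred later that day.
rng = np.random.default_rng(42)
data = pd.DataFrame({
    "craving": rng.integers(1, 8, size=200),
    "drank": rng.integers(0, 2, size=200),
})

# Fixed-threshold decision rule: trigger an intervention when craving > 1.
data["intervene"] = (data["craving"] > 1).astype(int)

# Confusion matrix comparing the rule's decisions with observed drinking.
tp = ((data["intervene"] == 1) & (data["drank"] == 1)).sum()
fp = ((data["intervene"] == 1) & (data["drank"] == 0)).sum()
fn = ((data["intervene"] == 0) & (data["drank"] == 1)).sum()
tn = ((data["intervene"] == 0) & (data["drank"] == 0)).sum()

accuracy = (tp + tn) / len(data)
sensitivity = tp / (tp + fn)   # share of drinking days preceded by an intervention
specificity = tn / (tn + fp)   # share of non-drinking days left alone
n_sent = int(data["intervene"].sum())

print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}, "
      f"specificity={specificity:.2f}, interventions sent={n_sent}")
```

In practice, the same scoring loop can be repeated over alternative thresholds or adaptive rules and over data sets simulated from different underlying distributions, which is the robustness check the abstract describes.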
Equivalence testing, also called negligible effect significance testing (NEST), is appropriate when a researcher would like to find evidence of a negligible association. However, because equivalence testing/NEST procedures are newer and considerably less popular than traditional difference-based null hypothesis significance testing, a gentle introduction to these methods is useful. Accordingly, this tutorial article provides an overview of NEST/equivalence testing procedures by describing the nature of the procedures, explaining when they should be used, defining what considerations should go into their application (including selecting a minimally meaningful effect size), and outlining how they may be conducted and interpreted. The article also includes examples and code in open-source software to illustrate how these procedures may be applied to real data. (PsycInfo Database Record (c) 2026 APA, all rights reserved). 
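For readers who want a concrete starting point, here is a minimal sketch of one common equivalence procedure, the two one-sided tests (TOST) approach applied to a correlation via Fisher's z transformation. The simulated data and the equivalence bounds of plus or minus .20 are assumptions for illustration, not recommendations from the tutorial.

```python
import numpy as np
from scipy import stats

# Hypothetical data: two weakly related variables.
rng = np.random.default_rng(1)
x = rng.normal(size=150)
y = 0.05 * x + rng.normal(size=150)
n = len(x)

# Smallest effect size of interest (equivalence bounds), chosen by the analyst.
lower, upper = -0.20, 0.20

r = np.corrcoef(x, y)[0, 1]
se = 1.0 / np.sqrt(n - 3)  # standard error of Fisher's z

# Two one-sided tests on the Fisher-z scale.
z_lower = (np.arctanh(r) - np.arctanh(lower)) / se   # H0: rho <= lower bound
z_upper = (np.arctanh(r) - np.arctanh(upper)) / se   # H0: rho >= upper bound
p_lower = stats.norm.sf(z_lower)
p_upper = stats.norm.cdf(z_upper)

# Equivalence is claimed only if both one-sided tests reject.
p_tost = max(p_lower, p_upper)
print(f"r = {r:.3f}, TOST p = {p_tost:.4f}")
```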

