Pub Date : 2024-08-20DOI: 10.1177/00131644241268073
Yan Xia, Xinchang Zhou
Parallel analysis has been considered one of the most accurate methods for determining the number of factors in factor analysis. One major advantage of parallel analysis over traditional factor retention methods (e.g., Kaiser's rule) is that it addresses the sampling variability of eigenvalues obtained from the identity matrix, representing the correlation matrix for a zero-factor model. This study argues that we should also address the sampling variability of eigenvalues obtained from the observed data, such that the results would inform practitioners of the variability of the number of factors across random samples. Thus, this study proposes to revise the parallel analysis to provide the proportion of random samples that suggest k factors (k = 0, 1, 2, . . .) rather than a single suggested number. Simulation results support the use of the proposed strategy, especially for research scenarios with limited sample sizes where sampling fluctuation is concerning.
平行分析法被认为是确定因子分析中因子个数的最准确方法之一。与传统的因子保留方法(如凯撒法则)相比,平行分析法的一大优势在于它能解决从特征矩阵(代表零因子模型的相关矩阵)中获得的特征值的抽样变异性问题。本研究认为,我们还应该解决从观测数据中获得的特征值的抽样变异性问题,从而使研究结果能够告知从业人员不同随机样本中因子数量的变异性。因此,本研究建议修改并行分析,以提供建议 k 个因子(k = 0、1、2、...)的随机样本比例,而不是单一的建议因子数。模拟结果支持使用所建议的策略,尤其是在样本量有限、抽样波动令人担忧的研究场景中。
{"title":"Improving the Use of Parallel Analysis by Accounting for Sampling Variability of the Observed Correlation Matrix.","authors":"Yan Xia, Xinchang Zhou","doi":"10.1177/00131644241268073","DOIUrl":"10.1177/00131644241268073","url":null,"abstract":"<p><p>Parallel analysis has been considered one of the most accurate methods for determining the number of factors in factor analysis. One major advantage of parallel analysis over traditional factor retention methods (e.g., Kaiser's rule) is that it addresses the sampling variability of eigenvalues obtained from the identity matrix, representing the correlation matrix for a zero-factor model. This study argues that we should also address the sampling variability of eigenvalues obtained from the observed data, such that the results would inform practitioners of the variability of the number of factors across random samples. Thus, this study proposes to revise the parallel analysis to provide the proportion of random samples that suggest <i>k</i> factors (<i>k</i> = 0, 1, 2, . . .) rather than a single suggested number. Simulation results support the use of the proposed strategy, especially for research scenarios with limited sample sizes where sampling fluctuation is concerning.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241268073"},"PeriodicalIF":2.1,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11572087/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142675458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-07DOI: 10.1177/00131644241268128
Kylie Gorney, Sandip Sinharay
Test-takers, policymakers, teachers, and institutions are increasingly demanding that testing programs provide more detailed feedback regarding test performance. As a result, there has been a growing interest in the reporting of subscores that potentially provide such detailed feedback. Haberman developed a method based on classical test theory for determining whether a subscore has added value over the total score. Sinharay conducted a detailed study using both real and simulated data and concluded that it is not common for subscores to have added value according to Haberman’s criterion. However, Sinharay almost exclusively dealt with data from tests with only dichotomous items. In this article, we show that it is more common for subscores to have added value in tests with polytomous items.
{"title":"Added Value of Subscores for Tests With Polytomous Items","authors":"Kylie Gorney, Sandip Sinharay","doi":"10.1177/00131644241268128","DOIUrl":"https://doi.org/10.1177/00131644241268128","url":null,"abstract":"Test-takers, policymakers, teachers, and institutions are increasingly demanding that testing programs provide more detailed feedback regarding test performance. As a result, there has been a growing interest in the reporting of subscores that potentially provide such detailed feedback. Haberman developed a method based on classical test theory for determining whether a subscore has added value over the total score. Sinharay conducted a detailed study using both real and simulated data and concluded that it is not common for subscores to have added value according to Haberman’s criterion. However, Sinharay almost exclusively dealt with data from tests with only dichotomous items. In this article, we show that it is more common for subscores to have added value in tests with polytomous items.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"3 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-25DOI: 10.1177/00131644241262964
Yongtian Cheng, K. V. Petrides
Psychologists are emphasizing the importance of predictive conclusions. Machine learning methods, such as supervised neural networks, have been used in psychological studies as they naturally fit prediction tasks. However, we are concerned about whether neural networks fitted with random datasets (i.e., datasets where there is no relationship between ordinal independent variables and continuous or binary-dependent variables) can provide an acceptable level of predictive performance from a psychologist’s perspective. Through a Monte Carlo simulation study, we found that this kind of erroneous conclusion is not likely to be drawn as long as the sample size is larger than 50 with continuous-dependent variables. However, when the dependent variable is binary, the minimum sample size is 500 when the criteria are balanced accuracy ≥ .6 or balanced accuracy ≥ .65, and the minimum sample size is 200 when the criterion is balanced accuracy ≥ .7 for a decision error less than .05. In the case where area under the curve (AUC) is used as a metric, a sample size of 100, 200, and 500 is necessary when the minimum acceptable performance level is set at AUC ≥ .7, AUC ≥ .65, and AUC ≥ .6, respectively. The results found by this study can be used for sample size planning for psychologists who wish to apply neural networks for a qualitatively reliable conclusion. Further directions and limitations of the study are also discussed.
{"title":"Evaluating The Predictive Reliability of Neural Networks in Psychological Research With Random Datasets","authors":"Yongtian Cheng, K. V. Petrides","doi":"10.1177/00131644241262964","DOIUrl":"https://doi.org/10.1177/00131644241262964","url":null,"abstract":"Psychologists are emphasizing the importance of predictive conclusions. Machine learning methods, such as supervised neural networks, have been used in psychological studies as they naturally fit prediction tasks. However, we are concerned about whether neural networks fitted with random datasets (i.e., datasets where there is no relationship between ordinal independent variables and continuous or binary-dependent variables) can provide an acceptable level of predictive performance from a psychologist’s perspective. Through a Monte Carlo simulation study, we found that this kind of erroneous conclusion is not likely to be drawn as long as the sample size is larger than 50 with continuous-dependent variables. However, when the dependent variable is binary, the minimum sample size is 500 when the criteria are balanced accuracy ≥ .6 or balanced accuracy ≥ .65, and the minimum sample size is 200 when the criterion is balanced accuracy ≥ .7 for a decision error less than .05. In the case where area under the curve (AUC) is used as a metric, a sample size of 100, 200, and 500 is necessary when the minimum acceptable performance level is set at AUC ≥ .7, AUC ≥ .65, and AUC ≥ .6, respectively. The results found by this study can be used for sample size planning for psychologists who wish to apply neural networks for a qualitatively reliable conclusion. Further directions and limitations of the study are also discussed.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"39 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141772234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-06-24DOI: 10.1177/00131644241256626
Tenko Raykov
A latent variable modeling procedure for studying factorial invariance and differential item functioning for multi-component measuring instruments with nominal items is discussed. The method is based on a multiple testing approach utilizing the false discovery rate concept and likelihood ratio tests. The procedure complements the Revuelta, Franco-Martinez, and Ximenez approach to factorial invariance examination, and permits localization of individual invariance violations. The outlined method does not require the selection of a reference observed variable and is illustrated with empirical data.
{"title":"Studying Factorial Invariance With Nominal Items: A Note on a Latent Variable Modeling Procedure","authors":"Tenko Raykov","doi":"10.1177/00131644241256626","DOIUrl":"https://doi.org/10.1177/00131644241256626","url":null,"abstract":"A latent variable modeling procedure for studying factorial invariance and differential item functioning for multi-component measuring instruments with nominal items is discussed. The method is based on a multiple testing approach utilizing the false discovery rate concept and likelihood ratio tests. The procedure complements the Revuelta, Franco-Martinez, and Ximenez approach to factorial invariance examination, and permits localization of individual invariance violations. The outlined method does not require the selection of a reference observed variable and is illustrated with empirical data.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"33 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141501613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-06-24DOI: 10.1177/00131644241259026
Tenko Raykov, Martin Pusic
A procedure is outlined for point and interval estimation of location parameters associated with polytomous items, or raters assessing studied subjects or cases, which follow the rating scale model. The method is developed within the framework of latent variable modeling, and is readily applied in empirical research using popular software. The approach permits testing the goodness of fit of this widely used model, which represents a rather parsimonious item response theory model as a means of description and explanation of an analyzed data set. The procedure allows examination of important aspects of the functioning of measuring instruments with polytomous ordinal items, which may also constitute person assessments furnished by teachers, counselors, judges, raters, or clinicians. The described method is illustrated using an empirical example.
{"title":"A Note on Evaluation of Polytomous Item Locations With the Rating Scale Model and Testing Its Fit","authors":"Tenko Raykov, Martin Pusic","doi":"10.1177/00131644241259026","DOIUrl":"https://doi.org/10.1177/00131644241259026","url":null,"abstract":"A procedure is outlined for point and interval estimation of location parameters associated with polytomous items, or raters assessing studied subjects or cases, which follow the rating scale model. The method is developed within the framework of latent variable modeling, and is readily applied in empirical research using popular software. The approach permits testing the goodness of fit of this widely used model, which represents a rather parsimonious item response theory model as a means of description and explanation of an analyzed data set. The procedure allows examination of important aspects of the functioning of measuring instruments with polytomous ordinal items, which may also constitute person assessments furnished by teachers, counselors, judges, raters, or clinicians. The described method is illustrated using an empirical example.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"18 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141501614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-30DOI: 10.1177/00131644241255109
Sanaz Nazari, Walter L. Leite, A. Corinne Huggins-Manley
Social desirability bias (SDB) is a common threat to the validity of conclusions from responses to a scale or survey. There is a wide range of person-fit statistics in the literature that can be employed to detect SDB. In addition, machine learning classifiers, such as logistic regression and random forest, have the potential to distinguish between biased and unbiased responses. This study proposes a new application of these classifiers to detect SDB by considering several person-fit indices as features or predictors in the machine learning methods. The results of a Monte Carlo simulation study showed that for a single feature, applying person-fit indices directly and logistic regression led to similar classification results. However, the random forest classifier improved the classification of biased and unbiased responses substantially. Classification was improved in both logistic regression and random forest by considering multiple features simultaneously. Moreover, cross-validation indicated stable area under the curves (AUCs) across machine learning classifiers. A didactical illustration of applying random forest to detect SDB is presented.
{"title":"Enhancing the Detection of Social Desirability Bias Using Machine Learning: A Novel Application of Person-Fit Indices","authors":"Sanaz Nazari, Walter L. Leite, A. Corinne Huggins-Manley","doi":"10.1177/00131644241255109","DOIUrl":"https://doi.org/10.1177/00131644241255109","url":null,"abstract":"Social desirability bias (SDB) is a common threat to the validity of conclusions from responses to a scale or survey. There is a wide range of person-fit statistics in the literature that can be employed to detect SDB. In addition, machine learning classifiers, such as logistic regression and random forest, have the potential to distinguish between biased and unbiased responses. This study proposes a new application of these classifiers to detect SDB by considering several person-fit indices as features or predictors in the machine learning methods. The results of a Monte Carlo simulation study showed that for a single feature, applying person-fit indices directly and logistic regression led to similar classification results. However, the random forest classifier improved the classification of biased and unbiased responses substantially. Classification was improved in both logistic regression and random forest by considering multiple features simultaneously. Moreover, cross-validation indicated stable area under the curves (AUCs) across machine learning classifiers. A didactical illustration of applying random forest to detect SDB is presented.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"2018 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141188132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-28DOI: 10.1177/00131644241246749
Joseph A. Rios, Jiayi Deng
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e., RG that is linearly related to examinee ability). Specifically, EM scoring is compared with the Holman–Glas (HG) method, a multidimensional scoring approach, in terms of model fit distortion, ability parameter recovery, and omega reliability distortion. Test difficulty, the proportion of RG present within a sample, and the strength of association between ability and RG propensity were manipulated to create 80 total conditions. Overall, the results showed that EM scoring provided improved model fit compared with HG scoring when RG comprised 12% or less of all item responses. Furthermore, no significant differences in ability parameter recovery and omega reliability distortion were noted when comparing these two scoring approaches under moderate degrees of RG multidimensionality. These limited differences were largely due to the limited impact of RG on aggregated ability (bias ranged from 0.00 to 0.05 logits) and reliability (distortion was ≤ .005 units) estimates when as much as 40% of item responses in the sample data reflected RG behavior.
{"title":"Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?","authors":"Joseph A. Rios, Jiayi Deng","doi":"10.1177/00131644241246749","DOIUrl":"https://doi.org/10.1177/00131644241246749","url":null,"abstract":"To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e., RG that is linearly related to examinee ability). Specifically, EM scoring is compared with the Holman–Glas (HG) method, a multidimensional scoring approach, in terms of model fit distortion, ability parameter recovery, and omega reliability distortion. Test difficulty, the proportion of RG present within a sample, and the strength of association between ability and RG propensity were manipulated to create 80 total conditions. Overall, the results showed that EM scoring provided improved model fit compared with HG scoring when RG comprised 12% or less of all item responses. Furthermore, no significant differences in ability parameter recovery and omega reliability distortion were noted when comparing these two scoring approaches under moderate degrees of RG multidimensionality. These limited differences were largely due to the limited impact of RG on aggregated ability (bias ranged from 0.00 to 0.05 logits) and reliability (distortion was ≤ .005 units) estimates when as much as 40% of item responses in the sample data reflected RG behavior.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"11 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140810537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-17DOI: 10.1177/00131644241240435
Hyunjung Lee, Heining Cham
Determining the number of factors in exploratory factor analysis (EFA) is crucial because it affects the rest of the analysis and the conclusions of the study. Researchers have developed various methods for deciding the number of factors to retain in EFA, but this remains one of the most difficult decisions in the EFA. The purpose of this study is to compare the parallel analysis with the performance of fit indices that researchers have started using as another strategy for determining the optimal number of factors in EFA. The Monte Carlo simulation was conducted with ordered categorical items because there are mixed results in previous simulation studies, and ordered categorical items are common in behavioral science. The results of this study indicate that the parallel analysis and the root mean square error of approximation (RMSEA) performed well in most conditions, followed by the Tucker–Lewis index (TLI) and then by the comparative fit index (CFI). The robust corrections of CFI, TLI, and RMSEA performed better in detecting misfit underfactored models than the original fit indices. However, they did not produce satisfactory results in dichotomous data with a small sample size. Implications, limitations of this study, and future research directions are discussed.
{"title":"Comparing Accuracy of Parallel Analysis and Fit Statistics for Estimating the Number of Factors With Ordered Categorical Data in Exploratory Factor Analysis","authors":"Hyunjung Lee, Heining Cham","doi":"10.1177/00131644241240435","DOIUrl":"https://doi.org/10.1177/00131644241240435","url":null,"abstract":"Determining the number of factors in exploratory factor analysis (EFA) is crucial because it affects the rest of the analysis and the conclusions of the study. Researchers have developed various methods for deciding the number of factors to retain in EFA, but this remains one of the most difficult decisions in the EFA. The purpose of this study is to compare the parallel analysis with the performance of fit indices that researchers have started using as another strategy for determining the optimal number of factors in EFA. The Monte Carlo simulation was conducted with ordered categorical items because there are mixed results in previous simulation studies, and ordered categorical items are common in behavioral science. The results of this study indicate that the parallel analysis and the root mean square error of approximation (RMSEA) performed well in most conditions, followed by the Tucker–Lewis index (TLI) and then by the comparative fit index (CFI). The robust corrections of CFI, TLI, and RMSEA performed better in detecting misfit underfactored models than the original fit indices. However, they did not produce satisfactory results in dichotomous data with a small sample size. Implications, limitations of this study, and future research directions are discussed.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"1 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140616292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-17DOI: 10.1177/00131644241242789
Hung-Yu Huang
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent experience and methodological considerations. Response styles, which are frequently observed in self-reported data, reflect a propensity to answer questionnaire items in a consistent manner, regardless of the item content. These response styles have been identified as causes of skewed scale scores and biased trait inferences. In this study, we investigate the impact of response styles on individuals’ responses within a continuous scale context, with a specific emphasis on extreme response style (ERS) and acquiescence response style (ARS). Building upon the established continuous response model (CRM), we propose extensions known as the CRM-ERS and CRM-ARS. These extensions are employed to quantitatively capture individual variations in these distinct response styles. The effectiveness of the proposed models was evaluated through a series of simulation studies. Bayesian methods were employed to effectively calibrate the model parameters. The results demonstrate that both models achieve satisfactory parameter recovery. Neglecting the effects of response styles led to biased estimation, underscoring the importance of accounting for these effects. Moreover, the estimation accuracy improved with increasing test length and sample size. An empirical analysis is presented to elucidate the practical applications and implications of the proposed models.
{"title":"Exploring the Influence of Response Styles on Continuous Scale Assessments: Insights From a Novel Modeling Approach","authors":"Hung-Yu Huang","doi":"10.1177/00131644241242789","DOIUrl":"https://doi.org/10.1177/00131644241242789","url":null,"abstract":"The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent experience and methodological considerations. Response styles, which are frequently observed in self-reported data, reflect a propensity to answer questionnaire items in a consistent manner, regardless of the item content. These response styles have been identified as causes of skewed scale scores and biased trait inferences. In this study, we investigate the impact of response styles on individuals’ responses within a continuous scale context, with a specific emphasis on extreme response style (ERS) and acquiescence response style (ARS). Building upon the established continuous response model (CRM), we propose extensions known as the CRM-ERS and CRM-ARS. These extensions are employed to quantitatively capture individual variations in these distinct response styles. The effectiveness of the proposed models was evaluated through a series of simulation studies. Bayesian methods were employed to effectively calibrate the model parameters. The results demonstrate that both models achieve satisfactory parameter recovery. Neglecting the effects of response styles led to biased estimation, underscoring the importance of accounting for these effects. Moreover, the estimation accuracy improved with increasing test length and sample size. An empirical analysis is presented to elucidate the practical applications and implications of the proposed models.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"35 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140617770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-13DOI: 10.1177/00131644241242806
Kuan-Yu Jin, Thomas Eckes
Insufficient effort responding (IER) refers to a lack of effort when answering survey or questionnaire items. Such items typically offer more than two ordered response categories, with Likert-type scales as the most prominent example. The underlying assumption is that the successive categories reflect increasing levels of the latent variable assessed. This research investigates how IER affects the intended category order of Likert-type scales, focusing on the category thresholds in the polytomous Rasch model. In a simulation study, we examined several IER patterns in datasets generated from the mixture model for IER (MMIER). The key findings were (a) random responding and overusing the non-extreme categories of a five-category scale were each associated with high frequencies of disordered category thresholds; (b) raising the IER rate from 5% to 10% led to a substantial increase in threshold disordering, particularly among easy and difficult items; (c) narrow distances between adjacent categories (0.5 logits) were associated with more frequent disordering, compared with wide distances (1.0 logits). Two real-data examples highlighted the efficiency and utility of the MMIER for detecting latent classes of respondents exhibiting different forms of IER. Under the MMIER, the frequency of disordered thresholds was reduced substantially in both examples. The discussion focuses on the practical implications of using the MMIER in survey research and points to directions for future research.
{"title":"The Impact of Insufficient Effort Responses on the Order of Category Thresholds in the Polytomous Rasch Model","authors":"Kuan-Yu Jin, Thomas Eckes","doi":"10.1177/00131644241242806","DOIUrl":"https://doi.org/10.1177/00131644241242806","url":null,"abstract":"Insufficient effort responding (IER) refers to a lack of effort when answering survey or questionnaire items. Such items typically offer more than two ordered response categories, with Likert-type scales as the most prominent example. The underlying assumption is that the successive categories reflect increasing levels of the latent variable assessed. This research investigates how IER affects the intended category order of Likert-type scales, focusing on the category thresholds in the polytomous Rasch model. In a simulation study, we examined several IER patterns in datasets generated from the mixture model for IER (MMIER). The key findings were (a) random responding and overusing the non-extreme categories of a five-category scale were each associated with high frequencies of disordered category thresholds; (b) raising the IER rate from 5% to 10% led to a substantial increase in threshold disordering, particularly among easy and difficult items; (c) narrow distances between adjacent categories (0.5 logits) were associated with more frequent disordering, compared with wide distances (1.0 logits). Two real-data examples highlighted the efficiency and utility of the MMIER for detecting latent classes of respondents exhibiting different forms of IER. Under the MMIER, the frequency of disordered thresholds was reduced substantially in both examples. The discussion focuses on the practical implications of using the MMIER in survey research and points to directions for future research.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"116 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140581547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}