Educational and Psychological Measurement: Latest Articles

Linear and Nonlinear Indices of Score Accuracy and Item Effectiveness for Measures That Contain Locally Dependent Items
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-13 DOI: 10.1177/00131644241257602
P. J. Ferrando, D. Navarro-González, F. Morales-Vives
The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the scores based on these extended solutions have received little attention so far. Here, we propose an approach to simple sum scores, designed to assess the impact of LIDs on the accuracy and effectiveness of the scores derived from extended FA solutions with correlated residuals. The proposal is structured at three levels—(a) total score, (b) bivariate-doublet, and (c) item-by-item deletion—and considers two types of FA models: the standard linear model and the nonlinear model for ordered-categorical item responses. The current proposal is implemented in SINRELEF.LD, an R package available through CRAN. The usefulness of the proposal for item analysis is illustrated with the data of 928 participants who completed the Family Involvement Questionnaire-High School Version (FIQ-HS). The results show not only the distortion that the doublets cause in the omega reliability estimate when local independency is assumed but also the loss of information/efficiency due to the local dependencies.
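As a rough illustration of the distortion this abstract describes, the sketch below compares omega for a unit-weighted sum score with and without the residual covariance of one doublet. It assumes standardized items and a single common factor in the linear FA model; the loadings, the residual covariance, and the function name `omega_total` are hypothetical. It also holds the loadings fixed across both computations, which is a simplification, and it is not the SINRELEF.LD implementation.

```python
def omega_total(loadings, residual_covs=None):
    """Omega for a unit-weighted sum score of standardized items.

    loadings: factor loadings on one common factor.
    residual_covs: dict mapping item-index pairs (j, k) to residual covariances.
    """
    residual_covs = residual_covs or {}
    true_var = sum(loadings) ** 2
    # Residual variance of a standardized item is 1 - loading^2.
    uniq_var = sum(1.0 - l ** 2 for l in loadings)
    # Each doublet adds twice its residual covariance to the sum-score variance.
    doublet_var = 2.0 * sum(residual_covs.values())
    return true_var / (true_var + uniq_var + doublet_var)


loadings = [0.7, 0.7, 0.6, 0.6, 0.5]   # hypothetical loadings
doublet = {(0, 1): 0.20}               # hypothetical residual covariance

omega_naive = omega_total(loadings)         # local independence assumed
omega_ld = omega_total(loadings, doublet)   # doublet taken into account
print(f"{omega_naive:.3f} vs {omega_ld:.3f}")
```

With a positive residual covariance, the value that ignores the doublet is the larger of the two, which is the direction of distortion one would expect when local independence is wrongly assumed.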
Citations: 0
Why Forced-Choice and Likert Items Provide the Same Information on Personality, Including Social Desirability.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-12 DOI: 10.1177/00131644231178721
Martin Bäckström, Fredrik Björklund

The forced-choice response format is often considered superior to the standard Likert-type format for controlling social desirability in personality inventories. We performed simulations and found that the trait information based on the two formats converges when the number of items is high and forced-choice items are mixed with regard to positively and negatively keyed items. Given that forced-choice items extract the same personality information as Likert-type items do, including socially desirable responding, other means are needed to counteract social desirability. We propose using evaluatively neutralized items in personality measurement, as they can counteract social desirability regardless of response format.

Citations: 0
Using Multiple Imputation to Account for the Uncertainty Due to Missing Data in the Context of Factor Retention.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-12 DOI: 10.1177/00131644231178800
Yan Xia, Selim Havan

Although parallel analysis has been found to be an accurate method for determining the number of factors in many conditions with complete data, its application under missing data is limited. The existing literature recommends that, after using an appropriate multiple imputation method, researchers either apply parallel analysis to every imputed data set and use the number of factors suggested by most of the data copies or average the correlation matrices across all data copies, followed by applying the parallel analysis to the average correlation matrix. Both approaches for pooling the results provide a single suggested number without reflecting the uncertainty introduced by missing values. The present study proposes the use of an alternative approach, which calculates the proportion of imputed data sets that result in k (k = 1, 2, 3, ...) factors. This approach will inform applied researchers of the degree of uncertainty due to the missingness. Results from a simulation experiment show that the proposed method is more likely to suggest the correct number of factors when missingness contributes to a large amount of uncertainty.
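The pooling step described in this abstract can be sketched in a few lines. The example below assumes that parallel analysis has already been run on each imputed data set and that only the suggested number of factors per data set is available; the function name `factor_count_proportions` and the input values are hypothetical, and the imputation and parallel-analysis steps themselves are not shown.

```python
from collections import Counter


def factor_count_proportions(suggested):
    """Proportion of imputed data sets that suggest each number of factors."""
    counts = Counter(suggested)
    m = len(suggested)
    return {k: counts[k] / m for k in sorted(counts)}


# Hypothetical parallel-analysis results from 10 imputed data sets.
suggested = [2, 2, 3, 2, 2, 3, 2, 1, 2, 3]
proportions = factor_count_proportions(suggested)
print(proportions)
```

Reporting the whole distribution rather than only its mode is what lets a reader see how much the factor-retention decision depends on the missing values.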

Citations: 0
Evaluating Equating Methods for Varying Levels of Form Difference.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-08 DOI: 10.1177/00131644231176989
Ting Sun, Stella Yun Kim

Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude of a form difficulty difference on equating results under random group (RG) and common-item nonequivalent group (CINEG) designs. Specifically, this study evaluates the performance of six equating methods under a set of simulation conditions including varying levels of form difference. Results revealed that, under the RG design, mean equating proved to be the most accurate method when there is no or small form difference, whereas equipercentile is the most accurate method when the difficulty difference is medium or large. Under the CINEG design, Tucker Linear was found to be the most accurate method when the difficulty difference is medium or small, and either chained equipercentile or frequency estimation is preferred when the difficulty difference is large. This study would provide practitioners with evidence-based guidance in the choice of equating methods with varying levels of form difference. As the condition of no form difficulty difference is also included, this study would inform testing companies of appropriate equating methods when two forms are similar in difficulty level.
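To make the simplest of these methods concrete, the sketch below shows mean equating under a random groups design, where a Form X score is shifted by the difference between the two form means. Linear equating is included only for contrast and is not claimed to be one of the six methods the study evaluates. The function names and score values are hypothetical.

```python
from statistics import mean, pstdev


def mean_equate(x, form_x, form_y):
    """Mean equating: shift a Form X score by the difference in form means."""
    return x - mean(form_x) + mean(form_y)


def linear_equate(x, form_x, form_y):
    """Linear equating: match both the mean and the standard deviation."""
    slope = pstdev(form_y) / pstdev(form_x)
    return slope * (x - mean(form_x)) + mean(form_y)


# Hypothetical scores from two randomly equivalent groups.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

print(mean_equate(14, form_x, form_y))
print(linear_equate(18, form_x, form_y))
```

Mean equating estimates only one quantity (the mean difference), which is one plausible reason it can do well when the forms barely differ, while methods that adjust more of the score distribution have more to gain when the difference is large.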

Citations: 0
Can People With Higher Versus Lower Scores on Impression Management or Self-Monitoring Be Identified Through Different Traces Under Faking?
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-07-02 DOI: 10.1177/00131644231182598
Jessica Röhner, Philipp Thoss, Liad Uziel

According to faking models, personality variables and faking are related. Most prominently, people's tendency to try to make an appropriate impression (impression management; IM) and their tendency to adjust the impression they make (self-monitoring; SM) have been suggested to be associated with faking. Nevertheless, empirical findings connecting these personality variables to faking have been contradictory, partly because different studies have given individuals different tests to fake and different faking directions (to fake low vs. high scores). Importantly, whereas past research has focused on faking by examining test scores, recent advances have suggested that the faking process could be better understood by analyzing individuals' responses at the item level (response pattern). Using machine learning (elastic net and random forest regression), we reanalyzed a data set (N = 260) to investigate whether individuals' faked response patterns on extraversion (features; i.e., input variables) could reveal their IM and SM scores. We found that individuals had similar response patterns when they faked, irrespective of their IM scores (excluding the faking of high scores when random forest regression was used). Elastic net and random forest regression converged in revealing that individuals higher on SM differed from individuals lower on SM in how they faked. Thus, response patterns were able to reveal individuals' SM, but not IM. Feature importance analyses showed that whereas some items were faked differently by individuals with higher versus lower SM scores, others were faked similarly. Our results imply that analyses of response patterns offer valuable new insights into the faking process.

Citations: 0
An Item Response Theory Model for Incorporating Response Times in Forced-Choice Measures.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-04 DOI: 10.1177/00131644231171193
Zhichen Guo, Daxun Wang, Yan Cai, Dongbo Tu

Forced-choice (FC) measures, which employ comparative rather than absolute judgments, have been widely used in many personality or attitude tests as an alternative to rating scales. With FC measures, several response biases, such as social desirability, response styles, and acquiescence bias, can be reduced effectively. Another type of data linked with comparative judgments is response time (RT), which contains potential information concerning respondents' decision-making process. It would be challenging but exciting to incorporate RT into FC measures to better reveal respondents' behaviors or preferences in personality measurement. Given this situation, this study aims to propose a new item response theory (IRT) model that incorporates RT into FC measures to improve personality assessment. Simulation studies show that the proposed model can effectively improve the estimation accuracy of personality traits with the ancillary information contained in RT. Also, an application to a real data set reveals that the proposed model yields parameter estimates that are similar to, but not the same as, those of the conventional Thurstonian IRT model. The RT information can explain these differences.

Citations: 0
Measuring Unipolar Traits With Continuous Response Items: Some Methodological and Substantive Developments.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-26 DOI: 10.1177/00131644231181889
Pere J Ferrando, Fabia Morales-Vives, Ana Hernández-Dorado

In recent years, some models for binary and graded format responses have been proposed to assess unipolar variables or "quasi-traits." These studies have mainly focused on clinical variables that have traditionally been treated as bipolar traits. In the present study, we have made a proposal for unipolar traits measured with continuous response items. The proposed log-logistic continuous unipolar model (LL-C) is remarkably simple and is more similar to the original binary formulation than the graded extensions, which is an advantage. Furthermore, considering that irrational, extreme, or polarizing beliefs could be another domain of unipolar variables, we have applied this proposal to an empirical example of superstitious beliefs. The results suggest that, in certain cases, the standard linear model can be a good approximation to the LL-C model in terms of parameter estimation and goodness of fit, but not trait estimates and their accuracy. The results also show the importance of considering the unipolar nature of this kind of trait when predicting criterion variables, since the validity results were clearly different.

Citations: 0
Wald χ² Test for Differential Item Functioning Detection with Polytomous Items in Multilevel Data.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-07-11 DOI: 10.1177/00131644231181688
Sijia Huang, Dubravka Svetina Valdivia

Identifying items with differential item functioning (DIF) in an assessment is a crucial step for achieving equitable measurement. One critical issue that has not been fully addressed by existing studies is how DIF items can be detected when data are multilevel. In the present study, we introduced a Lord's Wald χ² test-based procedure for detecting both uniform and non-uniform DIF with polytomous items in the presence of the ubiquitous multilevel data structure. The proposed approach is a multilevel extension of a two-stage procedure, which identifies anchor items in its first stage and formally evaluates candidate items in the second stage. We applied the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm to estimate multilevel polytomous item response theory (IRT) models and to obtain accurate covariance matrices. To evaluate the performance of the proposed approach, we conducted a preliminary simulation study that considered various conditions to mimic real-world scenarios. The simulation results indicated that the proposed approach has high power for identifying DIF items and controls the Type I error rate well. Limitations and future research directions were also discussed.

Citations: 0
An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models.
IF 2.7 Zone 3 Psychology Q1 Social Sciences Pub Date: 2024-06-01 Epub Date: 2023-06-26 DOI: 10.1177/00131644231180529
Sedat Sen, Allan S Cohen

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's information criterion (DIC), sample size adjusted BIC (SABIC), relative entropy, the integrated classification likelihood criterion (ICL-BIC), the adjusted Lo-Mendell-Rubin (LMR), and Vuong-Lo-Mendell-Rubin (VLMR). The accuracy of the fit indices was assessed for correct detection of the number of latent classes for different simulation conditions including sample size (2,500 and 5,000), test length (15, 30, and 45), mixture proportions (equal and unequal), number of latent classes (2, 3, and 4), and latent class separation (no-separation and small separation). Simulation study results indicated that as the number of examinees or number of items increased, correct identification rates also increased for most of the indices. Correct identification rates by the different fit indices, however, decreased as the number of estimated latent classes or parameters (i.e., model complexity) increased. Results were good for BIC, CAIC, DIC, SABIC, ICL-BIC, LMR, and VLMR, and the relative entropy index tended to select correct models most of the time. Consistent with previous studies, AIC and AICc showed poor performance. Most of these indices had limited utility for three-class and four-class mixture 3PL model conditions.
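
Several of the likelihood-based indices named in this abstract have standard closed-form definitions. The sketch below computes AIC, AICc, BIC, CAIC, and SABIC from a model's maximized log-likelihood and picks the candidate with the smallest value on each index. It is an illustration only, not the authors' simulation code: the function names and the example log-likelihoods and parameter counts are hypothetical, and DIC, relative entropy, ICL-BIC, LMR, and VLMR are left out because they need more than the log-likelihood.

```python
import math


def information_criteria(log_lik, k, n):
    """Return common information criteria for one fitted model.

    log_lik: maximized log-likelihood; k: number of free parameters;
    n: sample size (number of examinees). Smaller values are preferred.
    """
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2 * k,
        "AICc": deviance + 2 * k + (2 * k * (k + 1)) / (n - k - 1),
        "BIC": deviance + k * math.log(n),
        "CAIC": deviance + k * (math.log(n) + 1),
        "SABIC": deviance + k * math.log((n + 2) / 24),
    }


def select_by_index(candidates, n):
    """candidates maps a label to (log_lik, k); return the preferred label per index."""
    table = {label: information_criteria(ll, k, n) for label, (ll, k) in candidates.items()}
    indices = next(iter(table.values())).keys()
    return {idx: min(table, key=lambda label: table[label][idx]) for idx in indices}


# Hypothetical fits (made-up numbers) for one-, two-, and three-class models.
fits = {"1-class": (-21500.0, 30), "2-class": (-21380.0, 61), "3-class": (-21360.0, 92)}
chosen = select_by_index(fits, n=2500)
print(chosen)
```

Because the penalty terms differ, the indices need not agree: with these made-up numbers AIC prefers the two-class model while BIC, which penalizes each parameter by ln(n), prefers the one-class model.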

Citations: 0
Enhancing the Detection of Social Desirability Bias Using Machine Learning: A Novel Application of Person-Fit Indices
IF 2.7 Zone 3 Psychology Q1 Social Sciences Pub Date: 2024-05-30 DOI: 10.1177/00131644241255109
Sanaz Nazari, Walter L. Leite, A. Corinne Huggins-Manley
Social desirability bias (SDB) is a common threat to the validity of conclusions from responses to a scale or survey. There is a wide range of person-fit statistics in the literature that can be employed to detect SDB. In addition, machine learning classifiers, such as logistic regression and random forest, have the potential to distinguish between biased and unbiased responses. This study proposes a new application of these classifiers to detect SDB by considering several person-fit indices as features or predictors in the machine learning methods. The results of a Monte Carlo simulation study showed that for a single feature, applying person-fit indices directly and logistic regression led to similar classification results. However, the random forest classifier improved the classification of biased and unbiased responses substantially. Classification was improved in both logistic regression and random forest by considering multiple features simultaneously. Moreover, cross-validation indicated stable areas under the curve (AUCs) across machine learning classifiers. A didactic illustration of applying random forest to detect SDB is presented.
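
The AUC used in this abstract to compare classifiers can be computed directly from scores and labels. The sketch below is illustrative only: it treats a single person-fit index as the score (the "applying person-fit indices directly" case) and uses the rank-based definition of AUC, that is, the probability that a randomly chosen biased respondent scores higher than a randomly chosen unbiased one, with ties counting one half. The function name, the data values, and the assumption that higher scores indicate bias are all hypothetical, and this is not the authors' procedure.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up person-fit scores; label 1 = biased response pattern, 0 = unbiased.
fit_scores = [2.1, 1.7, 0.4, 0.8, 0.3, 0.9]
is_biased = [1, 1, 0, 1, 0, 0]
print(auc(fit_scores, is_biased))
```

The same function applies to predicted probabilities from a fitted classifier, which is how a single-index score and a multi-feature model can be compared on one scale.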
Citations: 0