Educational and Psychological Measurement: Latest Articles

Procedures for Analyzing Multidimensional Mixture Data.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2023-02-16 DOI: 10.1177/00131644231151470
Hsu-Lin Su, Po-Hsi Chen

The multidimensional mixture data structure exists in many test (or inventory) conditions. Heterogeneity also relatively exists in populations. Still, some researchers are interested in deciding to which subpopulation a participant belongs according to the participant's factor pattern. Thus, in this study, we proposed three analysis procedures based on the factor mixture model to analyze data in the multidimensional mixture context. Simulations were manipulated with different levels of factor numbers, factor correlations, numbers of latent classes, and class separation. Issues with regard to model selection were discussed at first. The results showed that in the two-class situations the procedures of "factor structure first then class number" (Procedure 1) and "factor structure and class number considered simultaneously" (Procedure 3) performed better than the "class number first then factor structure" (Procedure 2) and yielded precise parameter estimation and classification accuracy. It would be appropriate to choose Procedures 1 and 3 when strong measurement invariance is assumed while using an information criterion, but Procedure 1 saved more time than Procedure 3. In the three-class situations, the performance of all three procedures was limited. Implementations and suggestions have been addressed in this research.
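
The abstract refers to model selection with an information criterion but does not say which one. As a rough illustration only, the sketch below assumes the Bayesian information criterion (BIC = -2 log L + k ln n) and picks the candidate with the smallest value. The model labels, log-likelihoods, and parameter counts are made up for the example and are not results from the study.

```python
import math
from typing import Dict, Tuple


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; smaller values indicate the preferred model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(candidates: Dict[str, Tuple[float, int]], n_obs: int) -> str:
    """Return the label of the candidate with the lowest BIC.

    `candidates` maps a label to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda label: bic(*candidates[label], n_obs))


# Hypothetical fitted models (illustrative numbers only, not from the paper).
candidates = {
    "2 factors, 1 class": (-1050.0, 10),
    "2 factors, 2 classes": (-1000.0, 20),
}
best = select_by_bic(candidates, n_obs=500)
```

In a sequential procedure such a comparison would be run once per decision (factor structure, then class number); in a simultaneous procedure all factor-by-class combinations would go into one `candidates` dictionary, which is one plausible reason the simultaneous approach costs more time.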

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638979/pdf/
Citations: 0
An Explanatory Multidimensional Random Item Effects Rating Scale Model.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2022-12-13 DOI: 10.1177/00131644221140906
Sijia Huang, Jinwen Jevan Luo, Li Cai

Random item effects item response theory (IRT) models, which treat both person and item effects as random, have received much attention for more than a decade. The random item effects approach has several advantages in many practical settings. The present study introduced an explanatory multidimensional random item effects rating scale model. The proposed model was formulated under a novel parameterization of the nominal response model (NRM), and allows for flexible inclusion of person-related and item-related covariates (e.g., person characteristics and item features) to study their impacts on the person and item latent variables. A new variant of the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm designed for latent variable models with crossed random effects was applied to obtain parameter estimates for the proposed model. A preliminary simulation study was conducted to evaluate the performance of the MH-RM algorithm for estimating the proposed model. Results indicated that the model parameters were well recovered. An empirical data set was analyzed to further illustrate the usage of the proposed model.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638980/pdf/
Citations: 0
Functional Approaches for Modeling Unfolding Data.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2023-01-05 DOI: 10.1177/00131644221143474
George Engelhard

The purpose of this study is to introduce a functional approach for modeling unfolding response data. Functional data analysis (FDA) has been used for examining cumulative item response data, but a functional approach has not been systematically used with unfolding response processes. A brief overview of FDA is presented and illustrated within the context of unfolding data. Seven decision parameters are described that can provide a guide to conducting FDA in this context. These decision parameters are illustrated with real data using two scales that are designed to measure attitude toward capital punishment and attitude toward censorship. The analyses suggest that FDA offers a useful set of tools for examining unfolding response processes.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638986/pdf/
Citations: 0
Evaluating the Effects of Missing Data Handling Methods on Scale Linking Accuracy.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2022-12-09 DOI: 10.1177/00131644221140941
Tong Wu, Stella Y Kim, Carl Westine

For large-scale assessments, data are often collected with missing responses. Despite the wide use of item response theory (IRT) in many testing programs, the existing literature offers little insight into the effectiveness of various approaches to handling missing responses in the context of scale linking. Scale linking is commonly used in large-scale assessments to maintain scale comparability over multiple forms of a test. Under a common-item nonequivalent group design (CINEG), missing data that occur on common items potentially influence the linking coefficients and, consequently, may affect scale comparability, test validity, and reliability. The objective of this study was to evaluate the effect of six missing data handling approaches, including listwise deletion (LWD), treating missing data as incorrect responses (IN), corrected item mean imputation (CM), imputing with a response function (RF), multiple imputation (MI), and full information maximum likelihood (FIML), on IRT scale linking accuracy when missing data occur on common items. Under a set of simulation conditions, the relative performance of the six missing data treatment methods under two missing data mechanisms was explored. Results showed that RF, MI, and FIML produced less error in scale linking, whereas LWD was associated with the most error regardless of testing conditions.
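
To make the two simplest treatments concrete, here is a minimal sketch of listwise deletion (LWD) and scoring missing responses as incorrect (IN) on a small dichotomous response matrix. The data layout (rows are examinees, `None` marks a missing response) and the function names are assumptions for illustration; the other four methods (CM, RF, MI, FIML) need an imputation or estimation model and are not shown.

```python
from typing import List, Optional

# 1 = correct, 0 = incorrect, None = missing (assumed coding for this sketch).
Response = Optional[int]


def listwise_delete(data: List[List[Response]]) -> List[List[Response]]:
    """LWD: drop every examinee who has at least one missing response."""
    return [row for row in data if all(x is not None for x in row)]


def score_missing_as_incorrect(data: List[List[Response]]) -> List[List[int]]:
    """IN: keep every examinee and recode each missing response as 0."""
    return [[0 if x is None else x for x in row] for row in data]


def item_p_values(data: List[List[int]]) -> List[float]:
    """Proportion correct per item (column) for a complete matrix."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


# Toy common-item responses for three examinees (made-up values).
responses = [
    [1, 0, 1],
    [1, None, 0],
    [0, 1, 1],
]
lwd = listwise_delete(responses)
scored = score_missing_as_incorrect(responses)
```

Even on this toy matrix the two treatments give different proportions correct for the second item (0.5 under LWD, 1/3 under IN), which hints at how the choice can propagate into item parameter estimates and then into linking coefficients.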

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638981/pdf/
Citations: 1
Why Do Regular and Reversed Items Load on Separate Factors? Response Difficulty vs. Item Extremity.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2023-01-02 DOI: 10.1177/00131644221143972
Chester Chun Seng Kam

When constructing measurement scales, regular and reversed items are often used (e.g., "I am satisfied with my job"/"I am not satisfied with my job"). Some methodologists recommend excluding reversed items because they are more difficult to understand and therefore engender a second, artificial factor distinct from the regular-item factor. The current study compares two explanations for why a construct's dimensionality may become distorted: response difficulty and item extremity. Two types of reversed items were created: negation items ("The conditions of my life are not good") and polar opposites ("The conditions of my life are bad"), with the former type having higher response difficulty. When extreme wording was used (e.g., "excellent/terrible" instead of "good/bad"), negation items did not load on a factor distinct from regular items, but polar opposites did. Results thus support item extremity over response difficulty as an explanation for dimensionality distortion. Given that scale developers seldom check for extremity, it is unsurprising that regular and polar opposite items often load on distinct factors.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638982/pdf/
Citations: 1
On Modeling Missing Data in Structural Investigations Based on Tetrachoric Correlations With Free and Fixed Factor Loadings.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2022-12-20 DOI: 10.1177/00131644221143145
Karl Schweizer, Andreas Gold, Dorothea Krampen

In modeling missing data, the missing data latent variable of the confirmatory factor model accounts for systematic variation associated with missing data so that replacement of what is missing is not required. This study aimed at extending the modeling missing data approach to tetrachoric correlations as input and at exploring the consequences of switching between models with free and fixed factor loadings. In a simulation study, confirmatory factor analysis (CFA) models with and without a missing data latent variable were used for investigating the structure of data with and without missing data. In addition, the numbers of columns of data sets with missing data and the amount of missing data were varied. The root mean square error of approximation (RMSEA) results revealed that an additional missing data latent variable recovered the degree-of-model fit characterizing complete data when tetrachoric correlations served as input while comparative fit index (CFI) results showed overestimation of this degree-of-model fit. Whereas the results for fixed factor loadings were in line with the assumptions of modeling missing data, the other results showed only partial agreement. Therefore, modeling missing data with fixed factor loadings is recommended.
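
For readers less familiar with the two fit indices named here, the following sketch computes RMSEA and CFI from chi-square statistics using their common textbook definitions. It assumes the N - 1 form of RMSEA (some software uses N instead) and is not taken from the paper; the input values in the example are invented.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation, N - 1 variant."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Comparative fit index relative to a baseline (independence) model."""
    misfit_model = max(chi2_model - df_model, 0.0)
    misfit_max = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    if misfit_max == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - misfit_model / misfit_max


# Invented example values.
example_rmsea = rmsea(chi2=50.0, df=25, n=101)
example_cfi = cfi(chi2_model=50.0, df_model=25, chi2_base=525.0, df_base=25)
```

Because CFI depends on the baseline model's chi-square while RMSEA does not, the two indices can react differently when missingness changes the input correlations, which is consistent with the diverging RMSEA and CFI results the abstract reports.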

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638985/pdf/
Citations: 0
A Note on Statistical Hypothesis Testing: Probabilifying Modus Tollens Invalidates Its Force? Not True!
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2023-01-13 DOI: 10.1177/00131644221145132
Keith F Widaman

The import or force of the result of a statistical test has long been portrayed as consistent with deductive reasoning. The simplest form of deductive argument has a first premise with conditional form, such as p → q, which means that "if p is true, then q must be true." Given the first premise, one can either affirm or deny the antecedent clause (p) or affirm or deny the consequent claim (q). This leads to four forms of deductive argument, two of which are valid forms of reasoning and two of which are invalid. The typical conclusion is that only a single form of argument (denying the consequent, also known as modus tollens) is a reasonable analog of decisions based on statistical hypothesis testing. Now, statistical evidence is never certain, but is associated with a probability (i.e., a p-level). Some have argued that modus tollens, when probabilified, loses its force and leads to ridiculous, nonsensical conclusions. Their argument is based on specious problem setup. This note is intended to correct this error and restore the position of modus tollens as a valid form of deductive inference in statistical matters, even when it is probabilified.
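
The four argument forms mentioned in the abstract can be written out as follows. This is standard propositional logic rather than notation taken from the paper; the turnstile marks a valid inference and the negated turnstile (from the `amssymb` package) marks an invalid one.

```latex
\begin{align*}
&\text{Affirming the antecedent (modus ponens), valid:}
  && p \to q,\ p \ \vdash\ q \\
&\text{Denying the antecedent, invalid:}
  && p \to q,\ \neg p \ \nvdash\ \neg q \\
&\text{Affirming the consequent, invalid:}
  && p \to q,\ q \ \nvdash\ p \\
&\text{Denying the consequent (modus tollens), valid:}
  && p \to q,\ \neg q \ \vdash\ \neg p
\end{align*}
```

In the hypothesis-testing analogy, p plays the role of the null hypothesis and q the role of the data pattern it predicts; observing that q fails is what licenses rejecting p.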

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638983/pdf/
Citations: 0
On the Utility of Indirect Methods for Detecting Faking
Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-11-13 DOI: 10.1177/00131644231209520
Philippe Goldammer, Peter Lucas Stöckli, Yannik Andrea Escher, Hubert Annen, Klaus Jonas
Indirect indices for faking detection in questionnaires make use of a respondent’s deviant or unlikely response pattern over the course of the questionnaire to identify them as a faker. Compared with established direct faking indices (i.e., lying and social desirability scales), indirect indices have at least two advantages: First, they cannot be detected by the test taker. Second, their usage does not require changes to the questionnaire. In the last decades, several such indirect indices have been proposed. However, at present, the researcher’s choice between different indirect faking detection indices is guided by relatively little information, especially if conceptually different indices are to be used together. Thus, we examined how well a representative selection of 12 conceptually different indirect indices performed, individually and jointly, compared with an established direct faking measure or validity scale. We found that, first, the score on the agreement factor of the Likert-type item response process tree model, the proportion of desirable scale endpoint responses, and the covariance index were the best-performing indirect indices. Second, using indirect indices in combination resulted in comparable and in some cases even better detection rates than when using direct faking measures. Third, some effective indirect indices were only minimally correlated with substantive scales and could therefore be used to partial out faking variance from response sets without losing substance. We, therefore, encourage researchers to use indirect indices instead of direct faking measures when they aim to detect faking in their data.
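
One of the indices named above, the proportion of desirable scale endpoint responses, is simple enough to sketch. The version below is a guess at a plausible operationalization, not the authors' definition: it assumes each item has a known socially desirable endpoint (for example 5 on a positively keyed item and 1 on a negatively keyed item of a 5-point scale) and counts how often the respondent chose exactly that endpoint.

```python
from typing import Sequence


def desirable_endpoint_proportion(
    responses: Sequence[int], desirable_endpoints: Sequence[int]
) -> float:
    """Share of items answered at the socially desirable scale endpoint.

    `desirable_endpoints[i]` is the endpoint assumed to be desirable for item i.
    """
    if not responses or len(responses) != len(desirable_endpoints):
        raise ValueError("need one desirable endpoint per (non-empty) response")
    hits = sum(r == d for r, d in zip(responses, desirable_endpoints))
    return hits / len(responses)


# Hypothetical respondent on four 5-point items; items 2 and 4 are negatively keyed.
index = desirable_endpoint_proportion([5, 1, 3, 5], [5, 1, 5, 1])
```

A value near 1 would flag a response pattern that is almost uniformly at the desirable extreme; any cutoff for calling that faking would have to come from validation data, which this sketch does not supply.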
Citations: 0
Investigating Heterogeneity in Response Strategies: A Mixture Multidimensional IRTree Approach
Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-11-09 DOI: 10.1177/00131644231206765
Ö. Emre C. Alagöz, Thorsten Meiser
To improve the validity of self-report measures, researchers should control for response style (RS) effects, which can be achieved with IRTree models. A traditional IRTree model considers a response as a combination of distinct decision-making processes, where the substantive trait affects the decision on response direction, while decisions about choosing the middle category or extreme categories are largely determined by midpoint RS (MRS) and extreme RS (ERS). One limitation of traditional IRTree models is the assumption that all respondents utilize the same set of RS in their response strategies, whereas it can be assumed that the nature and the strength of RS effects can differ between individuals. To address this limitation, we propose a mixture multidimensional IRTree (MM-IRTree) model that detects heterogeneity in response strategies. The MM-IRTree model comprises four latent classes of respondents, each associated with a different set of RS traits in addition to the substantive trait. More specifically, the class-specific response strategies involve (1) only ERS in the “ERS only” class, (2) only MRS in the “MRS only” class, (3) both ERS and MRS in the “2RS” class, and (4) neither ERS nor MRS in the “0RS” class. In a simulation study, we showed that the MM-IRTree model performed well in recovering model parameters and class memberships, whereas the traditional IRTree approach showed poor performance if the population includes a mixture of response strategies. In an application to empirical data, the MM-IRTree model revealed distinct classes with noticeable class sizes, suggesting that respondents indeed utilize different response strategies.
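The node decomposition described in this abstract can be illustrated with a small sketch. It assumes a 5-point Likert scale and the common three-node pseudo-item coding (midpoint, direction, extremity); the abstract does not state the number of categories or the exact coding, so the function name `to_pseudo_items` and the mapping are illustrative assumptions, not the authors' implementation. The class-to-trait table restates the four latent classes named in the abstract.

```python
from typing import Dict, Optional, Tuple

# Response-style traits active in each latent class, as listed in the abstract.
CLASS_RS_TRAITS: Dict[str, Tuple[str, ...]] = {
    "ERS only": ("ERS",),
    "MRS only": ("MRS",),
    "2RS": ("ERS", "MRS"),
    "0RS": (),
}


def to_pseudo_items(response: int) -> Dict[str, Optional[int]]:
    """Recode one 5-point Likert response into three binary pseudo-items.

    Assumed coding (hypothetical, for illustration only):
      midpoint  - 1 if the middle category (3) was chosen, else 0
      direction - 1 for agreement (4, 5), 0 for disagreement (1, 2)
      extreme   - 1 for an endpoint (1, 5), 0 for a moderate category (2, 4)
    Nodes that are never reached are coded as None (missing).
    """
    if response not in (1, 2, 3, 4, 5):
        raise ValueError("response must be an integer from 1 to 5")
    if response == 3:
        # Choosing the midpoint ends the tree; later nodes are not observed.
        return {"midpoint": 1, "direction": None, "extreme": None}
    return {
        "midpoint": 0,
        "direction": int(response > 3),
        "extreme": int(response in (1, 5)),
    }
```

Under this coding, the direction pseudo-item would load on the substantive trait, while the midpoint and extreme pseudo-items would load on MRS and ERS only in the classes where `CLASS_RS_TRAITS` lists those traits.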
Citations: 0
Comparing RMSEA-Based Indices for Assessing Measurement Invariance in Confirmatory Factor Models
Zone 3 Psychology Q1 Social Sciences Pub Date: 2023-11-01 DOI: 10.1177/00131644231202949
Nataly Beribisky, Gregory R. Hancock
Fit indices are descriptive measures that can help evaluate how well a confirmatory factor analysis (CFA) model fits a researcher’s data. In multigroup models, before between-group comparisons are made, fit indices may be used to evaluate measurement invariance by assessing the degree to which multiple groups’ data are consistent with increasingly constrained nested models. One such fit index is an adaptation of the root mean square error of approximation (RMSEA) called RMSEA_D. This index embeds the chi-square and degree-of-freedom differences into a modified RMSEA formula. The present study comprehensively compared RMSEA_D to ΔRMSEA, the difference between two RMSEA values associated with a comparison of nested models. The comparison consisted of both derivations as well as a population analysis using one-factor CFA models with features common to those found in practical research. The findings demonstrated that for the same model, RMSEA_D will always have increased sensitivity relative to ΔRMSEA with an increasing number of indicator variables. The study also indicated that RMSEA_D had increased ability to detect noninvariance relative to ΔRMSEA in one-factor models. For these reasons, when evaluating measurement invariance, RMSEA_D is recommended instead of ΔRMSEA.
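To make the contrast between the two indices concrete, the sketch below computes both from the chi-square statistics of two nested models. It assumes the basic single-sample RMSEA form with an N − 1 denominator; multigroup variants typically rescale by the number of groups, and the exact expressions used in the article may differ. The function names and the example numbers are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Basic RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def rmsea_d(chi2_constrained: float, df_constrained: int,
            chi2_free: float, df_free: int, n: int) -> float:
    """Difference-based index: the RMSEA formula applied to the
    chi-square difference and the degree-of-freedom difference."""
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    if d_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    return math.sqrt(max(d_chi2 - d_df, 0.0) / (d_df * (n - 1)))


def delta_rmsea(chi2_constrained: float, df_constrained: int,
                chi2_free: float, df_free: int, n: int) -> float:
    """Delta-RMSEA: RMSEA of the constrained model minus RMSEA of the freer model."""
    return rmsea(chi2_constrained, df_constrained, n) - rmsea(chi2_free, df_free, n)


if __name__ == "__main__":
    # Made-up fit statistics, for illustration only.
    args = (120.0, 50, 80.0, 40, 501)
    print(f"RMSEA_D     = {rmsea_d(*args):.4f}")
    print(f"Delta-RMSEA = {delta_rmsea(*args):.4f}")
```

With these made-up inputs, the difference-based index is roughly an order of magnitude larger than the simple difference of the two RMSEA values, which illustrates why the two indices cannot share the same cutoffs.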
Citations: 0