
Practical Assessment, Research and Evaluation: Latest Articles

Negatively-Worded Multiple Choice Questions: An Avoidable Threat to Validity.
Q2 Social Sciences Pub Date : 2017-05-01 DOI: 10.7275/5VVY-8613
N. Chiavaroli
Despite the majority of MCQ writing guides discouraging the use of negatively-worded multiple choice questions (NWQs), they continue to be regularly used both in locally produced examinations and commercially available questions. There are several reasons why the use of NWQs may prove resistant to sound pedagogical advice. Nevertheless, systematic inspection of item-level analysis often reveals anomalous behavior of NWQs on high-stakes examinations, due to otherwise high-performing students selecting the incorrect option for those questions. Highlighting the negative term as commonly recommended does not prevent this, since both anecdotal and empirical evidence suggests that many students answer the question as if it were positively phrased. The continued use of NWQs in high-stakes examinations poses a significant threat to the validity of interpretation based on these assessments. This is a form of ‘construct-irrelevant variance’ within the control of the item writer, and is therefore completely avoidable.
Citations: 20
Constructing multiple-choice items to measure higher-order thinking
Q2 Social Sciences Pub Date : 2017-05-01 DOI: 10.7275/CA7Y-MM27
Darina Scully
Across education, certification and licensure, there are repeated calls for the development of assessments that target higher-order thinking, as opposed to mere recall of facts. A common assumption is that this necessitates the use of constructed response or essay-style test questions; however, empirical evidence suggests that this may not be the case. In this paper, it is argued that multiple-choice items have the capacity to assess certain higher-order skills. In addition, a series of practical recommendations for test developers seeking to purposefully construct such items is provided.
Citations: 69
The Use of Reddit as an Inexpensive Source for High-Quality Data
Q2 Social Sciences Pub Date : 2017-01-01 DOI: 10.7275/SWGT-RJ52
Matthew R. Jamnik, D. J. Lane
Today, researchers have the ability to conduct their investigations in a number of different manners, including both traditional testing using university subject pool participants and the more recent method of online recruitment. Although the use of internet participants is becoming more popular, this area of research is still very much in its infancy and needs further examination. Additionally, alternative web-based platforms need to be investigated because much of the literature has focused on using Amazon.com’s Mechanical Turk (MTurk). Therefore, the current study recruited an internet population using the website Reddit, and compared them to a traditional undergraduate sample to learn more about this web-based platform. The results demonstrated similarities and distinctions between the two samples. Furthermore, previous findings in the psychological well-being literature were replicated. As a whole, the participants recruited from Reddit provided high-quality data that were inexpensive and comparable to the responses gathered using undergraduate participants. We conclude that this website appears to be a promising tool for the field of psychological assessment, research, and evaluation.
Citations: 51
A note on using eigenvalues in dimensionality assessment
Q2 Social Sciences Pub Date : 2017-01-01 DOI: 10.7275/E7GH-0785
Cengiz Zopluoglu, Ernest C Davenport
Citations: 22
Investigating Causal DIF via Propensity Score Methods.
Q2 Social Sciences Pub Date : 2016-12-01 DOI: 10.7275/EWQZ-N963
Yan Liu, B. Zumbo, P. Gustafson, Yi Huang, Edward Kroc, Amery Wu
A variety of differential item functioning (DIF) methods have been proposed and used to ensure that a test is fair to all test takers in a target population, for example when a test is translated into other languages. However, once a method flags an item as DIF, it is difficult to conclude that the grouping variable (e.g., test language) is responsible for the DIF result because there may exist many confounding variables that lead to the DIF result. The present study aims to (i) demonstrate the application of propensity score methods in psychometric research on DIF for day-to-day researchers, and (ii) describe conditional logistic regression for matched data in a DIF context. Propensity score methods can help to achieve the comparability between different populations or groups with respect to participants’ pre-test differences, which can assist in examining the validity of making a causal claim with regard to DIF.
Citations: 14
Partial Least Squares Structural Equation Modeling with R.
Q2 Social Sciences Pub Date : 2016-09-01 DOI: 10.7275/D2FA-QV48
Hamdollah Ravand, Purya Baghaei
Structural equation modeling (SEM) has become widespread in educational and psychological research. Its flexibility in addressing complex theoretical models and the proper treatment of measurement error has made it the model of choice for many researchers in the social sciences. Nevertheless, the model imposes some daunting assumptions and restrictions (e.g. normality and relatively large sample sizes) that could discourage practitioners from applying the model. Partial least squares SEM (PLS-SEM) is a nonparametric technique which makes no distributional assumptions and can be estimated with small sample sizes. In this paper a general introduction to PLS-SEM is given and is compared with conventional SEM. Next, step by step procedures, along with R functions, are presented to estimate the model. A data set is analyzed and the outputs are interpreted.
Citations: 98
Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests.
Q2 Social Sciences Pub Date : 2016-07-01 DOI: 10.7275/Q7ZZ-D655
Lawrence M. Rudner
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows that the conclusion also applies to the probabilities estimated from short subtests of mental abilities and that small samples can yield excellent accuracy. The calculated Bayes probabilities can be used to provide meaningful examinee feedback regardless of whether the test was originally designed to be unidimensional.
Citations: 4
Regularization Methods for Fitting Linear Models with Small Sample Sizes: Fitting the Lasso Estimator Using R.
Q2 Social Sciences Pub Date : 2016-05-01 DOI: 10.7275/JR3D-CQ04
W. H. Finch, M. Finch
Researchers and data analysts are sometimes faced with the problem of very small samples, where the number of variables approaches or exceeds the overall sample size; i.e. high dimensional data. In such cases, standard statistical models such as regression or analysis of variance cannot be used, either because the resulting parameter estimates exhibit very high variance and can therefore not be trusted, or because the statistical algorithm cannot converge on parameter estimates at all. There exist an alternative set of model estimation procedures, known collectively as regularization methods, which can be used in such circumstances, and which have been shown through simulation research to yield accurate parameter estimates. The purpose of this paper is to describe, for those unfamiliar with them, the most popular of these regularization methods, the lasso, and to demonstrate its use on an actual high dimensional dataset involving adults with autism, using the R software language. Results of analyses involving relating measures of executive functioning with a full scale intelligence test score are presented, and implications of using these models are discussed.
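To make the idea of the lasso concrete, here is a minimal sketch of the estimator fitted by cyclic coordinate descent. It is an illustration only, not the authors' code: the paper works in R, whereas this sketch is plain Python, and the function names (`soft_threshold`, `lasso_coordinate_descent`), the penalty value, and the toy data are all assumptions made for the example. It minimizes (1/(2n))·||y − Xb||² + λ·||b||₁ and shows how the penalty sets a weak coefficient exactly to zero.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows with p numeric columns (assumed already centered and
    scaled by the caller); y is a list of n responses. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta; beta starts at zero
    # Mean squared value of each column, used to scale each coordinate update.
    col_ms = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ms[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data (made up for illustration): the second predictor is nearly irrelevant.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

In this toy case the first coefficient is shrunk below its least-squares value and the second is dropped entirely, which is the variable-selection behavior that makes the lasso usable when the number of predictors approaches the sample size. In practice the penalty would be chosen by cross-validation rather than fixed by hand.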
Citations: 24
A Comparison of Three Approaches to Correct for Direct and Indirect Range Restrictions: A Simulation Study.
Q2 Social Sciences Pub Date : 2016-03-01 DOI: 10.7275/X4EP-FV42
A. Pfaffel, Barbara Schober, C. Spiel
A common methodological problem in the evaluation of the predictive validity of selection methods, e.g. in educational and employment selection, is that the correlation between predictor and criterion is biased. Thorndike’s (1949) formulas are commonly used to correct for this biased correlation. An alternative approach is to view the selection mechanism as a missing data mechanism. The aim of this study was to compare Thorndike’s formulas for direct and indirect range restriction scenarios with two state-of-the-art approaches for handling missing data: full information maximum likelihood (FIML) and multiple imputation by chained equations (MICE). We conducted Monte-Carlo simulations to investigate the accuracy of the population correlation estimates in dependence of the selection ratio and the true population correlation in an experimental design. For a direct range restriction scenario, the three approaches are equally accurate. For an indirect range restriction scenario, the corrections using FIML and MICE are more precise than when using Thorndike’s formula. The higher the selection ratio and the true population correlation, the higher the precision of the population correlation estimates. Our findings indicate that both missing data approaches are alternative corrections to Thorndike’s formulas, especially in the case of indirect range restriction.
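For orientation, the classical correction for direct range restriction (commonly called Thorndike's Case 2) can be sketched as below. This is a hedged illustration rather than the study's simulation code: the function name `thorndike_case2` and the example numbers are assumptions, and the indirect-restriction formula and the FIML and MICE approaches compared in the abstract are not shown. With u the ratio of the unrestricted to the restricted predictor standard deviation, the corrected correlation is r·u / sqrt(1 + r²(u² − 1)).

```python
import math


def thorndike_case2(r_restricted, sd_unrestricted, sd_restricted):
    """Correct a correlation for direct range restriction on the predictor.

    r_restricted:    predictor-criterion correlation observed in the selected sample
    sd_unrestricted: predictor standard deviation in the full applicant population
    sd_restricted:   predictor standard deviation in the selected sample
    """
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1.0 + r_restricted ** 2 * (u ** 2 - 1.0))


# Hypothetical numbers: selection halved the predictor's standard deviation.
corrected = thorndike_case2(r_restricted=0.30, sd_unrestricted=10.0, sd_restricted=5.0)
print(round(corrected, 3))
```

When the two standard deviations are equal (no restriction), the function returns the observed correlation unchanged; the stronger the restriction, the larger the upward correction.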
Citations: 14
Confidence Intervals for Effect Sizes: Applying Bootstrap Resampling.
Q2 Social Sciences Pub Date : 2016-03-01 DOI: 10.7275/DZ3R-8N08
Erin S. Banjanovic, J. Osborne
Citations: 70