International Journal of Testing: Latest Articles

Using third-party evaluations to assess socioemotional skills in graduate and professional school admissions
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2022-01-02 DOI: 10.1080/15305058.2021.2019748
David Klieger, Jennifer L. Bochenek, Chelsea Ezzo, Steven Holtzman, Frederick Cline, Margarita Olivera-Aguilar
Abstract Consideration of socioemotional skills in admissions can potentially increase the representation of racial and ethnic minorities and women in graduate and professional education and help identify candidates more likely to succeed in graduate and professional school. Research on one such assessment, the ETS Personal Potential Index (PPI), showed that the PPI produced much smaller racial/ethnic-gender group mean score differences than undergraduate grade point average (UGPA) and the Graduate Record Examinations (GRE) did. Across levels of institutional selectivity, the PPI can promote racial/ethnic and gender diversity in graduate and professional school in ways that UGPA and GRE scores do not. Predictive validity analyses showed that, for doctoral STEM programs, the PPI dimensions of (1) Planning and Organization and (2) Communication Skills positively predict school grade point average as well as a lower risk of academic probation, a determinant of degree progress, both alone and incrementally over UGPA and GRE scores. Supplemental data for this article are available online at https://doi.org/10.1080/15305058.2021.2019748.
Journal: International Journal of Testing; Volume: 22 1; Pages: 72-99; Type: Journal Article
Citations: 1
Test efficacy: Refocusing validation from college exams to candidates
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2022-01-02 DOI: 10.1080/15305058.2021.2019752
Alvaro J. Arce, M. J. Young
Abstract The paper argues that contemporary test validity theory places the consequences of testing for the lives of all college applicants at the back of the test validation argument. It introduces the notion of test efficacy as a process for gathering evidence on claims about the consequences of testing for all college applicants that can be traced back to validity. The paper proposes a test efficacy framework to evaluate claims about the impact of admission examinations on all college applicants (not just those attaining the admission standard).
Journal: International Journal of Testing; Volume: 22 1; Pages: 100-119; Type: Journal Article
Citations: 1
Using personal statements in college admissions: An investigation of gender bias and the effects of increased structure
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-12-15 DOI: 10.1080/15305058.2021.2019749
Susan Niessen, Marvin Neumann
Abstract Personal statements are among the most commonly used instruments in college admissions procedures. Yet little research exists on their reliability, validity, and fairness. The first aim of this paper was to investigate hypotheses about adverse impact and underprediction for female applicants, which could result from a lower tendency to use agentic language compared to male applicants. Second, we examined whether rating personal statements in a more structured manner would increase reliability and validity. Using personal statements (250 words) from a large cohort of applicants to an undergraduate psychology program at a Dutch university, we found no evidence for adverse impact for female applicants or more agentic language use by male applicants, and no relationship between agentic language use and personal statement ratings. In contrast, we found that personal statements of female applicants were rated slightly more positively than those of males. Exploratory analyses suggest that female applicants' better writing skills might explain this difference. A more structured approach to rating personal statements yielded higher, but still only 'moderate', inter-rater reliability, and virtually identical, negligible predictive validity for first-year GPA and dropout.
Journal: International Journal of Testing; Volume: 22 1; Pages: 5-20; Type: Journal Article
Citations: 2
Metacognitive skills inventory (MSI): development and validation
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-10-02 DOI: 10.1080/15305058.2021.1986051
Haja Hameed, Reena Cheruvalath
Abstract Metacognitive skills help to control and regulate negative thoughts, emotions, beliefs and sad memories. The objective of the study was to develop and validate an inventory, the Metacognitive Skills Inventory (MSI), to assess the variance in adopting metacognitive strategies between those who have depressive symptoms and those who do not. Two studies were carried out among Indian youth (Study 1: N = 269, mean age = 21.1; Study 2: N = 745, mean age = 20.9). Participants completed the MSI as well as measures of depression and negative emotions. Item response theory (IRT) analysis, exploratory factor analysis (EFA), and confirmatory factor analysis (CFA) were carried out for the scale development. The analyses derived a meaningful four-factor structure for a 12-item MSI: (i) Navigation of negative thoughts by adopting metacognitive strategies, (ii) Channelizing negative emotions constructively, (iii) Recognizing ruminative tendencies, and (iv) Knowledge of strengths and weaknesses in regulating emotions. After validation in clinical samples, the MSI could be used to identify the patient-specific metacognitive skills that need to be improved during Metacognitive Therapy (MCT) in people with depressive symptoms.
Journal: International Journal of Testing; Volume: 21 1; Pages: 154-181; Type: Journal Article
Citations: 1
Cross-country comparability of a social-emotional skills assessment designed for youth in low-resource environments
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-10-02 DOI: 10.1080/15305058.2021.1995867
Nina Menezes Cunha, Andres Martinez, P. Kyllonen, Sarah Gates
Abstract We evaluate the measurement invariance of a 48-item instrument designed to measure general social and emotional skills of youth in low-resource environments. We refer to the skills measured as positive self-concept, negative self-concept, higher order thinking skills, and social and communication skills. These skills are often associated with economic development and can be used to evaluate programs designed to enhance economic development. Our evaluation is based on a sample of 1,794 in- and out-of-school youth from Uganda and Guatemala's Western Highlands. We conduct the analyses using a multiple group confirmatory factor analysis approach, splitting the sample by country, gender, and socio-economic status (high vs. low). Overall, our analysis points to strong invariance for all four measures across the different groups being compared. These findings contribute to the validity of the instrument as a tool for better understanding youth in diverse, developing economies.
Journal: International Journal of Testing; Volume: 21 1; Pages: 182-219; Type: Journal Article
Citations: 1
Examining severity and centrality effects in TestDaF writing and speaking assessments: An extended Bayesian many-facet Rasch analysis
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-09-03 DOI: 10.1080/15305058.2021.1963260
T. Eckes, K. Jin
Abstract Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang’s (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing and speaking assessments using Bayesian MCMC methods. The findings revealed that (a) the extended facets model had a better data–model fit than models that ignored either or both kinds of rater effects, (b) rating scale and partial credit versions of the extended model differed in terms of data–model fit for writing and speaking, (c) rater severity and centrality estimates were not significantly correlated with each other, and (d) centrality effects had a demonstrable impact on examinee rank orderings. The discussion focuses on implications for the analysis and evaluation of rating quality in performance assessments.
Journal: International Journal of Testing; Volume: 21 1; Pages: 131-153; Type: Journal Article
Citations: 6
Exploring task features that predict psychometric quality of test items: the case for the Dutch driving theory exam
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-06-15 DOI: 10.1080/15305058.2021.1916506
E. Roelofs, Wilco H M Emons, Angela J. Verschoor
Abstract This study reports on an Evidence-Centered Design (ECD) project in the Netherlands, involving the theory exam for prospective car drivers. In particular, we illustrate how cognitive load theory, task analysis, response process models, and explanatory item response theory can be used to systematically develop and refine task models. Based on a cognitive model for driving, 353 existing items involving rules of priority at intersections were coded on intrinsic task features and task presentation features. Hierarchical regression analyses were carried out to determine the contribution of task features to item difficulty and item discrimination. A substantial proportion of the total variance in both item difficulty and item discrimination parameters could be explained by intrinsic task features, including rules and signs (25%, 18.6%) and task-intersection features (13.4%, 14.1%), and a smaller proportion by item presentation features (3.5%, 7.1%). It is concluded that the systematic approach of discerning task features and determining their impact on item parameters has added value as an ECD tool for evaluating existing assessments that are planned to be innovated. The paper concludes with a discussion of practical implications.
Journal: International Journal of Testing; Volume: 21 1; Pages: 80-104; Type: Journal Article
Citations: 1
Validating theoretical assumptions about reading with cognitive diagnosis models
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-06-15 DOI: 10.1080/15305058.2021.1931238
A. George, A. Robitzsch
Abstract Modern large-scale studies such as the Progress in International Reading Literacy Study (PIRLS) report students' reading competence not only on a global reading scale but also at the level of reading subskills. However, the number of subskills and the dependencies between them are frequently debated. In this study, different theoretical assumptions regarding the subskills describing the reading competence "acquiring and using information" in PIRLS are deduced from accompanying official materials. The different assumptions are then translated into empirical cognitive diagnosis models (CDMs). By evaluating and comparing the CDMs in terms of empirical fit criteria in each country participating in PIRLS 2016, the underlying theoretical assumptions are validated. Results show that in all but one country, a model proposing four reading subskills with no order between the subskills shows the best fit. This selected model could be simplified in order to facilitate practical derivations such as the evaluation of skill classes and the analysis of learning paths.
Journal: International Journal of Testing; Volume: 21 1; Pages: 105-129; Type: Journal Article
Citations: 8
Animated videos in assessment: comparing validity evidence from and test-takers’ reactions to an animated and a text-based situational judgment test
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-06-15 DOI: 10.1080/15305058.2021.1916505
Anastasios Karakolidis, M. O’Leary, Darina Scully
Abstract The linguistic complexity of many text-based tests can be a source of construct-irrelevant variance, as test-takers’ performance may be affected by factors that are beyond the focus of the assessment itself, such as reading comprehension skills. This experimental study examined the extent to which the use of animated videos, as opposed to written text, could (i) reduce construct-irrelevant variance attributed to language and reading skills and (ii) impact test-takers’ reactions to a situational judgment test. The results indicated that the variance attributed to construct-irrelevant factors was lower by 9.5% in the animated version of the test. In addition, those who took the animated test perceived it to be more valid, fair, and enjoyable, than those who took the text-based test. They also rated the language used as less difficult to understand. The implications of these findings are discussed.
许多基于文本的测试的语言复杂性可能是结构无关方差的来源,因为考生的表现可能受到超出评估本身重点的因素的影响,例如阅读理解技能。本实验研究考察了在多大程度上使用动画视频,而不是书面文本,可以(i)减少归因于语言和阅读技能的结构无关的方差,(ii)影响考生对情景判断测试的反应。结果表明,在动画版本的测试中,归因于结构无关因素的方差降低了9.5%。此外,那些参加动画测试的人认为它比那些参加基于文本的测试的人更有效、更公平、更有趣。他们还认为所使用的语言更容易理解。讨论了这些发现的意义。
{"title":"Animated videos in assessment: comparing validity evidence from and test-takers’ reactions to an animated and a text-based situational judgment test","authors":"Anastasios Karakolidis, M. O’Leary, Darina Scully","doi":"10.1080/15305058.2021.1916505","DOIUrl":"https://doi.org/10.1080/15305058.2021.1916505","url":null,"abstract":"Abstract The linguistic complexity of many text-based tests can be a source of construct-irrelevant variance, as test-takers’ performance may be affected by factors that are beyond the focus of the assessment itself, such as reading comprehension skills. This experimental study examined the extent to which the use of animated videos, as opposed to written text, could (i) reduce construct-irrelevant variance attributed to language and reading skills and (ii) impact test-takers’ reactions to a situational judgment test. The results indicated that the variance attributed to construct-irrelevant factors was lower by 9.5% in the animated version of the test. In addition, those who took the animated test perceived it to be more valid, fair, and enjoyable, than those who took the text-based test. They also rated the language used as less difficult to understand. The implications of these findings are discussed.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":"21 1","pages":"57 - 79"},"PeriodicalIF":1.7,"publicationDate":"2021-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2021.1916505","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41447909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Post-COVID-19 perceived stigma-discrimination scale: psychometric development and evaluation
IF 1.7 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Pub Date : 2021-06-09 DOI: 10.1080/15305058.2022.2042000
C. Cassiani-Miranda, J. Pedrozo-Pupo, A. Campo‐Arias
Abstract The study aimed to adapt and evaluate a scale to measure COVID-19-CED in COVID-19 survivors. A sample of 330 COVID-19 survivors completed the COVID-19 Perceived Discrimination Scale (C-19-PDS). The C-19-PDS was adapted from the Tuberculosis Perceived Discrimination Scale (11 items). Confirmatory factor analysis of the 11-item version showed poor goodness-of-fit indicators. However, a 5-item version of the C-19-PDS showed better goodness-of-fit indicators, high internal consistency, and no differential item functioning (DIF) by gender. This instrument is recommended to evaluate COVID-19-CED in Colombian and other Spanish-speaking populations.
International Journal of Testing, vol. 23, no. 1, pp. 1-9. Published 2021-06-09.
Citations: 0