ETS Research Report Series: Latest Articles

Methods for Imputing Scores When All Responses Are Missing for One or More Polytomous Items: Accuracy and Impact on Psychometric Property
Q3 Social Sciences Pub Date : 2023-05-04 DOI: 10.1002/ets2.12369
Yanxuan Qu, Sandip Sinharay

Though a substantial amount of research exists on imputing missing scores in educational assessments, there is little research on cases where responses or scores to an item are missing for all test takers. In this paper, we tackled the problem of imputing missing scores for tests for which the responses to an item are missing for all test takers. We considered three missing-data imputation methods—the median method, the item response theory (IRT) method, and the two-way method—for imputing scores. We compared the performance of these three imputation methods with respect to their accuracy in estimating scaled scores and test reliability for the aforementioned problem. Real data were used in the comparison. All three methods performed well in imputing scaled scores with negligible imputation error: The IRT method and the median method provided slightly more accurate scaled scores. The two-way method provided the most accurate reliability estimates. Recommendations for practice are provided.

Citations: 0
Equity Levers: What Predicts Enrollment in and Number of College-Level Courses Taken in High School?
Q3 Social Sciences Pub Date : 2023-04-10 DOI: 10.1002/ets2.12368
Marisol J. C. Kevelson, Catherine M. Millett, Carly Slutzky, Stephanie R. Saunders

This study explores the extent to which student, family, peer, and school factors predict (a) whether students take Advanced Placement® (AP®) courses, International Baccalaureate (IB) courses, and dual enrollment courses and (b) in models limited to course takers, how many courses they completed. Our findings, based on a nationally representative, longitudinal sample, suggest that, when it comes to college-level high school course taking, the relative advantage of higher socioeconomic status (SES) is less for African American students than it is for White and Asian students. Ninth-grade math skills are the strongest predictor of AP or IB and dual enrollment course taking, above and beyond demographic background characteristics like SES and race or ethnicity. High school girls take AP/IB and dual enrollment courses at a higher rate than boys, and they take more of these courses. The level of academic focus of students and their peers is associated with both AP or IB and dual enrollment course taking, whereas having parents focused on college preparation and course taking only predicts AP or IB course taking. School factors associated with AP or IB course taking include U.S. region and rural location; the percentage of math teachers with a master's degree is also positively associated with the number of AP or IB courses students take. These findings highlight the importance of equitable educational opportunities starting from a young age. They also indicate a need for increased early attention to student math skills and for more supports for parents and school staff to enable them to encourage and prepare all students, especially those from historically marginalized groups, to take college-level courses in high school.

Citations: 0
A Hybrid Model for Orthogonal Regression
Q3 Social Sciences Pub Date : 2023-02-27 DOI: 10.1002/ets2.12367
Michael Kane

Linear functional relationships are intended to be symmetric and therefore cannot generally be accurately estimated using ordinary least squares regression equations. Orthogonal regression (OR) models allow for errors in both Y and X and therefore can provide symmetric estimates of these relationships. The most well-established OR model, the errors-in-variables (EIV) model, assumes that the observed scatter around the line is due entirely to errors of measurement in Y and X and that the ratio of the error variances is known. If most of the variance around the line is known to be due to the errors of measurement in Y and X, the EIV model can provide an unbiased maximum likelihood estimate for a functional relationship. However, if a substantial part of the variability around the line is due to natural variability, which is not attributable to errors of measurement in Y or X, the ratio of the measurement error variances is not well defined and the EIV model is not directly applicable. The main contribution of this report is the development of a hybrid model that provides plausible estimates for linear functional relationships in cases with substantial natural variability and substantial errors of measurement. An analysis of female and male differential test functioning between an essay test and an objective test used as parts of a licensure examination provides an illustration of the use of the hybrid model.
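For reference, the classical EIV model mentioned above can be sketched in a few lines. This is the textbook Deming regression estimator, not the report's hybrid model, whose details the abstract does not give. It assumes the ratio `delta` of the Y error variance to the X error variance is known; the function name `deming_fit` and the sample data are hypothetical.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit a line under the classical errors-in-variables (Deming) model.

    delta is the assumed-known ratio var(error in y) / var(error in x).
    delta = 1.0 gives orthogonal regression.
    Returns (slope, intercept).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample variances and covariance.
    s_xx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("slope is undefined when the sample covariance is zero")
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff * diff + 4 * delta * s_xy * s_xy)) / (2 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
print(deming_fit([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))  # (2.0, 1.0)
```

With `delta = 1.0` the fit is symmetric: swapping X and Y returns the reciprocal slope, which ordinary least squares does not guarantee.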

Citations: 0
Simulating Real-World Context in an Email Writing Task: Implications for Task-Based Language Assessment
Q3 Social Sciences Pub Date : 2023-02-16 DOI: 10.1002/ets2.12366
John M. Norris, Shoko Sasayama, Michelle Kim

Accomplishing a communication task in the real world requires the ability not only to do the task per se but also to manage aspects of the context in which it occurs. For this reason, simulations of target language use contexts have been incorporated into the design of communicative language tests as a way of enhancing the authenticity of assessment task performance. Although some contextual factors may increase extraneous cognitive load and distract learners from focusing on the task at hand (Sweller, 1994), they represent important design considerations in task-based language assessment (TBLA), where the purpose of assessment is to determine what second language (L2) learners can do with the target language in the real world. In that sense, the extraneous cognitive load might well be part of the construct we are interested in assessing. Accordingly, the current study simulated aspects of a real-world task performance context as part of an email writing task assessment. Simulated context was operationalized as (a) additional information about the task scenario, (b) a visual image to simulate the physical context, and (c) an audio to replicate the real-world experience. A total of 276 L2 English learners performed the email task, half with simulated context and the other half without it. Findings revealed that, when presented with simulated context, the tasks were perceived by participants to have induced more time pressure and to be more interesting. In terms of performance effects, the provision of simulated context negatively affected the syntactic complexity of participants' writing but positively affected their syntactic fluency. It also led to greater discrimination among learners at different proficiency levels on various measures of language performance. The paper concludes by highlighting implications for task design and validity evaluation, especially in TBLA.

Citations: 0
Employer Expectations of 21st-Century High School Graduates: Analyzing Online Job Advertisements
Q3 Social Sciences Pub Date : 2023-01-28 DOI: 10.1002/ets2.12365
Kevin M. Williams, Tao Wang, Steven Holtzman, Tak Ming Leung, Gernissia Cherfrere, Guangming Ling

Individuals with a high school education represent the largest subset of the U.S. workforce. However, little is known about the employer expectations of these individuals, particularly in the area of soft skills—also known as 21st-century skills. Online job advertisements offer useful data for examining these expectations, as they may supplement employer survey data and reflect actual recruitment practices. Our analysis of 68,505 online job advertisements suggests that employers hold generally lower expectations for the soft skills of high school–educated individuals than they do for postsecondary-educated individuals. However, employer expectations for two soft skills—professionalism and customer service skills—appear to be substantially higher for high school–educated individuals than for postsecondary-educated individuals. Additional results highlight similarities and differences within the high school–educated workforce across nine workplace industries. We discuss the implications of these results not only for high school–educated individuals and the organizations that employ them but also for practitioners and educators charged with assessing and providing training for these skills.

Citations: 1
Evaluating the Use and Interpretation of the TOEIC® Listening and Reading Test Score Report: Perspectives of Test Takers in Japan
Q3 Social Sciences Pub Date : 2023-01-23 DOI: 10.1002/ets2.12364
Ching-Ni Hsieh

Researchers suggest that claims about the meaningfulness of test score interpretations and consequences of test use should be backed by evidence that stakeholders understand the definition of the construct assessed (meaningfulness) and score reports (consequences). Evaluation of stakeholders' actual uses and interpretations of score reports in large-scale standardized language proficiency tests, however, remains limited in the score reporting literature. This study investigates how test takers, as an important stakeholder group, use and interpret the score report of the TOEIC® Listening and Reading (TOEIC L&R) test. Data were collected from 834 TOEIC L&R test takers in Japan, who represented a wide range of English language proficiency based on their TOEIC L&R total scores. The participants responded to an online survey that asked about their uses and interpretations of the test results and their comprehension of the performance feedback presented in the score report. The results showed that the participants used the test results largely as intended, providing an important piece of validity evidence to support the proposed uses of the TOEIC L&R test. Some of the score reporting elements such as the performance feedback and footnote message, however, were not easy to understand for all participants, revealing a need to improve the interpretability of the score report. The study findings have implications for designing informative score reports and the usefulness around reporting test performance feedback.

Citations: 1
Evaluating Targeted Double Scoring for the Performance Assessment for School Leaders Using Imputation and Decision Theory
Q3 Social Sciences Pub Date : 2023-01-19 DOI: 10.1002/ets2.12363
Jing Miao, Sandip Sinharay, Chris Kelbaugh, Yi Cao, Wei Wang

In a targeted double-scoring procedure for performance assessments that are used for licensure and certification purposes, a subset of responses receives an independent second rating if the first rating falls into a preidentified critical score range (CSR) where an additional rating would lead to considerably more reliable pass-fail decisions. This study evaluates the CSRs using two approaches—one based on imputation of missing scores and the other based on statistical decision theory—using data from the Performance Assessment for School Leaders (PASL). Results from the evaluation indicate that the currently used CSRs are effective.
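The routing rule described above can be illustrated with a minimal sketch. The abstract does not say how two ratings are combined or what the PASL score ranges are, so the averaging step, the function and parameter names, and the numbers below are all assumptions made purely for illustration.

```python
def targeted_double_score(first_ratings, csr_low, csr_high, second_rating_for):
    """Apply a targeted double-scoring rule to a list of first ratings.

    A response gets a second rating only if its first rating falls inside
    the critical score range [csr_low, csr_high] (inclusive).
    second_rating_for(index) is a hypothetical callable returning the
    independent second rating for that response.
    Returns a list of (final_score, was_double_scored) tuples.
    """
    results = []
    for index, first in enumerate(first_ratings):
        if csr_low <= first <= csr_high:
            second = second_rating_for(index)
            # Assumption: the final score is the mean of the two ratings.
            results.append(((first + second) / 2, True))
        else:
            results.append((float(first), False))
    return results


# Hypothetical example: critical score range 13..16, second rater always gives 16.
scores = targeted_double_score([10, 14, 20], 13, 16, lambda index: 16)
print(scores)  # [(10.0, False), (15.0, True), (20.0, False)]
```

Only the middle response is rated twice, which reflects the idea that second ratings are spent where they are most likely to change a pass-fail decision.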

Citations: 0
Investigating the Relationship Between Career and Technical Education High School Course-Taking and Early Job Outcomes
Q3 Social Sciences Pub Date : 2022-12-08 DOI: 10.1002/ets2.12361
Margarita Olivera-Aguilar, Harrison J. Kell, Chelsea Ezzo, Steven B. Robbins

This study examined how high school course-taking patterns (i.e., career and technical education [CTE] vs. academic vs. no concentration), personal characteristics embedded in a social cognitive theory framework (e.g., self-efficacy, academic expectations), and contextual variables (e.g., parental expectations, socioeconomic status [SES]) interact with each other in the prediction of students' income and job satisfaction 8 years after graduating from high school. Using a nationally representative data set (the Educational Longitudinal Study of 2002), we found significant differences by sex and course-taking pattern in the prediction of income: Among men, CTE concentrators had the highest income, whereas among women, academic concentrators reported the greatest earnings. We observed similar levels of job satisfaction among academic and CTE concentrators. We also found that SES significantly moderated the effect of English self-efficacy and academic expectations in the prediction of income and general effort in the prediction of job satisfaction. Our findings highlight how a social cognitive framework can be used to investigate the links between high school course-taking, personal and contextual factors, and job outcomes. They additionally suggest the need to consider a broader set of outcomes for evaluating the benefits of CTE participation.

本研究考察了高中课程模式(即职业和技术教育[CTE] vs.学术与不专注)、嵌入在社会认知理论框架中的个人特征(如自我效能感、学术期望)和情境变量(如父母期望、社会经济地位[SES])如何相互作用,以预测学生高中毕业8年后的收入和工作满意度。使用具有全国代表性的数据集(2002年教育纵向研究),我们发现性别和课程学习模式在预测收入方面存在显著差异:在男性中,专注于CTE的人收入最高,而在女性中,专注于学术的人收入最高。我们观察到学术和CTE集中者的工作满意度水平相似。我们还发现,社会科学显著调节了英语自我效能感和学业期望对收入的预测作用,以及一般努力对工作满意度的预测作用。我们的研究结果强调了如何使用社会认知框架来调查高中课程学习、个人和环境因素以及工作结果之间的联系。他们还建议需要考虑更广泛的结果来评估参与CTE的好处。
ETS Research Report Series, Volume 2022, Issue 1, pp. 1-18. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ets2.12361
Citations: 0
Longitudinal Stability and Change of the Dark Triad: A Call for Research in Postsecondary, Occupational, and Community Settings
Q3 Social Sciences Pub Date : 2022-11-09 DOI: 10.1002/ets2.12362
Kevin M. Williams, Michelle P. Martin-Raugh, Jennifer E. Lentini

Researchers, theorists, and practitioners have expressed a renewed interest in the longitudinal dynamics of personality characteristics in adulthood, including organic life span trajectories and their amenability to volitional change. However, this research has apparently not yet expanded to include the Dark Triad (psychopathy, narcissism, Machiavellianism), despite approximately 2 decades of research that has thoroughly examined other important issues related to construct validity and interpersonal behavior. We argue that researchers in postsecondary, occupational, and community-based settings are in a unique position to study the important phenomenon of Dark Triad malleability, as they are less hindered by obstacles in clinical and forensic contexts that have generated largely inconclusive results. In this article, we discuss several examples of methods for evaluating, quantifying, and interpreting Dark Triad malleability, examples of relevant extant training programs, possibilities for developing new programs, and factors that may moderate training efficacy, including Dark Triad levels themselves. Beyond addressing a fundamental question regarding the nature of these traits, the Dark Triad's destructive tendencies suggest that efforts to reduce them would provide myriad societal benefits and could propel Dark Triad research in an important new direction.

ETS Research Report Series, Volume 2022, Issue 1, pp. 1-22. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ets2.12362
Citations: 0
Mapping TOEFL® Essentials™ Test Scores to the Canadian Language Benchmarks
Q3 Social Sciences Pub Date : 2022-10-22 DOI: 10.1002/ets2.12357
Spiros Papageorgiou, Larry Davis, Renka Ohta, Pablo Garcia Gomez

In this research report, we describe a study to map the scores of the TOEFL® Essentials™ test to the Canadian Language Benchmarks (CLB). The TOEFL Essentials test is a four-skills assessment of foundational English language skills and communication abilities in academic and general (daily life) contexts. At the time of writing this report, the test was the most recent addition to the TOEFL® Family of Assessments. TOEFL Essentials test scores are intended to provide academic programs and other users with reliable information regarding the test taker's ability to understand and use English. Mapping of scores to widely used language frameworks such as the CLB provides additional support for interpreting test results and for making inferences regarding test-taker abilities. The score mapping process consisted of the following steps, as recommended in the literature: (a) establishing construct congruence between the test content and the performance descriptors of the CLB; (b) establishing recommended minimum test scores (cut scores) required to classify language learners into CLB levels, based on the judgments of local experts; and (c) providing evidence of procedural, internal, and external validation of the recommended cut scores.

ETS Research Report Series, Volume 2022, Issue 1, pp. 1-42. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ets2.12357
Citations: 0