Using Rater Cognition to Improve Generalizability of an Assessment of Scientific Argumentation

Practical Assessment, Research and Evaluation (Q2, Social Sciences) · Pub Date: 2019-01-01 · DOI: 10.7275/EY9D-P954
Katrina Borowiec, Courtney Castle
{"title":"利用评价认知提高科学论证评价的概括性","authors":"Katrina Borowiec, Courtney Castle","doi":"10.7275/EY9D-P954","DOIUrl":null,"url":null,"abstract":"Rater cognition or “think-aloud” studies have historically been used to enhance rater accuracy and consistency in writing and language assessments. As assessments are developed for new, complex constructs from the Next Generation Science Standards (NGSS) , the present study illustrates the utility of extending “think-aloud” studies to science assessment. The study focuses on the development of rubrics for scientific argumentation, one of the NGSS Science and Engineering practices. The initial rubrics were modified based on cognitive interviews with five raters. Next, a group of four new raters scored responses using the original and revised rubrics. A psychometric analysis was conducted to measure change in interrater reliability, accuracy, and generalizability (using a generalizability study or “g-study”) for the original and revised rubrics. Interrater reliability, accuracy, and generalizability increased with the rubric modifications. Furthermore, follow-up interviews with the second group of raters indicated that most raters preferred the revised rubric. These findings illustrate that cognitive interviews with raters can be used to enhance rubric usability and generalizability when assessing scientific argumentation, thereby improving assessment validity.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Using Rater Cognition to Improve Generalizability of an Assessment of Scientific Argumentation\",\"authors\":\"Katrina Borowiec, Courtney Castle\",\"doi\":\"10.7275/EY9D-P954\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Rater cognition or “think-aloud” studies have historically been used to enhance rater accuracy and consistency in writing and language assessments. As assessments are developed for new, complex constructs from the Next Generation Science Standards (NGSS) , the present study illustrates the utility of extending “think-aloud” studies to science assessment. The study focuses on the development of rubrics for scientific argumentation, one of the NGSS Science and Engineering practices. The initial rubrics were modified based on cognitive interviews with five raters. Next, a group of four new raters scored responses using the original and revised rubrics. A psychometric analysis was conducted to measure change in interrater reliability, accuracy, and generalizability (using a generalizability study or “g-study”) for the original and revised rubrics. Interrater reliability, accuracy, and generalizability increased with the rubric modifications. Furthermore, follow-up interviews with the second group of raters indicated that most raters preferred the revised rubric. 
These findings illustrate that cognitive interviews with raters can be used to enhance rubric usability and generalizability when assessing scientific argumentation, thereby improving assessment validity.\",\"PeriodicalId\":20361,\"journal\":{\"name\":\"Practical Assessment, Research and Evaluation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Practical Assessment, Research and Evaluation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.7275/EY9D-P954\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Practical Assessment, Research and Evaluation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.7275/EY9D-P954","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Social Sciences","Score":null,"Total":0}
Citations: 3

Abstract

Rater cognition or “think-aloud” studies have historically been used to enhance rater accuracy and consistency in writing and language assessments. As assessments are developed for new, complex constructs from the Next Generation Science Standards (NGSS), the present study illustrates the utility of extending “think-aloud” studies to science assessment. The study focuses on the development of rubrics for scientific argumentation, one of the NGSS Science and Engineering practices. The initial rubrics were modified based on cognitive interviews with five raters. Next, a group of four new raters scored responses using the original and revised rubrics. A psychometric analysis was conducted to measure change in interrater reliability, accuracy, and generalizability (using a generalizability study or “g-study”) for the original and revised rubrics. Interrater reliability, accuracy, and generalizability increased with the rubric modifications. Furthermore, follow-up interviews with the second group of raters indicated that most raters preferred the revised rubric. These findings illustrate that cognitive interviews with raters can be used to enhance rubric usability and generalizability when assessing scientific argumentation, thereby improving assessment validity.
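The abstract references a generalizability study (“g-study”) without computational detail. For readers unfamiliar with the technique, the sketch below is an illustration only, not the authors' actual analysis: it estimates variance components for a one-facet, fully crossed persons-by-raters design and computes the standard generalizability (relative) and dependability (absolute) coefficients. The function name, the n_raters_decision parameter, and the toy score matrix are all hypothetical.

import numpy as np

def g_study(scores, n_raters_decision=None):
    """Variance components and G/phi coefficients for a fully crossed
    persons-by-raters (p x r) design with one score per cell."""
    n_p, n_r = scores.shape
    if n_raters_decision is None:
        n_raters_decision = n_r

    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Mean squares from a two-way ANOVA; with one observation per cell,
    # the person-x-rater interaction is confounded with residual error.
    ms_p = n_r * np.sum((person_means - grand) ** 2) / (n_p - 1)
    ms_r = n_p * np.sum((rater_means - grand) ** 2) / (n_r - 1)
    resid = scores - person_means[:, None] - rater_means[None, :] + grand
    ms_pr = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

    # Variance components via expected mean squares (negatives set to 0).
    var_pr = ms_pr
    var_p = max((ms_p - ms_pr) / n_r, 0.0)
    var_r = max((ms_r - ms_pr) / n_p, 0.0)

    # Relative (generalizability) and absolute (dependability) coefficients,
    # projected to a decision study with n_raters_decision raters.
    g = var_p / (var_p + var_pr / n_raters_decision)
    phi = var_p / (var_p + (var_r + var_pr) / n_raters_decision)
    return {"var_p": var_p, "var_r": var_r, "var_pr": var_pr,
            "g": g, "phi": phi}

# Hypothetical data: six responses scored by four raters on a 0-3 rubric.
scores = np.array([[2, 2, 3, 2],
                   [0, 1, 1, 0],
                   [3, 3, 3, 2],
                   [1, 1, 2, 1],
                   [2, 3, 2, 2],
                   [0, 0, 1, 1]], dtype=float)
print(g_study(scores))

In a g-study of this kind, rubric revisions that shrink the rater and person-by-rater variance components relative to true person variance push both coefficients toward 1, which is the general sense in which a revised rubric can be said to improve generalizability.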