Rashid M Abu-Ghazalah, David N Dubins, Gregory M K Poon
{"title":"剖析多项选择评估中的知识、猜测和失误。","authors":"Rashid M Abu-Ghazalah, David N Dubins, Gregory M K Poon","doi":"10.1080/08957347.2023.2172017","DOIUrl":null,"url":null,"abstract":"<p><p>Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly account for guessing, knowledge and blunder using eight assessments (>9,000 responses) from an undergraduate biotechnology curriculum. A Bayesian implementation of the models, aimed at assessing their robustness to prior beliefs in examinee knowledge, showed that explicit estimators of knowledge are markedly sensitive to prior beliefs with scores as sole input. To overcome this limitation, we examined self-ranked confidence as a proxy knowledge indicator. For our test set, three levels of confidence resolved test performance. Responses rated as least confident were correct more frequently than expected from random selection, reflecting partial knowledge, but were balanced by blunder among the most confident responses. By translating evidence-based guessing and blunder rates to pass marks that statistically qualify a desired level of examinee knowledge, our approach finds practical utility in test analysis and design.</p>","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":"36 1","pages":"80-98"},"PeriodicalIF":1.1000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10201919/pdf/","citationCount":"0","resultStr":"{\"title\":\"Dissecting knowledge, guessing, and blunder in multiple choice assessments.\",\"authors\":\"Rashid M Abu-Ghazalah, David N Dubins, Gregory M K Poon\",\"doi\":\"10.1080/08957347.2023.2172017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly account for guessing, knowledge and blunder using eight assessments (>9,000 responses) from an undergraduate biotechnology curriculum. A Bayesian implementation of the models, aimed at assessing their robustness to prior beliefs in examinee knowledge, showed that explicit estimators of knowledge are markedly sensitive to prior beliefs with scores as sole input. To overcome this limitation, we examined self-ranked confidence as a proxy knowledge indicator. For our test set, three levels of confidence resolved test performance. Responses rated as least confident were correct more frequently than expected from random selection, reflecting partial knowledge, but were balanced by blunder among the most confident responses. By translating evidence-based guessing and blunder rates to pass marks that statistically qualify a desired level of examinee knowledge, our approach finds practical utility in test analysis and design.</p>\",\"PeriodicalId\":51609,\"journal\":{\"name\":\"Applied Measurement in Education\",\"volume\":\"36 1\",\"pages\":\"80-98\"},\"PeriodicalIF\":1.1000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10201919/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Measurement in Education\",\"FirstCategoryId\":\"95\",\"ListUrlMain\":\"https://doi.org/10.1080/08957347.2023.2172017\",\"RegionNum\":4,\"RegionCategory\":\"教育学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/2/21 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q3\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Measurement in Education","FirstCategoryId":"95","ListUrlMain":"https://doi.org/10.1080/08957347.2023.2172017","RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/2/21 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Dissecting knowledge, guessing, and blunder in multiple choice assessments.
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly account for guessing, knowledge and blunder using eight assessments (>9,000 responses) from an undergraduate biotechnology curriculum. A Bayesian implementation of the models, aimed at assessing their robustness to prior beliefs in examinee knowledge, showed that explicit estimators of knowledge are markedly sensitive to prior beliefs with scores as sole input. To overcome this limitation, we examined self-ranked confidence as a proxy knowledge indicator. For our test set, three levels of confidence resolved test performance. Responses rated as least confident were correct more frequently than expected from random selection, reflecting partial knowledge, but were balanced by blunder among the most confident responses. By translating evidence-based guessing and blunder rates to pass marks that statistically qualify a desired level of examinee knowledge, our approach finds practical utility in test analysis and design.
期刊介绍:
Because interaction between the domains of research and application is critical to the evaluation and improvement of new educational measurement practices, Applied Measurement in Education" prime objective is to improve communication between academicians and practitioners. To help bridge the gap between theory and practice, articles in this journal describe original research studies, innovative strategies for solving educational measurement problems, and integrative reviews of current approaches to contemporary measurement issues. Peer Review Policy: All review papers in this journal have undergone editorial screening and peer review.