H. Brits, G. Joubert, J. Bezuidenhout, L. J. van der Merwe
Evaluation of assessment marks in the clinical years of an undergraduate medical training programme: Where are we and how can we improve?
African Journal of Health Professions Education. Published 2021-12-31. DOI: 10.7196/ajhpe.2021.v13i4.1379 (https://doi.org/10.7196/ajhpe.2021.v13i4.1379)
Abstract
Background. In high-stakes assessments, the accuracy and consistency of the decision to pass or fail a student are as important as the reliability of the assessment.

Objective. To evaluate the reliability of results of high-stakes assessments in the clinical phase of the undergraduate medical programme at the University of the Free State, as a step towards making recommendations for improving the quality of assessment.

Methods. A cohort analytical study design was used. The final end-of-block marks and the end-of-year assessment marks of both fourth-year and final-year medical students over 3 years were compared for decision reliability, test-retest reliability, stability and reproducibility.

Results. A total of 1 380 marks in 26 assessments were evaluated. The G-index of agreement for decision reliability ranged from 0.86 to 0.98. In 88.9% of assessments, the test-retest correlation coefficient was <0.7. Mean marks for end-of-block and end-of-year assessments were similar. However, the standard deviations of the differences between end-of-block and end-of-year assessment marks were high. Multiple-choice questions (MCQs) and objective structured clinical examinations (OSCEs) yielded good reliability results.

Conclusion. The reliability of pass/fail outcome decisions was good. However, test-retest reliability was low, and individual student marks were neither stable nor reproducible between assessments. MCQs and OSCEs are practical examples of formats in which the number of assessments can be increased to improve reliability. To increase the number of assessments and to reduce the stress of high-stakes assessments, more workplace-based assessment with observed clinical cases is recommended.
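The three comparisons named in the Methods (decision reliability, test-retest correlation, and the spread of mark differences) can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: that the G-index is the Holley-Guilford form, G = 2·Pₒ − 1, where Pₒ is the proportion of students with the same pass/fail outcome in both assessments; that the test-retest coefficient is a Pearson correlation; and that the pass mark is 50%. The function names and marks are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev

PASS_MARK = 50.0  # assumed pass mark; not stated in the abstract


def g_index(block, year, pass_mark=PASS_MARK):
    """Holley-Guilford G-index (assumed form): 2 * proportion of agreement - 1."""
    agree = sum((b >= pass_mark) == (y >= pass_mark) for b, y in zip(block, year))
    return 2 * agree / len(block) - 1


def pearson_r(x, y):
    """Pearson correlation between paired marks (assumed test-retest measure)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def difference_summary(block, year):
    """Mean and sample standard deviation of (end-of-year - end-of-block) marks."""
    diffs = [y - b for b, y in zip(block, year)]
    return mean(diffs), stdev(diffs)


if __name__ == "__main__":
    # Hypothetical paired marks for six students, for illustration only.
    end_of_block = [62, 55, 48, 71, 66, 58]
    end_of_year = [58, 61, 52, 65, 70, 49]
    print("G-index:", g_index(end_of_block, end_of_year))
    print("Test-retest r:", pearson_r(end_of_block, end_of_year))
    print("Mean and SD of differences:", difference_summary(end_of_block, end_of_year))
```

A pattern like the one reported here is possible because the three measures capture different things: similar means and a high G-index can coexist with a low correlation and a large standard deviation of differences when most students pass both assessments but their individual marks shift substantially.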