The Accuracy of AI-Based Automatic Proctoring in Online Exams
Authors: Adiy Tweissi, Wael Al Etaiwi, Dalia Al Eisawi
Journal: Electronic Journal of e-Learning (JCR Q1, Education & Educational Research; impact factor 2.4)
Publication type: Journal Article
Published: 2022-10-11
DOI: 10.34190/ejel.20.4.2600 (https://doi.org/10.34190/ejel.20.4.2600)
Citations: 1
Abstract
This study provides a technical analysis of one online exam supervision technology, Artificial Intelligence-based Auto Proctoring (AiAP), which has been heavily promoted to academic sectors around the globe. Proctoring technologies are developed to provide oversight and analysis of students’ behavior in online exams using AI, sometimes in a blended format with supervision by human proctors, in order to maintain academic integrity. A manual testing methodology was used to test the AiAP software and verify any possibly incorrect red flags or detections. The study took place at a Middle Eastern university, where online exams were conducted for 14 different courses with a total of 244 students. Afterward, five human proctors were assigned to verify the data obtained by the AiAP software. The results were then compared across five monitoring measurements: screen violation, sound of speech, different faces, multiple faces, and eye movement detection. The proctoring decision was computed by averaging all monitoring measurements, and the human proctors’ decisions were then compared with the AiAP decisions, setting the AiAP against a benchmark (human proctoring) to judge whether it is viable for use. The decision represented the number of violations of the exam conditions. The results showed a significant difference between the Human Decision (average 25.95%) and the AiAP Decision (average 35.61%), and AiAP made a total of 74 incorrect decisions out of 244 exam attempts, leading to the conclusion that AiAP needs improvements and updates to reach human-level performance. The researchers present technical limitations, privacy concerns, and recommendations that should be carefully reviewed before deploying and governing such proctoring technologies at the institutional level.
This paper contributes to the field of educational technology by providing an evidence-based accuracy test of automatic proctoring software, and the results call for institutional provisions to better establish an appropriate online exam experience in higher education institutions.
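The arithmetic behind the reported figures can be illustrated with a minimal sketch. The abstract does not state how each monitoring measurement is scaled or whether the average is weighted, so the code below assumes an unweighted mean of five per-measurement violation percentages and one decision per exam attempt. The function names, dictionary keys, and example scores are hypothetical and do not come from the AiAP software.

```python
from statistics import mean

# The five monitoring measurements named in the abstract.
# The key names are hypothetical labels chosen for this sketch.
MEASUREMENTS = (
    "screen_violation",
    "sound_of_speech",
    "different_faces",
    "multiple_faces",
    "eye_movement",
)


def proctoring_decision(scores: dict[str, float]) -> float:
    """Unweighted mean of the five measurements (assumed to be percentages)."""
    return mean(scores[name] for name in MEASUREMENTS)


def error_rate(incorrect: int, total: int) -> float:
    """Share of incorrect decisions, as a percentage of all exam attempts."""
    return 100.0 * incorrect / total


# Hypothetical scores for a single exam attempt (not data from the study).
example_attempt = {
    "screen_violation": 20.0,
    "sound_of_speech": 40.0,
    "different_faces": 0.0,
    "multiple_faces": 10.0,
    "eye_movement": 30.0,
}
example_decision = proctoring_decision(example_attempt)  # 20.0

# Figures reported in the abstract.
aiap_error = round(error_rate(74, 244), 2)  # 30.33 (% of attempts)
decision_gap = round(35.61 - 25.95, 2)      # 9.66 percentage points
```

Under these assumptions, 74 incorrect decisions out of 244 attempts corresponds to roughly 30.33% of attempts, and the average AiAP Decision exceeds the average Human Decision by 9.66 percentage points.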