{"title":"Machine Learning–Based Profiling in Test Cheating Detection","authors":"Huijuan Meng, Ye Ma","doi":"10.1111/emip.12541","DOIUrl":null,"url":null,"abstract":"<p>In recent years, machine learning (ML) techniques have received more attention in detecting aberrant test-taking behaviors due to advantages when compared to traditional data forensics methods. However, defining “True Test Cheaters” is challenging—different than other fraud detection tasks such as flagging forged bank checks or credit card frauds, testing organizations are often lack of physical evidences to identify “True Test Cheaters” to train ML models. This study proposed a statistically defensible method of labeling “True Test Cheaters” in the data, demonstrated the effectiveness of using ML approaches to identify irregular statistical patterns in exam data, and established an analytical framework for evaluating and conducting real-time ML-based test data forensics. Classification accuracy and false negative/positive results are evaluated across different supervised-ML techniques. The reliability and feasibility of operationally using this approach for an IT certification exam are evaluated using real data.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":null,"pages":null},"PeriodicalIF":2.7000,"publicationDate":"2023-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Educational Measurement-Issues and Practice","FirstCategoryId":"95","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/emip.12541","RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
引用次数: 0
Abstract
In recent years, machine learning (ML) techniques have received more attention in detecting aberrant test-taking behaviors due to advantages when compared to traditional data forensics methods. However, defining “True Test Cheaters” is challenging—different than other fraud detection tasks such as flagging forged bank checks or credit card frauds, testing organizations are often lack of physical evidences to identify “True Test Cheaters” to train ML models. This study proposed a statistically defensible method of labeling “True Test Cheaters” in the data, demonstrated the effectiveness of using ML approaches to identify irregular statistical patterns in exam data, and established an analytical framework for evaluating and conducting real-time ML-based test data forensics. Classification accuracy and false negative/positive results are evaluated across different supervised-ML techniques. The reliability and feasibility of operationally using this approach for an IT certification exam are evaluated using real data.