The Examiner: Automatic Generation of "Good" Exams
F. Torres-Rojas
2018 XLIV Latin American Computer Conference (CLEI), 2018-10-01
DOI: 10.1109/CLEI.2018.00096 (https://doi.org/10.1109/CLEI.2018.00096)
Abstract
As educators, we must design, prepare, proctor and grade hundreds of exams during our careers. From this overwhelming task, we collect little or no objective evidence about the quality of the exams themselves. Thus, at best we develop an intuitive sense of what characterizes a good or a bad exam, and it is very likely that we blindly repeat in our exams the rights and wrongs of the past. Metrics exist for the quality of an exam as a whole, and even for the quality of each individual item in it. Using actual college courses, our research found experimental evidence that it is possible to predict with great accuracy, starting from historical statistical data, the quality metrics that an exam will show even before it is administered to a standard group of college students. With this result, we built an automatic system that generates "good" exams from an item bank enriched with statistical information from previous exams. In addition, we developed powerful tools for the analysis and controlled adjustment of each exam and each item.