How do professors format exams?: an analysis of question variety at scale
Paul Laskowski, Sergey Karayev, Marti A. Hearst
Proceedings of the Fifth Annual ACM Conference on Learning at Scale, 2018-06-26
DOI: https://doi.org/10.1145/3231644.3231667
This study analyzes the use of paper exams in college-level STEM courses. It draws on a unique dataset of nearly 1,800 exams, which were scanned into a web application and then processed by a team of annotators, yielding a detailed snapshot of how instructors currently structure exams. The investigation focuses on the variety of question formats and how they are applied across different course topics. The analysis divides questions into seven top-level categories and finds significant differences among them in positioning, use across subjects, and student performance. It also reveals a strong tendency within the collection for instructors to order questions from easier to harder. A linear mixed effects model is used to estimate the reliability of different question types. Long writing questions stand out for their high reliability, while binary and multiple-choice questions have low reliability. The model suggests that more than three multiple-choice questions, or more than five binary questions, are required to attain the same reliability as a single long writing question. A correlation analysis across seven response types finds that correlations between student abilities on different question types exceed 70 percent for all pairs, although binary and multiple-choice questions stand out for having unusually low correlations with all other question types.
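
The claim that several low-reliability questions are needed to match one long writing question can be illustrated with a small calculation. The sketch below is not the paper's linear mixed effects model. It swaps in the classical Spearman-Brown formula, which assumes parallel items, and the two reliability values are hypothetical numbers chosen only to make the arithmetic easy to follow. The function names `spearman_brown` and `items_needed` are likewise illustrative.

```python
import math


def spearman_brown(single_item_reliability: float, k: float) -> float:
    """Reliability of the average of k parallel items (Spearman-Brown formula)."""
    r = single_item_reliability
    return k * r / (1 + (k - 1) * r)


def items_needed(low_reliability: float, target_reliability: float) -> float:
    """Number of parallel low-reliability items whose average reaches the target.

    Obtained by solving the Spearman-Brown formula for k, which reduces to a
    ratio of reliability "odds" r / (1 - r).
    """
    for r in (low_reliability, target_reliability):
        if not 0.0 < r < 1.0:
            raise ValueError("reliabilities must lie strictly between 0 and 1")
    low_odds = low_reliability / (1 - low_reliability)
    target_odds = target_reliability / (1 - target_reliability)
    return target_odds / low_odds


# Hypothetical reliabilities for illustration only; these are NOT the paper's estimates.
HYPOTHETICAL_MULTIPLE_CHOICE = 0.2
HYPOTHETICAL_LONG_WRITING = 0.5

k = items_needed(HYPOTHETICAL_MULTIPLE_CHOICE, HYPOTHETICAL_LONG_WRITING)
print(f"multiple-choice items needed: {k:.2f}")
print(f"reliability of their average: {spearman_brown(HYPOTHETICAL_MULTIPLE_CHOICE, k):.2f}")
```

Under these assumptions the required count depends only on the ratio of the two reliability odds, so small differences in per-question reliability can translate into several additional questions.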