Paul Zhang, Brandon Jaipersaud, Jimmy Ba, Andrew Petersen, Lisa Zhang, Michael Ruogu Zhang
{"title":"Classifying Course Discussion Board Questions using LLMs","authors":"Paul Zhang, Brandon Jaipersaud, Jimmy Ba, Andrew Petersen, Lisa Zhang, Michael Ruogu Zhang","doi":"10.1145/3587103.3594202","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) can be used to answer student questions on course discussion boards, but there is a risk of LLMs answering questions they are unable to address. We propose and evaluate an LLM-based system that classifies student questions into one of four types: conceptual, homework, logistics, and not answerable. We then prompt an LLM using a type-specific prompt. Using GPT-3, we achieve 81% classification accuracy across the four categories. Furthermore, we achieve 93% accuracy on classifying not answerable questions. This indicates that our system effectively ignores questions that it cannot address.","PeriodicalId":366365,"journal":{"name":"Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 2","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 2","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3587103.3594202","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Large language models (LLMs) can be used to answer student questions on course discussion boards, but there is a risk of LLMs answering questions they are unable to address. We propose and evaluate an LLM-based system that classifies student questions into one of four types: conceptual, homework, logistics, and not answerable. We then prompt an LLM using a type-specific prompt. Using GPT-3, we achieve 81% classification accuracy across the four categories. Furthermore, we achieve 93% accuracy on classifying not answerable questions. This indicates that our system effectively ignores questions that it cannot address.