V. Ramnarain-Seetohul, V. Bassoo, Yasmine Rosunally
Work-in-Progress: Computing Sentence Similarity for Short Texts using Transformer models
2022 IEEE Global Engineering Education Conference (EDUCON), published 2022-03-28
DOI: https://doi.org/10.1109/EDUCON52537.2022.9766649
Abstract
Transformers are revolutionizing the field of natural language processing. They are based on a novel type of neural network framework that is already pre-trained, so large datasets are no longer required to train models. This makes the framework suitable for automated assessment systems (AAS), which would otherwise need a large amount of labeled data, since the larger the dataset, the higher the accuracy of the AAS. In this work-in-progress paper, a prototype AAS has been built using two transformer models: Sentence-Transformers from Hugging Face and the OpenAI GPT-3 models. The transformer models generate a similarity index between students' answers and the reference answers from the Texas dataset. The similarity index is then used to compute the students' marks. The performance of the prototype is evaluated using the quadratic weighted kappa metric.
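The pipeline described in the abstract (similarity index, then marks, then quadratic weighted kappa) can be illustrated with a minimal sketch. The abstract does not say how the similarity index is computed or how it is converted to marks, so everything below is an assumption made for illustration: cosine similarity between embedding vectors, a linear scaling to an integer mark out of a hypothetical `max_mark` of 5, and the helper names `cosine_similarity`, `similarity_to_mark`, and `quadratic_weighted_kappa`. The embeddings are taken as plain lists of floats; in practice they would come from a transformer model, which is not called here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors (assumed similarity index)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no direction to compare, treat as dissimilar
    return dot / (norm_a * norm_b)


def similarity_to_mark(similarity: float, max_mark: int = 5) -> int:
    """Hypothetical linear mapping from a similarity in [0, 1] to an integer mark."""
    clipped = min(1.0, max(0.0, similarity))  # negative similarity scores zero
    return round(clipped * max_mark)


def quadratic_weighted_kappa(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    min_rating: int,
    max_rating: int,
) -> float:
    """Quadratic weighted kappa between two lists of integer marks."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of marks")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("need at least two distinct rating levels")

    # Observed confusion matrix: rows are rater_a marks, columns rater_b marks.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = ((i - j) ** 2) / ((n - 1) ** 2)  # quadratic penalty
            expected = hist_a[i] * hist_b[j] / total  # agreement by chance
            numerator += weight * observed[i][j]
            denominator += weight * expected

    # A zero denominator means both raters gave one identical mark throughout.
    return 1.0 if denominator == 0.0 else 1.0 - numerator / denominator


if __name__ == "__main__":
    # Toy vectors standing in for embeddings of a reference and a student answer.
    reference = [0.2, 0.7, 0.1]
    student = [0.25, 0.6, 0.2]
    mark = similarity_to_mark(cosine_similarity(reference, student))
    print("predicted mark:", mark)

    # Made-up marks, only to show the call; not data from the paper.
    human_marks = [5, 4, 2, 0, 3]
    model_marks = [5, 3, 2, 1, 3]
    print("QWK:", quadratic_weighted_kappa(human_marks, model_marks, 0, 5))
```

A kappa of 1 indicates perfect agreement between the predicted and human marks, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. The quadratic weights penalize a prediction more heavily the further it lies from the human mark, which suits ordinal grades.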