{"title":"面向知识跟踪的查询、键和值计算","authors":"Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Byungsoo Kim, Yeongmin Cha, Dongmin Shin, Chan Bae, Jaewe Heo","doi":"10.1145/3386527.3405945","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a novel Transformer-based model for knowledge tracing, SAINT: Separated Self-AttentIve Neural Knowledge Tracing. SAINT has an encoder-decoder structure where the exercise and response embedding sequences separately enter, respectively, the encoder and the decoder. The encoder applies self-attention layers to the sequence of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to the sequence of response embeddings. This separation of input allows us to stack attention layers multiple times, resulting in an improvement in area under receiver operating characteristic curve (AUC). To the best of our knowledge, this is the first work to suggest an encoder-decoder model for knowledge tracing that applies deep self-attentive layers to exercises and responses separately. We empirically evaluate SAINT on a large-scale knowledge tracing dataset, EdNet, collected by an active mobile education application, Santa, which has 627,347 users, 72,907,005 response data points as well as a set of 16,175 exercises gathered since 2016. The results show that SAINT achieves state-of-the-art performance in knowledge tracing with an improvement of 1.8% in AUC compared to the current state-of-the-art model.","PeriodicalId":20608,"journal":{"name":"Proceedings of the Seventh ACM Conference on Learning @ Scale","volume":"25 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"96","resultStr":"{\"title\":\"Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing\",\"authors\":\"Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Byungsoo Kim, Yeongmin Cha, Dongmin Shin, Chan Bae, Jaewe Heo\",\"doi\":\"10.1145/3386527.3405945\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a novel Transformer-based model for knowledge tracing, SAINT: Separated Self-AttentIve Neural Knowledge Tracing. SAINT has an encoder-decoder structure where the exercise and response embedding sequences separately enter, respectively, the encoder and the decoder. The encoder applies self-attention layers to the sequence of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to the sequence of response embeddings. This separation of input allows us to stack attention layers multiple times, resulting in an improvement in area under receiver operating characteristic curve (AUC). To the best of our knowledge, this is the first work to suggest an encoder-decoder model for knowledge tracing that applies deep self-attentive layers to exercises and responses separately. We empirically evaluate SAINT on a large-scale knowledge tracing dataset, EdNet, collected by an active mobile education application, Santa, which has 627,347 users, 72,907,005 response data points as well as a set of 16,175 exercises gathered since 2016. 
The results show that SAINT achieves state-of-the-art performance in knowledge tracing with an improvement of 1.8% in AUC compared to the current state-of-the-art model.\",\"PeriodicalId\":20608,\"journal\":{\"name\":\"Proceedings of the Seventh ACM Conference on Learning @ Scale\",\"volume\":\"25 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-02-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"96\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Seventh ACM Conference on Learning @ Scale\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3386527.3405945\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Seventh ACM Conference on Learning @ Scale","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3386527.3405945","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing
In this paper, we propose a novel Transformer-based model for knowledge tracing, SAINT: Separated Self-AttentIve Neural Knowledge Tracing. SAINT has an encoder-decoder structure in which the exercise embedding sequence enters the encoder and the response embedding sequence enters the decoder. The encoder applies self-attention layers to the sequence of exercise embeddings, while the decoder alternately applies self-attention layers and encoder-decoder attention layers to the sequence of response embeddings. Separating the inputs in this way allows us to stack attention layers multiple times, yielding an improvement in area under the receiver operating characteristic curve (AUC). To the best of our knowledge, this is the first work to propose an encoder-decoder model for knowledge tracing that applies deep self-attentive layers to exercises and responses separately. We empirically evaluate SAINT on EdNet, a large-scale knowledge tracing dataset collected by the active mobile education application Santa, which comprises 627,347 users, 72,907,005 response data points, and 16,175 exercises gathered since 2016. The results show that SAINT achieves state-of-the-art performance in knowledge tracing, improving AUC by 1.8% over the current state-of-the-art model.
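The separated encoder-decoder structure described in the abstract maps naturally onto a standard Transformer encoder-decoder, where the decoder layers already alternate self-attention and encoder-decoder attention. Below is a minimal PyTorch sketch of a SAINT-style model; the hyperparameters (embedding size, layer count, head count), the start-token shift of the response sequence, the causal masking on both streams, and the use of nn.Transformer as a stand-in for the paper's attention blocks are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal SAINT-style sketch: exercises feed the encoder, responses feed the
# decoder; all sizes and masking details below are assumptions for illustration.
import torch
import torch.nn as nn


class SAINTSketch(nn.Module):
    def __init__(self, num_exercises, d_model=256, n_heads=8, n_layers=4, max_len=100):
        super().__init__()
        # Exercise (question) embeddings enter the encoder;
        # response (correct / incorrect) embeddings enter the decoder.
        self.exercise_emb = nn.Embedding(num_exercises, d_model)
        self.response_emb = nn.Embedding(3, d_model)  # ids 0/1 for responses, 2 as start token
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=n_heads,
            num_encoder_layers=n_layers,  # stacked self-attention over exercises
            num_decoder_layers=n_layers,  # alternating self- and encoder-decoder attention
            batch_first=True,
        )
        self.out = nn.Linear(d_model, 1)  # probability of a correct response

    def forward(self, exercises, responses):
        # exercises, responses: (batch, seq_len) integer ids; responses in {0, 1}
        batch, seq_len = exercises.shape
        pos = torch.arange(seq_len, device=exercises.device).unsqueeze(0)
        # Shift responses right behind a start token so the prediction at step t
        # only sees responses up to t-1 (an assumption about the training setup).
        start = torch.full((batch, 1), 2, dtype=torch.long, device=responses.device)
        shifted = torch.cat([start, responses[:, :-1]], dim=1)
        src = self.exercise_emb(exercises) + self.pos_emb(pos)
        tgt = self.response_emb(shifted) + self.pos_emb(pos)
        # Upper-triangular masks keep every position from attending to the future.
        mask = self.transformer.generate_square_subsequent_mask(seq_len).to(exercises.device)
        h = self.transformer(src, tgt, src_mask=mask, tgt_mask=mask, memory_mask=mask)
        return torch.sigmoid(self.out(h)).squeeze(-1)


# Toy usage: 16,175 exercises as in EdNet, two sequences of length 20.
model = SAINTSketch(num_exercises=16175)
ex = torch.randint(0, 16175, (2, 20))
re = torch.randint(0, 2, (2, 20))
print(model(ex, re).shape)  # torch.Size([2, 20])
```

The output at each position is the model's estimated probability that the student answers that exercise correctly, which is how such a sketch would be scored against observed responses (e.g., with binary cross-entropy) to produce the AUC figures reported in the abstract.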