{"title":"Symbolic Semantic Memory in Transformer Language Models","authors":"Robert Morain, Kenneth Vargas, Dan Ventura","doi":"10.1109/ICMLA55696.2022.00166","DOIUrl":null,"url":null,"abstract":"This paper demonstrates how transformer language models can be improved by giving them access to relevant structured data extracted from a knowledge base. The methods for doing so include identifying entities in a text corpus, sorting the entities using a novel attention-based approach, linking entities to a knowledge base, then extracting and filtering the knowledge to create a knowledge-augmented dataset. We evaluate these methods with the WikiText-103 corpus using standard language modeling objectives. These results show that even simple additional knowledge augmentation leads to a reduction in validation perplexity by 81.04%. These methods also significantly outperform common ways of improving language models such as increasing the model size or adding more data.","PeriodicalId":128160,"journal":{"name":"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA55696.2022.00166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This paper demonstrates how transformer language models can be improved by giving them access to relevant structured data extracted from a knowledge base. The method identifies entities in a text corpus, sorts them using a novel attention-based approach, links them to a knowledge base, and then extracts and filters the linked knowledge to create a knowledge-augmented dataset. We evaluate these methods on the WikiText-103 corpus using standard language modeling objectives. The results show that even simple knowledge augmentation yields an 81.04% reduction in validation perplexity. These methods also significantly outperform common ways of improving language models, such as increasing model size or adding more training data.
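To make the described pipeline concrete, the following is a minimal Python sketch of the kind of knowledge-augmentation flow the abstract outlines: identify entities, rank them by attention, link them to a knowledge base, and prepend the retrieved facts to the training text. Every name here (extract_entities, rank_entities, lookup_knowledge, augment) is a hypothetical placeholder standing in for the paper's components, not the authors' actual code or API.

```python
# Minimal sketch of a knowledge-augmentation pipeline of the kind the
# abstract describes. All helpers are hypothetical placeholders, not
# the authors' implementation.

from typing import Dict, List


def extract_entities(text: str) -> List[str]:
    """Placeholder: identify entity mentions in the text.
    (In practice a NER model would be used here.)"""
    return [tok for tok in text.split() if tok.istitle()]


def rank_entities(entities: List[str],
                  attention: Dict[str, float]) -> List[str]:
    """Placeholder for the paper's attention-based sorting: order
    entities by an attention score assigned by the language model."""
    return sorted(entities, key=lambda e: attention.get(e, 0.0), reverse=True)


def lookup_knowledge(entity: str, kb: Dict[str, List[str]]) -> List[str]:
    """Placeholder: link an entity to a knowledge base and return its
    associated facts (e.g., triples rendered as text)."""
    return kb.get(entity, [])


def augment(text: str, kb: Dict[str, List[str]],
            attention: Dict[str, float], max_facts: int = 3) -> str:
    """Prepend the top-ranked entities' facts to the text, producing
    one example of a knowledge-augmented dataset."""
    entities = rank_entities(extract_entities(text), attention)
    facts: List[str] = []
    for entity in entities:
        for fact in lookup_knowledge(entity, kb):
            if len(facts) < max_facts:
                facts.append(fact)
    return " ".join(facts + [text])


# Toy usage with a hand-built knowledge base and attention scores.
kb = {"Paris": ["Paris is the capital of France."]}
attention = {"Paris": 0.9}
print(augment("Paris hosted the summit.", kb, attention))
```

The design point the sketch illustrates is that augmentation happens at the dataset level: filtered knowledge is serialized into the training text itself, so the language model can be trained with standard objectives and no architectural changes.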