The VNNLI - VLSP 2021: Leveraging Contextual Word Embedding for NLI Task on Bilingual Dataset
Quoc-Loc Duong
VNU Journal of Science: Computer Science and Communication Engineering, volume 18 1, published 2022-12-16
DOI: 10.25073/2588-1086/vnucsce.317 (https://doi.org/10.25073/2588-1086/vnucsce.317)
Abstract
Natural Language Inference (NLI) is one of the critical tasks in natural language understanding, and we address it through the VLSP2021-NLI Shared Task competition. The VLSP2021-NLI Shared Task aims to improve existing methods for NLI, thereby enhancing the efficiency of applications that depend on it. One challenge of the competition is that the dataset contains both Vietnamese and English. In this article, we report on our evaluation of the competition's NLI task. We first implement 5-fold cross-validation as the evaluation method. We then leverage model architectures pre-trained on cross-lingual datasets, such as XLM-RoBERTa and RemBERT, to create contextual word embeddings for classification. Our final result reaches 90.00% on the organizers' test dataset.
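
The 5-fold cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether the folds were shuffled or stratified by label, so the shuffling, the `seed` parameter, and the function name `k_fold_indices` are assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Hypothetical helper: plain shuffled k-fold, with no label stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible (assumption).
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


if __name__ == "__main__":
    # Each sample serves as validation data in exactly one of the five folds.
    for i, (train_idx, val_idx) in enumerate(k_fold_indices(n_samples=12, k=5)):
        print(f"fold {i}: {len(train_idx)} train / {len(val_idx)} validation")
```

In a setup like the one described, each fold would train a classifier on `train_idx` and score it on `val_idx`, and the five scores would then be averaged. How the per-fold models were combined for the final test submission is not stated in the abstract.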