An Unsupervised Learning Method to improve Legal Document Retrieval task at ALQAC 2022
Authors: D. Nguyen, Hieu Nguyen, Tung Le, Le-Minh Nguyen
Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE), 2022-10-19
DOI: 10.1109/KSE56063.2022.9953618
Citations: 2
Abstract
Domain-specific document retrieval, particularly for legal documents, is an important and challenging research problem in NLP. The main challenge in the legal domain is its heavy reliance on specialized expert knowledge, which makes data collection and evaluation complex and time-consuming. In this study, we propose a training data augmentation procedure and an unsupervised embedding learning method, and apply them to the Legal Document Retrieval task at the Automated Legal Question Answering Competition 2022 (ALQAC 2022). On this task, our method outperformed current standard models and achieved competitive results at ALQAC 2022.