V. V. Zhebel, D. A. Devyatkin, D. V. Zubarev, I. V. Sochenkov
Approaches to Cross-Language Retrieval of Similar Legal Documents Based on Machine Learning
DOI: 10.3103/s0147688223050167 (https://doi.org/10.3103/s0147688223050167)
Scientific and Technical Information Processing, published 2024-03-05
Abstract—
Studying global experience in changing legislation and in rule-making increasingly requires tools for retrieving regulatory documents written in different languages. One aspect of this task is retrieving documents that are thematically similar to a given input document. In this context, cross-lingual search becomes important: the user of an information system specifies a reference document in one language, and the search results contain relevant documents in other languages. The article describes different approaches to this problem, from classic mediator-based methods to more modern solutions based on distributional semantics. The test collection used in the study was taken from the United Nations Digital Library, which provides legal documents in both the original English and their Russian translations.
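As a rough illustration of the retrieval setting described in the abstract, the sketch below ranks candidate documents in one language against a reference document in another by cosine similarity. It assumes that every document has already been mapped into a shared cross-lingual vector space; the three-dimensional vectors, the document identifiers, and the function names `cosine` and `rank_similar` are hypothetical and are not taken from the article, which may use different models and similarity measures.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query_vec, candidates, top_k=3):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (assumption): hand-made vectors standing in for the output of
# some cross-lingual embedding model shared by English and Russian texts.
query_en = [0.9, 0.1, 0.0]  # reference document in English
candidates_ru = {
    "ru_doc_a": [0.8, 0.2, 0.1],
    "ru_doc_b": [0.0, 0.1, 0.9],
    "ru_doc_c": [0.5, 0.5, 0.0],
}

ranking = rank_similar(query_en, candidates_ru, top_k=2)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

A mediator-based variant would instead translate the reference document (or its key terms) into the target language and then apply a monolingual similarity search; the ranking step would stay the same.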
About the journal:
Scientific and Technical Information Processing is a refereed journal that covers all aspects of management and use of information technology in libraries and archives, information centres, and the information industry in general. Emphasis is on practical applications of new technologies and techniques for information analysis and processing.