An accurate transformer-based model for transition-based dependency parsing of free word order languages
Authors: Fatima Tuz Zuhra, Khalid Saleem, Surayya Naz
Journal: Journal of King Saud University-Computer and Information Sciences, 36(6), Article 102107
Publication date: 2024-07-01
DOI: 10.1016/j.jksuci.2024.102107
URL: https://www.sciencedirect.com/science/article/pii/S1319157824001964
Impact factor: 5.2 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
Transformer models are the state of the art in Natural Language Processing (NLP) and the core of Large Language Models (LLMs). We propose a transformer-based model for transition-based dependency parsing of free word order languages. We performed experiments on five treebanks from the Universal Dependencies (UD) dataset, version 2.12. Our experiments show that a transformer model trained with dynamic word embeddings performs better than a multilayer perceptron (MLP) trained on state-of-the-art static word embeddings, even when the dynamic word embeddings have a vocabulary ten times smaller than that of the static word embeddings. The transformer trained on dynamic word embeddings achieves an unlabeled attachment score (UAS) of 84.17% for Urdu, which is ≈3.6% and ≈1.9% higher, respectively, than the UAS scores of 80.56857% and 82.26859% achieved by the MLP using two state-of-the-art static word embeddings. The proposed approach is also evaluated on Arabic, Persian and Uyghur, in addition to Urdu, and the UAS results suggest that it outperforms the MLP-based approaches.
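For readers unfamiliar with the metric, UAS is the percentage of tokens whose predicted syntactic head matches the gold-standard head, ignoring the dependency label. The following is a minimal sketch of that computation, not the authors' evaluation code: the function name `unlabeled_attachment_score` and the toy head lists are assumptions made for illustration, and heads are assumed to be given as integer token indices with 0 for the root. The last lines simply check the gaps reported in the abstract against the three Urdu scores it quotes.

```python
def unlabeled_attachment_score(gold_heads, pred_heads):
    """Return the percentage of tokens whose predicted head equals the gold head.

    Hypothetical helper for illustration; both arguments are sequences of
    integer head indices (0 = root), one entry per token.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return 100.0 * correct / len(gold_heads)


# Toy four-token sentence (made-up heads): three of four heads are correct.
gold = [2, 0, 2, 3]
pred = [2, 0, 2, 2]
toy_uas = unlabeled_attachment_score(gold, pred)

# Urdu scores quoted in the abstract: transformer vs. the two MLP baselines.
transformer_uas = 84.17
mlp_uas = (80.56857, 82.26859)
gaps = [round(transformer_uas - score, 1) for score in mlp_uas]

print(toy_uas)  # 75.0
print(gaps)     # [3.6, 1.9]
```

The gaps are plain differences between the scores, so the "≈3.6%" and "≈1.9%" in the abstract are best read as percentage points rather than relative improvements.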
About the journal:
In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author-paid open access journal. Authors who submit their manuscript after October 31st 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University - Computer and Information Sciences is a refereed, international journal that covers all aspects of both the foundations of computing and its practical applications.