Transforming Language Translation: A Deep Learning Approach to Urdu–English Translation
Iqra Safder, Muhammad Abu Bakar, Farooq Zaman, Hajra Waheed, Naif Radi Aljohani, Raheel Nawaz, Saeed Ul Hassan
Journal of Ambient Intelligence and Humanized Computing, published 22 August 2024. DOI: https://doi.org/10.1007/s12652-024-04839-2
Abstract
Machine translation has revolutionized the field of language translation over the last decade. Initially dominated by statistical models, the field has seen deep learning techniques take the lead, with neural networks, particularly Transformer models, now at the forefront. These models have demonstrated exceptional performance on natural language processing tasks, surpassing traditional sequence-to-sequence models such as RNNs, GRUs, and LSTMs. With advantages such as better handling of long-range dependencies and shorter training times, the NLP community has shifted towards Transformers for sequence-to-sequence tasks. In this work, we leverage a sequence-to-sequence Transformer model to translate Urdu, a low-resource language, to English. Our model is based on a Transformer variant with modifications including activation dropout, attention dropout, and final layer normalization. We used four datasets (UMC005, Tanzil, The Wire, and PIB) from two categories (religious and news) to train the model. The results demonstrate that translation performance and quality varied depending on the dataset used for fine-tuning. Our model outperformed the baseline models with scores of 23.9 BLEU, 0.46 chrF, 0.44 METEOR, and 60.75 TER. The improved performance is attributable to meticulous parameter tuning, encompassing modifications to the architecture and optimization techniques. Comprehensive details of the model configurations and optimizations are provided to clarify what distinguishes our approach and how it surpasses prior work. We provide the source code via GitHub for future studies.
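The abstract names three architectural modifications (activation dropout, attention dropout, and final layer normalization) without giving code. Below is a minimal, hypothetical PyTorch sketch of how a pre-norm Transformer encoder with these three modifications might look; the module names, dimensions, and defaults are illustrative assumptions, not the authors' released implementation.

```python
# Sketch only: a pre-norm Transformer encoder illustrating the three
# modifications named in the abstract. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048,
                 dropout=0.1, attn_dropout=0.1, act_dropout=0.1):
        super().__init__()
        # "attention dropout" is applied inside multi-head attention
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               dropout=attn_dropout,
                                               batch_first=True)
        self.ff1 = nn.Linear(d_model, d_ff)
        self.ff2 = nn.Linear(d_ff, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)          # residual dropout
        self.act_dropout = nn.Dropout(act_dropout)  # "activation dropout": after the FFN activation

    def forward(self, x, mask=None):
        h = self.norm1(x)
        h, _ = self.self_attn(h, h, h, key_padding_mask=mask)
        x = x + self.dropout(h)
        h = self.norm2(x)
        h = self.ff2(self.act_dropout(torch.relu(self.ff1(h))))
        return x + self.dropout(h)

class Encoder(nn.Module):
    def __init__(self, n_layers=6, d_model=512, **kw):
        super().__init__()
        self.layers = nn.ModuleList(EncoderLayer(d_model, **kw)
                                    for _ in range(n_layers))
        self.final_norm = nn.LayerNorm(d_model)  # the "final layer normalization"

    def forward(self, x, mask=None):
        for layer in self.layers:
            x = layer(x, mask)
        return self.final_norm(x)

enc = Encoder()
out = enc(torch.randn(2, 10, 512))  # (batch, seq_len, d_model)
```

The pre-norm arrangement (LayerNorm before each sublayer) is what makes the final normalization necessary, since the last residual branch would otherwise leave the output unnormalized.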
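Similarly, a hedged sketch of how the four reported metrics could be computed: sacrebleu provides corpus-level BLEU, chrF, and TER, and METEOR is available in NLTK. The example sentences and variable names are placeholders; the exact tokenization and metric settings the authors used are not specified in the abstract.

```python
# Sketch only: scoring translations with the four metrics the abstract reports.
import sacrebleu
from nltk.translate.meteor_score import meteor_score  # needs NLTK 'wordnet' data

hyps = ["the cat sat on the mat"]         # system translations (placeholder)
refs = ["the cat is sitting on the mat"]  # one reference per hypothesis (placeholder)

bleu = sacrebleu.corpus_bleu(hyps, [refs]).score  # higher is better
chrf = sacrebleu.corpus_chrf(hyps, [refs]).score  # higher is better
ter  = sacrebleu.corpus_ter(hyps, [refs]).score   # lower is better
# NLTK's METEOR expects pre-tokenized input; averaged over the corpus here
meteor = sum(meteor_score([r.split()], h.split())
             for r, h in zip(refs, hyps)) / len(hyps)

# Note: sacrebleu reports chrF and TER on a 0-100 scale; the abstract's
# 0.46 chrF suggests the paper normalizes to a 0-1 scale.
print(f"BLEU {bleu:.1f}  chrF {chrf:.2f}  METEOR {meteor:.2f}  TER {ter:.2f}")
```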
Journal description:
The purpose of JAIHC is to provide a high-profile, leading-edge forum for academics, industrial professionals, educators, and policy makers in the field to contribute to and disseminate the most innovative research and developments in all aspects of ambient intelligence and humanized computing, such as intelligent/smart objects, environments/spaces, and systems. The journal discusses various technical, safety, personal, social, physical, political, artistic, and economic issues. The research topics covered by the journal include (but are not limited to):
Pervasive/Ubiquitous Computing and Applications
Cognitive Wireless Sensor Networks
Embedded Systems and Software
Mobile Computing and Wireless Communications
Next Generation Multimedia Systems
Security, Privacy and Trust
Service and Semantic Computing
Advanced Networking Architectures
Dependable, Reliable and Autonomic Computing
Embedded Smart Agents
Context Awareness, Social Sensing and Inference
Multimodal Interaction Design
Ergonomics and Product Prototyping
Intelligent and Self-Organizing Transportation Networks & Services
Healthcare Systems
Virtual Humans & Virtual Worlds
Wearable Sensors and Actuators