Harmony search for hyperparameters optimization of a low resource language transformer model trained with a novel parallel corpus Ocelotl Nahuatl – Spanish
Máximo Enrique Pacheco Martínez, Maya Carrillo Ruiz, María de Lourdes Sandoval Solís
Systems and Soft Computing, volume 6, Article 200152. Published 2024-09-27. DOI: 10.1016/j.sasc.2024.200152
https://www.sciencedirect.com/science/article/pii/S2772941924000814
Abstract
Nahuatl is a low-resource language with no online translation application; available resources are limited to dictionaries, web pages, and digital books. Given this condition, it is vital to provide as much support to the language as possible. This research aims to improve the BLEU score in machine translation by applying the harmony search heuristic to state-of-the-art transformer models, using it to find the optimal hyperparameter settings for those models. The models are trained and tested on a new, moderate-size parallel corpus of 1.5k phrases. With harmony search, the study reports a BLEU score improvement of 2.569%. Achieving this requires considering various factors related to the hyperparameters. Taking these considerations into account, harmony search with transformers can be extended to other parallel corpora or models.
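
For readers unfamiliar with harmony search, the following is a minimal generic sketch of the heuristic applied to a hyperparameter search. It is not the authors' implementation. The hyperparameter names (`learning_rate`, `dropout`), their ranges, the control settings (`hms`, `hmcr`, `par`, `bw`), and the `toy_score` objective are all illustrative assumptions; in the setting the abstract describes, the objective would instead train a transformer and return its BLEU score.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iterations=200, seed=0):
    """Maximize `objective` over continuous hyperparameters.

    bounds: dict mapping a hyperparameter name to a (low, high) range.
    hms:    harmony memory size (number of stored candidate settings).
    hmcr:   probability of reusing a value already in memory.
    par:    probability of slightly adjusting a reused value.
    bw:     adjustment width, as a fraction of each range.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    names = list(bounds)

    def random_value(name):
        low, high = bounds[name]
        return rng.uniform(low, high)

    # Fill the harmony memory with random settings and score each one.
    memory = [{n: random_value(n) for n in names} for _ in range(hms)]
    scores = [objective(h) for h in memory]

    for _ in range(iterations):
        candidate = {}
        for n in names:
            low, high = bounds[n]
            if rng.random() < hmcr:
                # Memory consideration: copy this value from a stored harmony.
                value = rng.choice(memory)[n]
                if rng.random() < par:
                    # Pitch adjustment: nudge the value, staying in range.
                    value += rng.uniform(-1.0, 1.0) * bw * (high - low)
                    value = min(high, max(low, value))
            else:
                # Random selection: draw a fresh value from the range.
                value = random_value(n)
            candidate[n] = value

        score = objective(candidate)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:
            # Replace the worst stored harmony when the candidate beats it.
            memory[worst], scores[worst] = candidate, score

    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def toy_score(h):
    # Hypothetical stand-in for "train the model and return BLEU".
    return -1e6 * (h["learning_rate"] - 0.001) ** 2 - (h["dropout"] - 0.1) ** 2


bounds = {"learning_rate": (1e-5, 1e-2), "dropout": (0.0, 0.5)}
best, score = harmony_search(toy_score, bounds)
print(best, score)
```

Because each objective evaluation would correspond to a full training run, the memory size and iteration count directly set the compute budget, which may be one of the practical considerations when tuning on a small corpus.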