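Harmony search for hyperparameters optimization of a low resource language transformer model trained with a novel parallel corpus Ocelotl Nahuatl – Spanish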

Máximo Enrique Pacheco Martínez, Maya Carrillo Ruiz, María de Lourdes Sandoval Solís
DOI: 10.1016/j.sasc.2024.200152
Journal: Systems and Soft Computing, volume 6, Article 200152
Published: 2024-09-27
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S2772941924000814
Citations: 0

Abstract

Nahuatl, a low-resource language, does not have an online translator application. Instead, resources are limited to dictionaries, web pages, or digital books. Given this condition, it is vital to provide as much support to the language as possible. This research aims to improve the BLEU score in machine translation by applying the harmony search heuristic to state-of-the-art transformer models, using it to find the optimal hyperparameter settings for those models. The models are trained and tested on a new, moderate-size parallel corpus of 1.5k phrases. With harmony search, the study improves the BLEU score by 2.569%. Achieving this requires considering various factors related to the hyperparameters. With these considerations taken into account, harmony search with transformers can be extended to other parallel corpora or models.
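
To make the idea of harmony search over hyperparameters concrete, here is a minimal sketch of the generic algorithm. It is not the authors' implementation. The search space (`learning_rate`, `dropout`, `num_layers`), the control parameters (`hmcr`, `par`, `bandwidth`, `memory_size`), and the `toy_score` objective are all illustrative assumptions. In a real run the objective would train a transformer and return its validation BLEU score.

```python
import random

# Hypothetical search space: names and ranges are illustrative, not from the paper.
SPACE = {
    "learning_rate": (1e-5, 1e-3),  # continuous
    "dropout": (0.0, 0.5),          # continuous
    "num_layers": (2, 8),           # integer-valued
}
INTEGER_PARAMS = {"num_layers"}


def clip(name, value):
    """Clamp a value to its range and round integer-valued hyperparameters."""
    low, high = SPACE[name]
    value = min(max(value, low), high)
    return int(round(value)) if name in INTEGER_PARAMS else value


def random_harmony(rng):
    """Draw one random hyperparameter setting (a 'harmony')."""
    return {name: clip(name, rng.uniform(*bounds)) for name, bounds in SPACE.items()}


def toy_score(h):
    """Stand-in for 'train the model and return BLEU' (higher is better).

    An assumed smooth function with its maximum (0) at
    learning_rate=3e-4, dropout=0.1, num_layers=6.
    """
    return -(((h["learning_rate"] - 3e-4) / 1e-3) ** 2
             + (h["dropout"] - 0.1) ** 2
             + ((h["num_layers"] - 6) / 6) ** 2)


def harmony_search(score, iterations=300, memory_size=10,
                   hmcr=0.9, par=0.3, bandwidth=0.1, seed=0):
    rng = random.Random(seed)
    # Harmony memory: a small population of candidate settings and their scores.
    memory = [random_harmony(rng) for _ in range(memory_size)]
    scores = [score(h) for h in memory]
    for _ in range(iterations):
        new = {}
        for name, (low, high) in SPACE.items():
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = rng.choice(memory)[name]
                if rng.random() < par:
                    # Pitch adjustment: nudge the reused value slightly.
                    value += rng.uniform(-1, 1) * bandwidth * (high - low)
            else:
                # Random selection: sample a fresh value from the range.
                value = rng.uniform(low, high)
            new[name] = clip(name, value)
        new_score = score(new)
        # Replace the worst stored harmony if the new one is better.
        worst = min(range(memory_size), key=scores.__getitem__)
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = max(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]


best_params, best_score = harmony_search(toy_score)
print(best_params, best_score)
```

Because each evaluation of a real objective means training a model, the iteration count and memory size would in practice be chosen to fit the available compute budget.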