Exploring the Effect of Activation Function on Transformer Model Performance for Official Announcement Translator from Indonesian to Sundanese Languages
B. Wijanarko, Dina Fitria Murad, Y. Heryadi, C. Tho, Kiyota Hashimoto
Published in: 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE), 2023-02-16. DOI: 10.1109/ICCoSITE57641.2023.10127770
Citations: 0
Abstract
Automated translation involving low-resource languages has attracted wide interest from many research communities over the past decade. One lesson learned from the COVID-19 pandemic, particularly in Indonesia, is that many local governments have to release regular public announcements to keep people following health protocols, especially in public areas. Many studies have found some evidence that rural people, who make up a large proportion of Indonesia's population, find it more convenient to receive official announcements in their local language. However, translating official announcements from the national language into the many local languages of Indonesia requires many experienced bilingual translators and considerable time. This paper presents exploratory results from developing an automated translation model that translates texts from Bahasa Indonesia into Sundanese. In particular, this study explores the effect of the ReLU, Sigmoid, and Tanh activation functions in the Vanilla Transformer model on its translation performance. The experimental results show that the activation functions under study give similar training accuracy (0.98). However, ReLU achieves better performance than Tanh in terms of validation accuracy, training loss, and validation loss.
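To make the comparison concrete, the following is a minimal sketch of a position-wise feed-forward sublayer with a swappable activation. It is not the authors' implementation: the abstract does not state where in the model the activation is varied, so placing it in the feed-forward sublayer is an assumption, and the names `feed_forward`, `linear`, and `ACTIVATIONS` are hypothetical, chosen only for illustration.

```python
import math


def relu(x: float) -> float:
    """ReLU: passes positive values through and clips negatives to zero."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Sigmoid: squashes x into (0, 1); branches on sign to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x: float) -> float:
    """Tanh: squashes x into (-1, 1)."""
    return math.tanh(x)


# Hypothetical registry of the three activations named in the abstract.
ACTIVATIONS = {"relu": relu, "sigmoid": sigmoid, "tanh": tanh}


def linear(x, weights, bias):
    """Compute x @ W + b for one vector x; weights has shape (in_dim, out_dim)."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def feed_forward(x, w1, b1, w2, b2, activation="relu"):
    """Position-wise feed-forward sublayer: linear, activation, linear.

    Assumption: this is the place where the activation is swapped.
    """
    act = ACTIVATIONS[activation]
    hidden = [act(h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)
```

Calling `feed_forward` on the same input and weights with `activation="relu"`, `"sigmoid"`, and `"tanh"` shows how only the nonlinearity changes between the compared variants while the rest of the sublayer stays fixed.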