Structured Text Generation for Spanish Freestyle Battles using Neural Networks
P. D. Bianco, I. Mindlin, L. Lanzarini, Franco Ronchetti, W. Hasperué, F. Quiroga
2021 XLVII Latin American Computing Conference (CLEI), pp. 1-9. Published 2021-10-25.
DOI: 10.1109/CLEI53233.2021.9639929 (https://doi.org/10.1109/CLEI53233.2021.9639929)
Abstract
As artificial intelligence has spread across a variety of areas, the use of machine learning and deep learning techniques for creative purposes has also risen significantly in recent years. Work of this kind within natural language processing (NLP) typically consists of neural models for fiction or lyrics generation. Most of this work targets English, and adapting it to other languages is not feasible. In this work, we develop a Spanish text generation system for the rap sub-genre known as freestyle. Freestyle songs present unique challenges for text generation, given that performers compete with one another in a lyric improvisation contest. Given the low availability of freestyle text, especially in Spanish, we collected two separate datasets: one with freestyle lyrics and a larger one with rap lyrics, which are more readily available. The rap dataset can be used for pretraining, and the freestyle dataset for fine-tuning on the generation task. Furthermore, we design a neural network-based generation model that takes into account both the structure of freestyle and the low data availability. The model was able to generate realistic freestyle verses in Spanish.
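The two-stage use of the datasets (pretraining on the larger rap corpus, then adapting on the smaller freestyle corpus) can be illustrated with a toy sketch. This is not the authors' model: a count-based word-bigram model stands in for the neural network, the example lines are invented placeholders rather than samples from the paper's datasets, and the `weight` parameter is a hypothetical knob that stands in for the emphasis given to the second stage.

```python
from collections import Counter, defaultdict


def train(counts, lines, weight=1.0):
    """Add weighted word-bigram counts from `lines` into `counts`."""
    for line in lines:
        tokens = ["<s>"] + line.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += weight
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Placeholder corpora, invented for illustration only.
rap_lines = [
    "la calle es mi escuela",
    "la calle es dura",
    "la calle es larga",
]
freestyle_lines = ["la calle me escucha"]

model = defaultdict(Counter)

# Stage 1: "pretraining" on the larger, more readily available corpus.
train(model, rap_lines)
before = most_likely_next(model, "calle")

# Stage 2: "fine-tuning" on the small target corpus, weighted more heavily
# (assumed weight) so that it can shift what the model has already learned.
train(model, freestyle_lines, weight=5.0)
after = most_likely_next(model, "calle")

print(before, after)
```

In this sketch the second stage reuses the statistics gathered in the first one instead of starting from scratch, which is the reason for pairing a large general corpus with a small target corpus. A neural model would update shared parameters by gradient descent instead of adding counts.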