From Translation to Generative LLMs: Classification of Code-Mixed Affective Tasks
Authors: Anjali Yadav; Tanya Garg; Matej Klemen; Matej Ulčar; Basant Agarwal; M. Robnik-Šikonja
Journal: IEEE Transactions on Affective Computing, vol. 16, no. 3, pp. 2090-2101 (impact factor 9.8; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-03-24
DOI: 10.1109/TAFFC.2025.3553399
URL: https://ieeexplore.ieee.org/document/10938193/
Citation count: 0
Abstract
Code-mixed (CM) discourse combines multiple languages in a single text. It is commonly used in informal discourse in countries with several official languages, but also in many other countries in combination with English or neighboring languages. With the recent rise of large transformer language models dominating NLP tasks, we explored their effectiveness in CM contexts. We developed four new bilingual pre-trained masked language models for Hinglish and English-Slovene, tailored to handle informal language. We then evaluated monolingual, bilingual, few-lingual, massively multilingual, and larger generative models across multiple languages using two affective tasks involving CM texts: sentiment analysis and offensive speech prediction in social media posts. We compared these models with two translation baselines, one obtained with a neural machine translation tool and the other produced by large generative models. Experiments conducted in five languages (French, Hindi, Russian, Slovene, and Tamil) reveal that fine-tuned bilingual models and multilingual models designed for social media texts outperform the others, followed by massively multilingual and monolingual models, while larger generative models lag behind. For the affective tasks studied, models generally performed better on CM data than on non-CM data. Monolingual models applied to translated datasets are rarely competitive with multilingual models trained on CM datasets.
About the journal:
The IEEE Transactions on Affective Computing is an international and interdisciplinary journal. Its primary goal is to share research findings on the development of systems capable of recognizing, interpreting, and simulating human emotions and related affective phenomena. The journal publishes original research on the underlying principles and theories that explain how and why affective factors shape human-technology interactions. It also focuses on how techniques for sensing and simulating affect can enhance our understanding of human emotions and processes. Additionally, the journal explores the design, implementation, and evaluation of systems that prioritize the consideration of affect in their usability. We also welcome surveys of existing work that provide new perspectives on the historical and future directions of this field.