MMATR: A Lightweight Approach for Multimodal Sentiment Analysis Based on Tensor Methods

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2023-06-04 DOI:10.1109/ICASSP49357.2023.10097030

Panagiotis Koromilas, M. Nicolaou, Theodoros Giannakopoulos, Yannis Panagakis


Citations: 0

Abstract

Despite the considerable research output on multimodal learning for affect-related tasks, most current methods are very complex in terms of the number of trainable parameters and are therefore not effective solutions for real-life applications. In this work we aim to address this gap in the literature by introducing the Multimodal Attention Tensor Regression (MMATR) network, a lightweight model based on: (i) a static input representation (a 2D matrix of dimensions time × features) for each modality, which is processed by a CNN and so avoids heavily parameterized sequential models; (ii) the replacement of the usual pooling and flattening operations, as well as the linear layers, with tensor contraction and tensor regression layers, which reduce the number of parameters while preserving the high-order structure of the multimodal data; and (iii) a bimodal attention layer that learns multimodal co-occurrences. Through a set of experiments comparing against a variety of state-of-the-art techniques, we show that the proposed MMATR can achieve results competitive with the state of the art in multimodal sentiment analysis, despite having four orders of magnitude fewer parameters.
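
To make point (ii) concrete, the following is a minimal NumPy sketch of a tensor contraction layer followed by a low-rank tensor regression layer. It is an illustration only, not the authors' implementation: the Tucker-form weight, the activation shape (16 channels × 20 time steps × 32 features), the ranks (4, 4, 4), and all function names are assumptions chosen for the example, since the abstract does not specify them.

```python
import math
import numpy as np


def tensor_contraction(x, factors):
    """Shrink each non-batch mode of x with its own factor matrix.

    x:       activations of shape (batch, C, T, F)
    factors: three matrices of shapes (C, r1), (T, r2), (F, r3)
    returns: tensor of shape (batch, r1, r2, r3); no flattening is involved
    """
    u_c, u_t, u_f = factors
    return np.einsum("bctf,ci,tj,fk->bijk", x, u_c, u_t, u_f)


def tucker_regression(x, core, factors, bias):
    """Scalar regression y_b = <x_b, W> + bias with W kept in Tucker form.

    W = core x_1 U_c x_2 U_t x_3 U_f is never materialized: the input is
    first contracted with the factors, then matched against the small core.
    """
    z = tensor_contraction(x, factors)
    return np.einsum("bijk,ijk->b", z, core) + bias


def dense_param_count(shape, n_out=1):
    """Parameters of flatten + linear layer (weights plus biases)."""
    return math.prod(shape) * n_out + n_out


def tucker_param_count(shape, ranks):
    """Parameters of the Tucker-form regression (core, factors, one bias)."""
    return math.prod(ranks) + sum(d * r for d, r in zip(shape, ranks)) + 1


# Hypothetical sizes, chosen only for illustration.
shape, ranks = (16, 20, 32), (4, 4, 4)
rng = np.random.default_rng(0)
x = rng.normal(size=(8, *shape))
factors = [rng.normal(size=(d, r)) for d, r in zip(shape, ranks)]
core = rng.normal(size=ranks)

y = tucker_regression(x, core, factors, bias=0.1)
print(y.shape)                           # (8,)
print(dense_param_count(shape))          # 10241
print(tucker_param_count(shape, ranks))  # 337
```

With these assumed sizes, the flatten-plus-linear head needs 10,241 parameters while the Tucker-form head needs 337. The savings reported in the abstract refer to the full model and depend on the actual architecture, which this sketch does not reproduce.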


