Transformer-based Multimodal Contextual Co-encoding for Humour Detection

Boya Deng, Jiayin Tian, Hao Li
{"title":"Transformer-based Multimodal Contextual Co-encoding for Humour Detection","authors":"Boya Deng, Jiayin Tian, Hao Li","doi":"10.1109/CoST57098.2022.00067","DOIUrl":null,"url":null,"abstract":"Humor, a unique expression of the human language system different from other emotions, plays a very important role in human communication. Previous works on humor detection have been mostly limited to a single textual modality. From the perspective of human humor perception, various aspects such as text, intonation, mannerisms, and body language can convey humor. From the perspective of the structure of jokes, any combination of textual, acoustic, and visual modalities in various positions in the context can form unexpected humor. Therefore, information that exists among multiple modalities and contexts should be considered simultaneously in humor detection. This paper proposes a humor detection model based on the transformer and contextual co-encoding called Transformer-based Multimodal Contextual Co-encoding (TMCC). The model uses the transformer-based multi-head attention to capture potential information across modalities and contexts first. Then, it uses a convolutional autoencoder to further fuse the overall feature matrix and reduce dimensionality. Finally, a simple multilayer perceptron is used to predict the humor labels. By comparing with common baselines of humor detection, it is demonstrated that our model achieves some performance improvement. The availability of each part of the model is demonstrated through a series of ablation studies.","PeriodicalId":135595,"journal":{"name":"2022 International Conference on Culture-Oriented Science and Technology (CoST)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Culture-Oriented Science and Technology (CoST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CoST57098.2022.00067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Humor, a unique expression of the human language system distinct from other emotions, plays an important role in human communication. Previous work on humor detection has mostly been limited to the single textual modality. From the perspective of human humor perception, many channels, including text, intonation, mannerisms, and body language, can convey humor. From the perspective of joke structure, any combination of textual, acoustic, and visual modalities at various positions in the context can produce unexpected humor. Therefore, humor detection should consider information across multiple modalities and contexts simultaneously. This paper proposes a humor detection model based on the transformer and contextual co-encoding, called Transformer-based Multimodal Contextual Co-encoding (TMCC). The model first uses transformer-based multi-head attention to capture latent information across modalities and contexts. It then uses a convolutional autoencoder to further fuse the overall feature matrix and reduce its dimensionality. Finally, a simple multilayer perceptron predicts the humor labels. Comparisons with common humor detection baselines show that the model achieves a measurable performance improvement, and a series of ablation studies demonstrates the contribution of each component of the model.
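To make the described pipeline concrete, the following is a minimal sketch of a TMCC-style model: multi-head attention over stacked modality/context features, a convolutional encoder that fuses and compresses the attended feature matrix, and an MLP classification head. All module choices, dimensions, and names here are illustrative assumptions (the paper does not publish this code), and the reconstruction decoder of the autoencoder is omitted for brevity.

```python
# A minimal sketch of a TMCC-style pipeline, assuming hypothetical
# dimensions and layer choices; not the authors' published implementation.
import torch
import torch.nn as nn

class TMCCSketch(nn.Module):
    def __init__(self, feat_dim=128, n_heads=4, n_tokens=6, fused_dim=32):
        super().__init__()
        # Multi-head attention over the stacked modality/context tokens
        # (e.g., text/audio/video features of the punchline and its context).
        self.cross_attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        # Convolutional encoder that fuses the attended feature matrix and
        # reduces its dimensionality (the autoencoder bottleneck; the decoder
        # used for reconstruction training is omitted in this sketch).
        self.encoder = nn.Sequential(
            nn.Conv1d(n_tokens, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(fused_dim),
            nn.Flatten(),
        )
        # Simple multilayer perceptron that predicts the humor label.
        self.classifier = nn.Sequential(
            nn.Linear(16 * fused_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        # x: (batch, n_tokens, feat_dim), one row per modality/context segment
        attended, _ = self.cross_attn(x, x, x)  # cross-modal/context interactions
        fused = self.encoder(attended)          # fuse and compress the feature matrix
        return self.classifier(fused)           # humor logit

# Usage with random features standing in for real multimodal embeddings:
model = TMCCSketch()
logits = model(torch.randn(8, 6, 128))  # batch of 8, 6 segments, 128-dim features
print(logits.shape)                     # torch.Size([8, 1])
```

Treating each modality/context segment as an attention token lets every segment attend to every other, which matches the abstract's claim that humor can arise from any combination of modalities at any position in the context.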