Transformer-based Multimodal Contextual Co-encoding for Humour Detection

Boya Deng, Jiayin Tian, Hao Li
{"title":"基于变换的幽默检测多模态语境协同编码","authors":"Boya Deng, Jiayin Tian, Hao Li","doi":"10.1109/CoST57098.2022.00067","DOIUrl":null,"url":null,"abstract":"Humor, a unique expression of the human language system different from other emotions, plays a very important role in human communication. Previous works on humor detection have been mostly limited to a single textual modality. From the perspective of human humor perception, various aspects such as text, intonation, mannerisms, and body language can convey humor. From the perspective of the structure of jokes, any combination of textual, acoustic, and visual modalities in various positions in the context can form unexpected humor. Therefore, information that exists among multiple modalities and contexts should be considered simultaneously in humor detection. This paper proposes a humor detection model based on the transformer and contextual co-encoding called Transformer-based Multimodal Contextual Co-encoding (TMCC). The model uses the transformer-based multi-head attention to capture potential information across modalities and contexts first. Then, it uses a convolutional autoencoder to further fuse the overall feature matrix and reduce dimensionality. Finally, a simple multilayer perceptron is used to predict the humor labels. By comparing with common baselines of humor detection, it is demonstrated that our model achieves some performance improvement. The availability of each part of the model is demonstrated through a series of ablation studies.","PeriodicalId":135595,"journal":{"name":"2022 International Conference on Culture-Oriented Science and Technology (CoST)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformer-based Multimodal Contextual Co-encoding for Humour Detection\",\"authors\":\"Boya Deng, Jiayin Tian, Hao Li\",\"doi\":\"10.1109/CoST57098.2022.00067\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Humor, a unique expression of the human language system different from other emotions, plays a very important role in human communication. Previous works on humor detection have been mostly limited to a single textual modality. From the perspective of human humor perception, various aspects such as text, intonation, mannerisms, and body language can convey humor. From the perspective of the structure of jokes, any combination of textual, acoustic, and visual modalities in various positions in the context can form unexpected humor. Therefore, information that exists among multiple modalities and contexts should be considered simultaneously in humor detection. This paper proposes a humor detection model based on the transformer and contextual co-encoding called Transformer-based Multimodal Contextual Co-encoding (TMCC). The model uses the transformer-based multi-head attention to capture potential information across modalities and contexts first. Then, it uses a convolutional autoencoder to further fuse the overall feature matrix and reduce dimensionality. Finally, a simple multilayer perceptron is used to predict the humor labels. By comparing with common baselines of humor detection, it is demonstrated that our model achieves some performance improvement. 
The availability of each part of the model is demonstrated through a series of ablation studies.\",\"PeriodicalId\":135595,\"journal\":{\"name\":\"2022 International Conference on Culture-Oriented Science and Technology (CoST)\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Culture-Oriented Science and Technology (CoST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CoST57098.2022.00067\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Culture-Oriented Science and Technology (CoST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CoST57098.2022.00067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Humor, a unique expression of the human language system distinct from other emotions, plays an important role in human communication. Previous work on humor detection has mostly been limited to a single textual modality. From the perspective of human humor perception, text, intonation, mannerisms, and body language can all convey humor. From the perspective of joke structure, any combination of textual, acoustic, and visual modalities at various positions in the context can produce unexpected humor. Humor detection should therefore consider information across multiple modalities and contexts simultaneously. This paper proposes a humor detection model based on the transformer and contextual co-encoding, called Transformer-based Multimodal Contextual Co-encoding (TMCC). The model first uses transformer-based multi-head attention to capture latent information across modalities and contexts. It then uses a convolutional autoencoder to further fuse the overall feature matrix and reduce its dimensionality. Finally, a simple multilayer perceptron predicts the humor labels. Comparison with common humor detection baselines shows that the model achieves a measurable performance improvement, and a series of ablation studies demonstrates the effectiveness of each component.
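The abstract gives only a high-level description of the pipeline; no implementation is provided here. The following is a minimal sketch, assuming PyTorch, of the three stages it names: multi-head attention over stacked modality/context feature rows, a convolutional autoencoder that fuses the feature matrix and reduces its dimensionality, and a small multilayer perceptron that predicts the humor label. The class name `TMCCSketch`, all layer sizes, the number of feature rows, and the exact wiring are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of the TMCC-style pipeline as described in the abstract.
# Assumptions (not from the paper): PyTorch, feature dimensions, the number
# of modality/context rows, and the autoencoder layout are invented here
# purely for illustration.
import torch
import torch.nn as nn

class TMCCSketch(nn.Module):
    def __init__(self, feat_dim=128, n_heads=4, n_rows=8, hidden=64):
        super().__init__()
        # Transformer-style multi-head attention over the stacked
        # modality/context feature rows (e.g. text, audio, and visual
        # features for the punchline and each context sentence).
        self.attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        # Convolutional autoencoder: the encoder fuses the feature matrix
        # and halves its length; the decoder exists only to support a
        # reconstruction objective during training.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_rows, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(hidden, hidden, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(hidden, n_rows, kernel_size=3, padding=1),
        )
        # Simple multilayer perceptron for binary humor prediction.
        self.mlp = nn.Sequential(
            nn.Flatten(),
            nn.Linear(hidden * (feat_dim // 2), hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        # x: (batch, n_rows, feat_dim) stacked modality/context features.
        attended, _ = self.attn(x, x, x)  # cross-modal/context attention
        fused = self.encoder(attended)    # fuse and reduce dimensionality
        recon = self.decoder(fused)       # reconstruction for the AE loss
        logit = self.mlp(fused)           # humor label prediction
        return logit, recon

model = TMCCSketch()
x = torch.randn(2, 8, 128)
logit, recon = model(x)
print(logit.shape, recon.shape)  # torch.Size([2, 1]) torch.Size([2, 8, 128])
```

Under these assumptions, the decoder output would feed a reconstruction loss alongside the classification loss, which is one plausible reading of how the autoencoder is trained; consult the paper for the actual objective and dimensions.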