Li Zhou, Zhenyu Liu, Yutong Li, Yuchi Duan, Huimin Yu, Bin Hu
{"title":"用于抑郁检测的多细粒度融合网络","authors":"Li Zhou, Zhenyu Liu, Yutong Li, Yuchi Duan, Huimin Yu, Bin Hu","doi":"10.1145/3665247","DOIUrl":null,"url":null,"abstract":"<p>Depression is an illness that involves emotional and mental health. Currently, depression detection through interviews is the most popular way. With the advancement of natural language processing and sentiment analysis, automated interview-based depression detection is strongly supported. However, current multimodal depression detection models fail to adequately capture the fine-grained features of depressive behaviors, making it difficult for the models to accurately characterize the subtle changes in depressive symptoms. To address this problem, we propose a Multi Fine-Grained Fusion Network (MFFNet). The core idea of this model is to extract and fuse the information of different scale feature pairs through a Multi-Scale Fastformer (MSfastformer), and then use the Recurrent Pyramid Model (RPM) to integrate the features of different resolutions, promoting the interaction of multi-level information. Through the interaction of multi-scale and multi-resolution features, it aims to explore richer feature representations. To validate the effectiveness of our proposed MFFNet model, we conduct experiments on two depression interview datasets. The experimental results show that the MFFNet model performs better in depression detection compared to other benchmark multimodal models.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"50 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi Fine-Grained Fusion Network for Depression Detection\",\"authors\":\"Li Zhou, Zhenyu Liu, Yutong Li, Yuchi Duan, Huimin Yu, Bin Hu\",\"doi\":\"10.1145/3665247\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Depression is an illness that involves emotional and mental health. Currently, depression detection through interviews is the most popular way. With the advancement of natural language processing and sentiment analysis, automated interview-based depression detection is strongly supported. However, current multimodal depression detection models fail to adequately capture the fine-grained features of depressive behaviors, making it difficult for the models to accurately characterize the subtle changes in depressive symptoms. To address this problem, we propose a Multi Fine-Grained Fusion Network (MFFNet). The core idea of this model is to extract and fuse the information of different scale feature pairs through a Multi-Scale Fastformer (MSfastformer), and then use the Recurrent Pyramid Model (RPM) to integrate the features of different resolutions, promoting the interaction of multi-level information. Through the interaction of multi-scale and multi-resolution features, it aims to explore richer feature representations. To validate the effectiveness of our proposed MFFNet model, we conduct experiments on two depression interview datasets. 
The experimental results show that the MFFNet model performs better in depression detection compared to other benchmark multimodal models.</p>\",\"PeriodicalId\":50937,\"journal\":{\"name\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"volume\":\"50 1\",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3665247\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3665247","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Multi Fine-Grained Fusion Network for Depression Detection
Depression is an illness that affects both emotional and mental health. Currently, clinical interviews are the most common way to detect it. Advances in natural language processing and sentiment analysis now provide strong support for automated, interview-based depression detection. However, current multimodal depression detection models fail to adequately capture the fine-grained features of depressive behaviors, making it difficult for them to accurately characterize subtle changes in depressive symptoms. To address this problem, we propose a Multi Fine-Grained Fusion Network (MFFNet). The core idea of this model is to extract and fuse information from feature pairs at different scales through a Multi-Scale Fastformer (MSfastformer), and then to integrate features of different resolutions with a Recurrent Pyramid Model (RPM), promoting the interaction of multi-level information. Through this interaction of multi-scale and multi-resolution features, the model aims to learn richer feature representations. To validate the effectiveness of the proposed MFFNet, we conduct experiments on two depression interview datasets. The experimental results show that MFFNet outperforms other benchmark multimodal models in depression detection.
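The abstract only describes the architecture at a high level, so the following is a minimal, hedged PyTorch sketch of the general pattern it outlines: self-attention applied over features pooled at several temporal scales, followed by a pyramid-style fusion that upsamples coarser maps back to finer resolutions. Every name and hyperparameter here (MultiScaleBlock, PyramidFusion, the scales tuple, d_model, and the use of standard multi-head attention in place of the paper's MSfastformer) is an illustrative assumption, not the authors' implementation.

```python
# Sketch only: multi-scale attention + pyramid-style fusion, under assumed
# shapes and module choices. Not the MFFNet reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleBlock(nn.Module):
    """Applies self-attention to the input sequence pooled at several temporal scales."""

    def __init__(self, d_model: int, n_heads: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.attn = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True) for _ in scales]
        )

    def forward(self, x):  # x: (batch, time, d_model)
        outputs = []
        for scale, attn in zip(self.scales, self.attn):
            # Downsample along time to obtain a coarser "resolution".
            xs = F.avg_pool1d(x.transpose(1, 2), kernel_size=scale).transpose(1, 2)
            out, _ = attn(xs, xs, xs)
            outputs.append(out)  # one feature map per scale, lengths differ
        return outputs


class PyramidFusion(nn.Module):
    """Merges coarse-to-fine feature maps by upsampling and summing them."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, feats):  # feats: list ordered fine -> coarse
        fused = feats[-1]
        for f in reversed(feats[:-1]):
            # Upsample the coarser map to the finer temporal length, then merge.
            fused = F.interpolate(
                fused.transpose(1, 2), size=f.size(1),
                mode="linear", align_corners=False,
            ).transpose(1, 2)
            fused = self.proj(fused + f)
        return fused.mean(dim=1)  # pooled utterance-level representation


if __name__ == "__main__":
    block, fusion = MultiScaleBlock(64, 4), PyramidFusion(64)
    classifier = nn.Linear(64, 2)   # depressed vs. not depressed (assumed binary task)
    x = torch.randn(2, 128, 64)     # (batch, frames, feature dim)
    logits = classifier(fusion(block(x)))
    print(logits.shape)             # torch.Size([2, 2])
```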
Journal introduction:
The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It solicits paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome.
TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure timely publication. The transactions consist primarily of research papers. This is an archival journal, and the papers are intended to have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.