Automatic Depression Detection Based on Merged Convolutional Neural Networks using Facial Features
Renuka Acharya, Soumya P. Dash
2022 IEEE International Conference on Signal Processing and Communications (SPCOM), 2022-07-11
DOI: https://doi.org/10.1109/SPCOM55316.2022.9840812
Clinical depression is a serious medical condition affecting a substantial portion of the world's population. This paper proposes a novel deep-learning-based approach for detecting clinical depression in potential patients. A novel algorithm is proposed to label the recent AFEW-VA dataset, assigning the video frames of each individual to a depressed class or a non-depressed class based on their valence and arousal values. The full facial region, the eye region, and the mouth region are then extracted from the labeled dataset as regions of interest (ROIs) and used to train three pre-trained 2D-CNN models, namely ResNet50, VGG16, and InceptionV3, via transfer learning. For each 2D-CNN architecture, a novel algorithm is proposed to merge the models trained on the three ROIs. The merged model combining all three ROIs achieves a higher depression-detection accuracy than the individual models and than merged models combining only two of the three ROIs. Among the three architectures, the merged model based on ResNet50 achieves the best accuracy, 0.95, compared with VGG16 and InceptionV3.
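The abstract does not spell out either the labeling rule or the merging algorithm, so the following is only a minimal sketch of what the two steps could look like, not the authors' method. It assumes that a frame is labeled depressed when both valence and arousal fall below a threshold, and that the three ROI models are merged by averaging their class probabilities (a simple late-fusion stand-in). The function names, the thresholds of 0.0, and the example probabilities are all hypothetical.

```python
from statistics import fmean
from typing import List, Mapping, Sequence


def label_frame(valence: float, arousal: float,
                valence_threshold: float = 0.0,
                arousal_threshold: float = 0.0) -> int:
    """Return 1 (depressed) or 0 (non-depressed) for one video frame.

    Assumption: low valence together with low arousal marks the depressed
    class. The paper's actual rule and thresholds are not given in the
    abstract; the defaults here are placeholders.
    """
    return int(valence < valence_threshold and arousal < arousal_threshold)


def merge_roi_probabilities(roi_probs: Mapping[str, Sequence[float]]) -> List[float]:
    """Average the per-class probabilities predicted by each ROI model.

    Assumption: plain averaging stands in for the paper's merging
    algorithm, which the abstract does not describe.
    """
    vectors = list(roi_probs.values())
    if not vectors:
        raise ValueError("at least one ROI prediction is required")
    n_classes = len(vectors[0])
    if any(len(v) != n_classes for v in vectors):
        raise ValueError("all ROI predictions must cover the same classes")
    return [fmean(v[c] for v in vectors) for c in range(n_classes)]


def predict_class(roi_probs: Mapping[str, Sequence[float]]) -> int:
    """Return the index of the class with the highest merged probability."""
    merged = merge_roi_probabilities(roi_probs)
    return max(range(len(merged)), key=merged.__getitem__)


# Hypothetical softmax outputs [non-depressed, depressed] from three
# ROI-specific models for a single frame.
example = {
    "face": [0.40, 0.60],
    "eyes": [0.70, 0.30],
    "mouth": [0.20, 0.80],
}
merged_example = merge_roi_probabilities(example)
predicted_example = predict_class(example)
```

In this example the eye-region model alone would vote non-depressed, while the averaged prediction follows the face and mouth models. That illustrates, under the stated assumptions, how combining ROIs can change the decision of a single-ROI model.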