Detecting Potential Depressed Users in Twitter Using a Fine-tuned DistilBERT Model

Artificial Intelligence and Social Computing. Pub Date: 1900-01-01. DOI: 10.54941/ahfe1001458

Miguel Antonio Adarlo, M. D. De Leon


Citations: 0



Detecting Potential Depressed Users in Twitter Using a Fine-tuned DistilBERT Model

Major Depressive Disorder, commonly known as depression, is widespread around the world, and various efforts have been made to combat it and to reach out to those who suffer from it. Some of these efforts use technology, such as machine learning models, to screen people for possible depression through various means, including social media narratives such as tweets. This study therefore evaluates how well a pre-trained DistilBERT, a transformer model for natural language processing, can detect Twitter users who may have depression after being fine-tuned on a set of tweets from depressed and non-depressed users. Two models were built using the same procedure of preprocessing, splitting, tokenizing, training, fine-tuning, and optimizing. The Base Model (trained on the CLPsych 2015 dataset) and the Mixed Model (trained on the CLPsych 2015 dataset plus half of a dataset of scraped tweets) both performed better than chance at detecting potentially depressed Twitter users, with Area Under the Receiver Operating Characteristic Curve (AUC) scores of 65% and 63%, respectively, on the test dataset. The two models performed comparably: a z-test at the 0.05 level of significance (95% confidence level) found no significant difference between their AUC scores (p = 0.21). These results suggest that a fine-tuned DistilBERT may be used to detect Twitter users who potentially have depression.
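
The abstract does not say how the AUC scores or the z-test were computed. As a minimal sketch, assuming a rank-based AUC estimate and the Hanley and McNeil (1982) standard error, with the two AUCs treated as independent, such a comparison could look like the following. The function names are hypothetical and are not taken from the paper.

```python
import math


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive example is scored
    higher than a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC estimate (Hanley & McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def auc_z_test(auc_a, auc_b, n_pos, n_neg):
    """Two-sided z-test for the difference between two AUCs.

    Assumption: the two AUC estimates are treated as independent,
    which is a simplification when both come from the same test set.
    """
    se_a = hanley_mcneil_se(auc_a, n_pos, n_neg)
    se_b = hanley_mcneil_se(auc_b, n_pos, n_neg)
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

For example, `auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` returns 0.75, because three of the four positive-negative pairs are ranked correctly. A call such as `auc_z_test(0.65, 0.63, 150, 150)` uses made-up class sizes and should not be expected to reproduce the reported p = 0.21, since the real sample sizes and the exact test variant are not given. Because both models were evaluated on the same test dataset, a test for correlated AUCs (DeLong's, for instance) may be more appropriate than the independence assumption used here.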

