Detecting Potential Depressed Users in Twitter Using a Fine-tuned DistilBERT Model

Miguel Antonio Adarlo, M. D. De Leon
Artificial Intelligence and Social Computing
DOI: 10.54941/ahfe1001458 (https://doi.org/10.54941/ahfe1001458)

Abstract

Major Depressive Disorder, commonly known as depression, is spreading worldwide, and various efforts have been made to combat it and to reach out to those who suffer from it. Some of these efforts use technology, such as machine learning models, to screen people for possible depression through sources such as social media narratives, including tweets from Twitter. This study therefore evaluates how well a pre-trained DistilBERT, a transformer model for natural language processing, can detect potentially depressed Twitter users after being fine-tuned on a set of tweets from depressed and non-depressed users. Two models were built using the same procedure of preprocessing, splitting, tokenizing, training, fine-tuning, and optimizing. The Base Model (trained on the CLPsych 2015 dataset) and the Mixed Model (trained on the CLPsych 2015 dataset and half of a dataset of scraped tweets) achieved Area Under the Receiver Operating Characteristic Curve (AUC) scores of 65% and 63%, respectively, on the test dataset, which is above the 50% expected from chance. The two models performed comparably in identifying potentially depressed users: a z-test at the 0.05 level of significance (95% confidence level) found no significant difference between their AUC scores (p = 0.21). These results suggest that DistilBERT, when fine-tuned, may be used to detect potentially depressed users on Twitter.
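The abstract compares the two models by AUC and a z-test but does not say which variant of the test was used. As a rough illustration only, the sketch below computes an AUC from labels and scores and then compares two AUCs with a z-test built on the Hanley-McNeil standard error. Everything here is an assumption rather than the authors' method: the function names, the toy data, the class counts, and in particular the treatment of the two AUCs as independent. Two models scored on the same test set are correlated, so a paired method such as DeLong's test would usually be preferred in practice.

```python
import math


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive example is scored
    higher than a random negative example (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def z_test_independent_aucs(auc_a, auc_b, n_pos, n_neg):
    """Two-sided z-test for the difference of two AUCs.

    Assumption: the two AUC estimates are treated as independent,
    which ignores the correlation from sharing one test set.
    """
    se_a = hanley_mcneil_se(auc_a, n_pos, n_neg)
    se_b = hanley_mcneil_se(auc_b, n_pos, n_neg)
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


if __name__ == "__main__":
    # Toy labels and scores, purely illustrative (not data from the study).
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print("toy AUC:", auc_from_scores(toy_labels, toy_scores))

    # Hypothetical class counts; the abstract does not report the test-set size.
    z, p = z_test_independent_aucs(0.65, 0.63, n_pos=100, n_neg=100)
    print("z =", z, "p =", p)
```

Because the test-set size and the exact test variant are not given in the abstract, this sketch should not be expected to reproduce the reported p = 0.21. It only shows the shape of the calculation: the difference between the AUCs divided by its standard error, compared against a normal distribution.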