Detecting and tracking depression through temporal topic modeling of tweets: insights from a 180-day study
Ranganathan Chandrasekaran, Suhas Kotaki, Abhilash Hosaagrahaara Nagaraja
Npj mental health research, pp. 1-10. Published 2024-12-06. DOI: 10.1038/s44184-024-00107-5
https://www.nature.com/articles/s44184-024-00107-5
Citations: 0
Abstract
Depression affects over 280 million people globally, yet many cases remain undiagnosed or untreated due to stigma and lack of awareness. Social media platforms like X (formerly Twitter) offer a way to monitor and analyze depression markers. This study analyzes tweets posted in the 90 days before and the 90 days after a self-disclosed clinical diagnosis. We gathered 246,637 tweets from 229 diagnosed users. CorEx topic modeling identified seven themes: causes, physical symptoms, mental symptoms, swear words, treatment, coping/support mechanisms, and lifestyle. Conditional logistic regression then assessed the odds of each theme occurring post-diagnosis. A control group of healthy users (284,772 tweets) was used to develop and evaluate three machine learning classifiers (support vector machines, naive Bayes, and logistic regression) that distinguish depressed from non-depressed users. Logistic regression and SVM performed best. These findings show the potential of Twitter data for tracking depression and changes in symptoms, coping mechanisms, and treatment use.
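To make the pre/post comparison concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each user contributes one yes/no indicator for a theme before diagnosis and one after, so each user acts as their own matched control. Under that assumption, conditional logistic regression with a single binary covariate reduces to the ratio of discordant pairs. The function name `matched_pair_odds_ratio` and the toy data are hypothetical and chosen only for illustration.

```python
import math


def matched_pair_odds_ratio(pairs, z=1.96):
    """Estimate the odds ratio of a theme appearing post- vs pre-diagnosis.

    `pairs` holds one (pre, post) tuple of booleans per user. Assumption:
    one indicator per user per period, which is a simplification of the
    study design described in the abstract.
    """
    pairs = list(pairs)
    # Only discordant pairs carry information in a matched-pair design.
    post_only = sum(1 for pre, post in pairs if post and not pre)
    pre_only = sum(1 for pre, post in pairs if pre and not post)
    if post_only == 0 or pre_only == 0:
        raise ValueError("need at least one discordant pair in each direction")

    odds_ratio = post_only / pre_only
    # Standard error of log(OR) for matched pairs (Wald interval).
    se_log = math.sqrt(1 / post_only + 1 / pre_only)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, (low, high)


# Hypothetical toy data: did a user mention the "treatment" theme (pre, post)?
toy_pairs = (
    [(False, True)] * 6    # theme appears only after diagnosis
    + [(True, False)] * 2  # theme appears only before diagnosis
    + [(True, True)] * 3   # concordant pairs, ignored by the estimate
    + [(False, False)] * 4
)

odds_ratio, (ci_low, ci_high) = matched_pair_odds_ratio(toy_pairs)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An odds ratio above 1 would suggest the theme is more likely after diagnosis than before. With more than one covariate, or several observations per user, a full conditional logistic regression fit would be needed instead of this closed form.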