{"title":"Automatic detection of problem-gambling signs from online texts using large language models.","authors":"Elke Smith, Jan Peters, Nils Reiter","doi":"10.1371/journal.pdig.0000605","DOIUrl":null,"url":null,"abstract":"<p><p>Problem gambling is a major public health concern and is associated with profound psychological distress and economic problems. There are numerous gambling communities on the internet where users exchange information about games, gambling tactics, as well as gambling-related problems. Individuals exhibiting higher levels of problem gambling engage more in such communities. Online gambling communities may provide insights into problem-gambling behaviour. Using data scraped from a major German gambling discussion board, we fine-tuned a large language model, specifically a Bidirectional Encoder Representations from Transformers (BERT) model, to predict signs of problem-gambling from forum posts. Training data were generated by manual annotation and by taking into account diagnostic criteria and gambling-related cognitive distortions. Using cross-validation, our models achieved a precision of 0.95 and F1 score of 0.71, demonstrating that satisfactory classification performance can be achieved by generating high-quality training material through manual annotation based on diagnostic criteria. The current study confirms that a BERT-based model can be reliably used on small data sets and to detect signatures of problem gambling in online communication data. Such computational approaches may have potential for the detection of changes in problem-gambling prevalence among online users.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 9","pages":"e0000605"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11423982/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLOS digital health","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1371/journal.pdig.0000605","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Problem gambling is a major public health concern and is associated with profound psychological distress and economic problems. There are numerous gambling communities on the internet where users exchange information about games, gambling tactics, as well as gambling-related problems. Individuals exhibiting higher levels of problem gambling engage more in such communities. Online gambling communities may provide insights into problem-gambling behaviour. Using data scraped from a major German gambling discussion board, we fine-tuned a large language model, specifically a Bidirectional Encoder Representations from Transformers (BERT) model, to predict signs of problem-gambling from forum posts. Training data were generated by manual annotation and by taking into account diagnostic criteria and gambling-related cognitive distortions. Using cross-validation, our models achieved a precision of 0.95 and F1 score of 0.71, demonstrating that satisfactory classification performance can be achieved by generating high-quality training material through manual annotation based on diagnostic criteria. The current study confirms that a BERT-based model can be reliably used on small data sets and to detect signatures of problem gambling in online communication data. Such computational approaches may have potential for the detection of changes in problem-gambling prevalence among online users.