Using large language models for extracting and pre-annotating texts on mental health from noisy data in a low-resource language.

IF 3.5 · CAS Zone 4 (Computer Science) · Q2 Computer Science, Artificial Intelligence · PeerJ Computer Science · Pub Date: 2024-11-28 · eCollection Date: 2024-01-01 · DOI: 10.7717/peerj-cs.2395
Sergei Koltcov, Anton Surkov, Olessia Koltsova, Vera Ignatenko
{"title":"Using large language models for extracting and pre-annotating texts on mental health from noisy data in a low-resource language.","authors":"Sergei Koltcov, Anton Surkov, Olessia Koltsova, Vera Ignatenko","doi":"10.7717/peerj-cs.2395","DOIUrl":null,"url":null,"abstract":"<p><p>Recent advancements in large language models (LLMs) have opened new possibilities for developing conversational agents (CAs) in various subfields of mental healthcare. However, this progress is hindered by limited access to high-quality training data, often due to privacy concerns and high annotation costs for low-resource languages. A potential solution is to create human-AI annotation systems that utilize extensive public domain user-to-user and user-to-professional discussions on social media. These discussions, however, are extremely noisy, necessitating the adaptation of LLMs for fully automatic cleaning and pre-classification to reduce human annotation effort. To date, research on LLM-based annotation in the mental health domain is extremely scarce. In this article, we explore the potential of zero-shot classification using four LLMs to select and pre-classify texts into topics representing psychiatric disorders, in order to facilitate the future development of CAs for disorder-specific counseling. We use 64,404 Russian-language texts from online discussion threads labeled with seven most commonly discussed disorders: depression, neurosis, paranoia, anxiety disorder, bipolar disorder, obsessive-compulsive disorder, and borderline personality disorder. Our research shows that while preliminary data filtering using zero-shot technology slightly improves classification, LLM fine-tuning makes a far larger contribution to its quality. Both standard and natural language inference (NLI) modes of fine-tuning increase classification accuracy by more than three times compared to non-fine-tuned training with preliminarily filtered data. Although NLI fine-tuning achieves slightly higher accuracy (0.64) than the standard approach, it is six times slower, indicating a need for further experimentation with NLI hypothesis engineering. Additionally, we demonstrate that lemmatization does not affect classification quality and that multilingual models using texts in their original language perform slightly better than English-only models using automatically translated texts. Finally, we introduce our dataset and model as the first openly available Russian-language resource for developing conversational agents in the domain of mental health counseling.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2395"},"PeriodicalIF":3.5000,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623104/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2395","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Recent advancements in large language models (LLMs) have opened new possibilities for developing conversational agents (CAs) in various subfields of mental healthcare. However, this progress is hindered by limited access to high-quality training data, often due to privacy concerns and high annotation costs for low-resource languages. A potential solution is to create human-AI annotation systems that utilize extensive public domain user-to-user and user-to-professional discussions on social media. These discussions, however, are extremely noisy, necessitating the adaptation of LLMs for fully automatic cleaning and pre-classification to reduce human annotation effort. To date, research on LLM-based annotation in the mental health domain is extremely scarce. In this article, we explore the potential of zero-shot classification using four LLMs to select and pre-classify texts into topics representing psychiatric disorders, in order to facilitate the future development of CAs for disorder-specific counseling. We use 64,404 Russian-language texts from online discussion threads labeled with the seven most commonly discussed disorders: depression, neurosis, paranoia, anxiety disorder, bipolar disorder, obsessive-compulsive disorder, and borderline personality disorder. Our research shows that while preliminary data filtering using zero-shot classification slightly improves results, LLM fine-tuning makes a far larger contribution to classification quality. Both standard and natural language inference (NLI) modes of fine-tuning increase classification accuracy more than threefold compared to non-fine-tuned training with preliminarily filtered data. Although NLI fine-tuning achieves slightly higher accuracy (0.64) than the standard approach, it is six times slower, indicating a need for further experimentation with NLI hypothesis engineering. Additionally, we demonstrate that lemmatization does not affect classification quality and that multilingual models using texts in their original language perform slightly better than English-only models using automatically translated texts. Finally, we introduce our dataset and model as the first openly available Russian-language resource for developing conversational agents in the domain of mental health counseling.
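As a concrete illustration of the zero-shot pre-classification step the abstract describes, the sketch below uses the Hugging Face transformers zero-shot-classification pipeline, which is built on an NLI formulation: each candidate label is inserted into a hypothesis template and the model scores entailment against the input text. The specific checkpoint, hypothesis template, and example post are assumptions chosen for illustration; the paper's actual models and prompts may differ.

```python
# Minimal sketch of zero-shot pre-classification of forum posts into disorder
# topics via an NLI-based pipeline.
# Assumptions: the checkpoint, hypothesis template, and sample text are
# illustrative only and are not taken from the paper.
from transformers import pipeline

# A multilingual NLI checkpoint that can handle Russian input; any comparable
# model could be substituted.
classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",
)

# The seven disorder topics from the abstract, used as candidate labels
# (the label strings could equally be given in Russian).
labels = [
    "depression",
    "neurosis",
    "paranoia",
    "anxiety disorder",
    "bipolar disorder",
    "obsessive-compulsive disorder",
    "borderline personality disorder",
]

# A hypothetical forum post (placeholder text, not from the dataset).
text = "Последнее время я постоянно тревожусь и почти не сплю по ночам."

result = classifier(
    text,
    candidate_labels=labels,
    hypothesis_template="This text is about {}.",
    multi_label=False,
)
print(result["labels"][0], round(result["scores"][0], 3))
```

In a filtering setting, posts whose top label score falls below a chosen threshold could be discarded as noise before human annotation, which is the kind of reduction in annotation effort the abstract argues for.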

Source journal
PeerJ Computer Science (Computer Science: General Computer Science)
CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks
Journal description: PeerJ Computer Science is an open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.