Inductive thematic analysis of healthcare qualitative interviews using open-source large language models: How does it compare to traditional methods?

IF 4.9 2区 医学 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Computer methods and programs in biomedicine Pub Date : 2024-07-24 DOI:10.1016/j.cmpb.2024.108356
{"title":"Inductive thematic analysis of healthcare qualitative interviews using open-source large language models: How does it compare to traditional methods?","authors":"","doi":"10.1016/j.cmpb.2024.108356","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><p>Large language models (LLMs) are generative artificial intelligence that have ignited much interest and discussion about their utility in clinical and research settings. Despite this interest there is sparse analysis of their use in qualitative thematic analysis comparing their current ability to that of human coding and analysis. In addition, there has been no published analysis of their use in real-world, protected health information.</p></div><div><h3>Objective</h3><p>Here we fill that gap in the literature by comparing an LLM to standard human thematic analysis in real-world, semi-structured interviews of both patients and clinicians within a psychiatric setting.</p></div><div><h3>Methods</h3><p>Using a 70 billion parameter open-source LLM running on local hardware and advanced prompt engineering techniques, we produced themes that summarized a full corpus of interviews in minutes. Subsequently we used three different evaluation methods for quantifying similarity between themes produced by the LLM and those produced by humans.</p></div><div><h3>Results</h3><p>These revealed similarities ranging from moderate to substantial (Jaccard similarity coefficients 0.44–0.69), which are promising preliminary results.</p></div><div><h3>Conclusion</h3><p>Our study demonstrates that open-source LLMs can effectively generate robust themes from qualitative data, achieving substantial similarity to human-generated themes. The validation of LLMs in thematic analysis, coupled with evaluation methodologies, highlights their potential to enhance and democratize qualitative research across diverse fields.</p></div>","PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":null,"pages":null},"PeriodicalIF":4.9000,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer methods and programs in biomedicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169260724003493","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Background

Large language models (LLMs) are generative artificial intelligence that have ignited much interest and discussion about their utility in clinical and research settings. Despite this interest there is sparse analysis of their use in qualitative thematic analysis comparing their current ability to that of human coding and analysis. In addition, there has been no published analysis of their use in real-world, protected health information.

Objective

Here we fill that gap in the literature by comparing an LLM to standard human thematic analysis in real-world, semi-structured interviews of both patients and clinicians within a psychiatric setting.

Methods

Using a 70 billion parameter open-source LLM running on local hardware and advanced prompt engineering techniques, we produced themes that summarized a full corpus of interviews in minutes. Subsequently we used three different evaluation methods for quantifying similarity between themes produced by the LLM and those produced by humans.

Results

These revealed similarities ranging from moderate to substantial (Jaccard similarity coefficients 0.44–0.69), which are promising preliminary results.

Conclusion

Our study demonstrates that open-source LLMs can effectively generate robust themes from qualitative data, achieving substantial similarity to human-generated themes. The validation of LLMs in thematic analysis, coupled with evaluation methodologies, highlights their potential to enhance and democratize qualitative research across diverse fields.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
使用开源大型语言模型对医疗保健定性访谈进行归纳主题分析:与传统方法相比有何优势?
背景:大型语言模型(LLMs)是一种生成式人工智能,在临床和研究领域的应用引起了广泛的兴趣和讨论。尽管人们对其兴趣浓厚,但对其在定性主题分析中的应用却鲜有分析,也没有将其目前的能力与人工编码和分析能力进行比较。目的:在此,我们通过比较 LLM 与标准人工主题分析在真实世界中对精神病患者和临床医生进行的半结构化访谈,填补了文献中的这一空白:利用在本地硬件上运行的 700 亿参数开源 LLM 和先进的提示工程技术,我们在几分钟内就生成了能概括整个访谈语料库的主题。随后,我们使用三种不同的评估方法来量化 LLM 生成的主题与人类生成的主题之间的相似性:结果:这些方法揭示了从中度到高度的相似性(Jaccard 相似系数为 0.44-0.69),这是令人鼓舞的初步结果:我们的研究表明,开源 LLMs 可以有效地从定性数据中生成稳健的主题,与人类生成的主题具有很大的相似性。在主题分析中对 LLMs 的验证,再加上评估方法,凸显了 LLMs 在不同领域加强定性研究并使之民主化的潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Computer methods and programs in biomedicine
Computer methods and programs in biomedicine 工程技术-工程:生物医学
CiteScore
12.30
自引率
6.60%
发文量
601
审稿时长
135 days
期刊介绍: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.
期刊最新文献
Conv-RGNN: An efficient Convolutional Residual Graph Neural Network for ECG classification SleepGCN: A transition rule learning model based on Graph Convolutional Network for sleep staging Barriers encountered with clinical data warehouses: Recommendations from a focus group Prediction of mortality events of patients with acute heart failure in intensive care unit based on deep neural network Applications of genome-scale metabolic models to the study of human diseases: A systematic review
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1