Moving Beyond ChatGPT: Local Large Language Models (LLMs) and the Secure Analysis of Confidential Unstructured Text Data in Social Work Research

Research on Social Work Practice (IF 1.7; Zone 4, Sociology; JCR Q1, Social Work). Pub Date: 2024-09-30. DOI: 10.1177/10497315241280686

Brian E. Perron, Hui Luan, Bryan G. Victor, Oliver Hiltz-Perron, Joseph Ryan

Citations: 0

Abstract

Purpose: Large language models (LLMs) have demonstrated remarkable abilities in natural language tasks. However, their use in social work research is limited by confidentiality and security concerns when processing sensitive data. This study addresses these challenges by evaluating the performance of local LLMs (LocalLLMs) in classifying and extracting substance-related problems from unstructured child welfare investigation summaries. LocalLLMs allow researchers to analyze data on their own computers without transmitting information to external servers for processing.

Methods: Four state-of-the-art LocalLLMs (Mistral-7b, Mixtral-8x7b, Llama3-8b, and Llama3-70b) were tested using zero-shot prompting on 2,956 manually coded summaries.

Results: The LocalLLMs achieved exceptional results comparable to human experts in classification and extraction, demonstrating their potential to unlock valuable insights from confidential, unstructured child welfare data.

Conclusions: This study highlights the feasibility of using LocalLLMs to efficiently analyze large amounts of textual data while addressing the confidentiality issues associated with proprietary LLMs.
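
The zero-shot workflow summarized in the abstract (prompt a locally hosted model once per summary, then compare its answers with the manual codes) might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the prompt wording, the YES/NO answer format, the helper names (build_prompt, parse_label, classify, agreement), the agreement metrics, and the four invented toy summaries. None of it is taken from the study. A keyword stub stands in for the model so the sketch runs offline; in practice the generate callable would wrap whatever local inference tool is in use, so no text leaves the researcher's machine.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical prompt wording; the study's actual prompt is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are reviewing a child welfare investigation summary.\n"
    "Does the summary describe a substance-related problem?\n"
    "Answer with exactly one word: YES or NO.\n\n"
    "Summary:\n{summary}\n\nAnswer:"
)


def build_prompt(summary: str) -> str:
    """Zero-shot prompt: task instructions only, no labeled examples."""
    return PROMPT_TEMPLATE.format(summary=summary)


def parse_label(response: str) -> bool:
    """Map free-text model output to a binary code (True = substance-related)."""
    return response.strip().upper().startswith("YES")


def classify(summaries: Iterable[str], generate: Callable[[str], str]) -> List[bool]:
    """Run each summary through a text-generation callable and parse the answer."""
    return [parse_label(generate(build_prompt(s))) for s in summaries]


def agreement(predicted: List[bool], manual: List[bool]) -> Dict[str, float]:
    """Compare model codes with manual codes using standard binary metrics."""
    pairs = list(zip(predicted, manual))
    tp = sum(1 for p, m in pairs if p and m)
    fp = sum(1 for p, m in pairs if p and not m)
    fn = sum(1 for p, m in pairs if not p and m)
    tn = sum(1 for p, m in pairs if not p and not m)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def keyword_stub(prompt: str) -> str:
    """Stand-in for a locally hosted model so the sketch runs without one."""
    text = prompt.split("Summary:\n", 1)[1].lower()
    keywords = ("alcohol", "heroin", "intoxicated")
    return "YES" if any(word in text for word in keywords) else "NO"


# Invented toy summaries and manual codes, not data from the study.
summaries = [
    "Mother was found intoxicated while caring for the children.",
    "The home was clean and the children were attending school regularly.",
    "Father admitted to heroin use in the past month.",
    "Caregiver missed a medical appointment for the child.",
]
manual_codes = [True, False, True, False]

predicted = classify(summaries, keyword_stub)
scores = agreement(predicted, manual_codes)
print(scores)
```

Passing the model in as a plain callable keeps the prompt, parsing, and scoring code independent of any particular inference library, so the same loop could be rerun against each of the four models named in the Methods.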

Source journal

Research on Social Work Practice (Social Work)

CiteScore: 3.50
Self-citation rate: 11.10%
Articles published: 105

About the journal: Research on Social Work Practice, sponsored by the Society for Social Work and Research, is a disciplinary journal devoted to the publication of empirical research concerning the methods and outcomes of social work practice. Social work practice is broadly interpreted to refer to the application of intentionally designed social work intervention programs to problems of societal and/or interpersonal importance, including behavior analysis or psychotherapy involving individuals; case management; practice involving couples, families, and small groups; community practice education; and the development, implementation, and evaluation of social policies.