Measuring interpersonal firearm violence: natural language processing methods to address limitations in criminal charge data

IF 4.6 | Zone 2 (Medicine) | Q1 Computer Science, Information Systems | Journal of the American Medical Informatics Association | Pub Date: 2024-04-12 | DOI: 10.1093/jamia/ocae082

Julie M Kafka, Julia P Schleimer, Ott Toomet, Kaidi Chen, Alice Ellyson, Ali Rowhani-Rahbar


Citation count: 0

Abstract

Objective: Firearm violence constitutes a public health crisis in the United States, but comprehensive data infrastructure for studying this problem is lacking. To address this challenge, we used natural language processing (NLP) to classify court record documents from alleged violent crimes as firearm-related or non-firearm-related.

Materials and Methods: We accessed and digitized court records from the state of Washington (n = 1472). Human review established a gold standard label for firearm involvement (yes/no). We developed a key term search and trained supervised machine learning classifiers for this labeling task. Results were evaluated in a held-out test set.

Results: The decision tree performed best (F1 score: 0.82). The key term list had perfect recall (1.0) and a modest F1 score (0.65).

Discussion and Conclusion: This case report highlights the accuracy, feasibility, and potential time savings of using NLP to identify firearm involvement in alleged violent crimes based on digitized narratives from court documents.
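To make the reported metrics concrete, the following is a minimal sketch of a key term search scored against gold standard labels. It is not the authors' implementation: the term list `KEY_TERMS`, the toy narratives, and all function names are hypothetical, since the abstract does not give the actual key terms or data. The last helper rearranges the standard F1 formula to show what precision is implied by a recall of 1.0 and an F1 of 0.65.

```python
import re

# Hypothetical key terms; the abstract does not list the study's actual terms.
KEY_TERMS = ["firearm", "gun", "pistol", "rifle", "shotgun", "revolver"]
PATTERN = re.compile(r"\b(" + "|".join(KEY_TERMS) + r")s?\b", re.IGNORECASE)


def flag_firearm(text: str) -> bool:
    """Return True if any key term appears as a whole word in the text."""
    return PATTERN.search(text) is not None


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Toy narratives with hypothetical gold standard labels (illustrative only).
DOCS = [
    ("The defendant pointed a pistol at the victim.", True),
    ("The defendant struck the victim with a bat.", False),
    ("Officers found no gun at the scene.", False),  # negated mention
    ("The suspect fired a rifle into the vehicle.", True),
]

y_true = [label for _, label in DOCS]
y_pred = [flag_firearm(text) for text, _ in DOCS]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

# Precision implied by the figures reported for the key term list.
reported_precision = implied_precision(f1=0.65, recall=1.0)
```

In this toy set the negated mention ("no gun") is flagged, which illustrates how a term search can keep recall at 1.0 while losing precision. Under the standard F1 definition, the reported recall of 1.0 and F1 of 0.65 correspond to a precision of roughly 0.48.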




Source journal

Journal of the American Medical Informatics Association (Medicine; Computer Science, Interdisciplinary Applications)

CiteScore: 14.50

Self-citation rate: 7.80%

Publication volume: 230

Review time: 3-8 weeks

Journal description: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.