Julie M Kafka, Julia P Schleimer, Ott Toomet, Kaidi Chen, Alice Ellyson, Ali Rowhani-Rahbar
Measuring interpersonal firearm violence: natural language processing methods to address limitations in criminal charge data
Journal of the American Medical Informatics Association, published 2024-04-12. DOI: 10.1093/jamia/ocae082 (https://doi.org/10.1093/jamia/ocae082)
Abstract
Objective: Firearm violence constitutes a public health crisis in the United States, but comprehensive data infrastructure to study this problem is lacking. To address this challenge, we used natural language processing (NLP) to classify court record documents from alleged violent crimes as firearm-related or non-firearm-related.
Materials and Methods: We accessed and digitized court records from the state of Washington (n = 1472). Human review established a gold standard label for firearm involvement (yes/no). We developed a key term search and trained supervised machine learning classifiers for this labeling task. Results were evaluated in a held-out test set.
Results: The decision tree performed best (F1 score: 0.82). The key term list had perfect recall (1.0) and a modest F1 score (0.65).
Discussion and Conclusion: This case report highlights the accuracy, feasibility, and potential time saved by using NLP to identify firearm involvement in alleged violent crimes based on digitized narratives from court documents.
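To make the reported key term result concrete, the following is a minimal Python sketch of a key term search scored with precision, recall, and F1. It is not the authors' implementation: the term list `KEY_TERMS`, the helper names, and the four toy narratives are all hypothetical and invented for illustration. It also rearranges the standard F1 formula, F1 = 2PR / (P + R), to show what precision is implied by a recall of 1.0 and an F1 of 0.65, assuming the standard definition and ignoring rounding in the reported figures.

```python
import re

# Hypothetical key terms; the study's actual term list is not given in the abstract.
KEY_TERMS = ["gun", "firearm", "pistol", "rifle", "shotgun", "handgun"]
PATTERN = re.compile(r"\b(" + "|".join(KEY_TERMS) + r")s?\b", re.IGNORECASE)


def flag_firearm(narrative: str) -> bool:
    """Return True if any key term appears as a whole word in the narrative."""
    return PATTERN.search(narrative) is not None


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (True = firearm-related)."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Invented toy narratives with hand-assigned labels (not study data).
DOCS = [
    ("The defendant pointed a handgun at the victim.", True),
    ("The defendant struck the victim with a bat.", False),
    ("Officers found no gun at the scene; the victim was punched.", False),
    ("The suspect fired a rifle toward the house.", True),
]

y_true = [label for _, label in DOCS]
y_pred = [flag_firearm(text) for text, _ in DOCS]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

print(f"toy data: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"precision implied by F1=0.65, recall=1.0: {implied_precision(0.65, 1.0):.2f}")
```

In the toy data, the third narrative mentions a gun only to say that none was found, so the key term search flags it incorrectly. That is the kind of false positive that lets a term list reach perfect recall while its F1 stays modest; a supervised classifier trained on labeled narratives can trade some recall for better precision.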
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.