Machine Learning for Public Good: Predicting Urban Crime Patterns to Enhance Community Safety
Sia Gupta, Simeon Sayer
arXiv:2409.10838 (arXiv - CS - Machine Learning), published 2024-09-17
Abstract
In recent years, urban safety has become a paramount concern for city
planners and law enforcement agencies. Accurate prediction of likely crime
occurrences can significantly enhance preventive measures and resource
allocation. However, many law enforcement departments lack the tools to analyze
and apply advanced AI and ML techniques that could help city planners, watch
programs, and safety leaders take proactive steps towards overall community
safety. This paper explores the effectiveness of ML techniques in predicting spatial and
temporal patterns of crime in urban areas. Using police dispatch call
data from San Jose, CA, the research aims to categorize calls into priority
levels with a high degree of accuracy, particularly for the more
dangerous situations that require an immediate law enforcement response. This
categorization is informed by the time, place, and nature of the call. The
research steps include data extraction, preprocessing, feature engineering,
exploratory data analysis, and the implementation, optimization, and tuning of
several supervised machine learning models and neural networks. Accuracy and
precision are examined for different models and features at varying granularities
of crime category and location precision. The results demonstrate that, when
compared to a variety of other models,
Random Forest classification models are most effective in identifying dangerous
situations and their corresponding priority levels with high accuracy (Accuracy
= 85%, AUC = 0.92) at a local level while keeping false negatives to a
minimum. While further research and data gathering are needed to include other
social and economic factors, these results provide valuable insights for law
enforcement agencies to optimize resources, develop proactive deployment
approaches, and adjust response patterns to enhance overall public safety
outcomes in an unbiased way.
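
The abstract reports accuracy, AUC, and a low number of false negatives. As a minimal sketch of how these three metrics relate, the Python below computes them from toy data. The labels, scores, thresholds, and function names are all hypothetical and are not taken from the paper; the abstract also does not say how false negatives were kept low, so the threshold adjustment shown here is only one common approach.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), where label 1 means a high-priority call."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of calls whose predicted label matches the true label."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly high-priority calls that were predicted low-priority."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return fn / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(scores: Sequence[float], threshold: float) -> list:
    """Label a call high-priority (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical toy data: 1 = high-priority call, scores = model confidence.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

default_pred = predict(scores, threshold=0.5)
cautious_pred = predict(scores, threshold=0.25)

print("AUC:", roc_auc(y_true, scores))
print("threshold 0.50:", accuracy(y_true, default_pred), false_negative_rate(y_true, default_pred))
print("threshold 0.25:", accuracy(y_true, cautious_pred), false_negative_rate(y_true, cautious_pred))
```

On this toy data, lowering the threshold from 0.5 to 0.25 removes the one missed high-priority call at the cost of one extra false alarm, while AUC is unchanged because it does not depend on the threshold.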