Diagnosis of Crime Rate against Women using k-fold Cross Validation through Machine Learning

P. Tamilarasi, R. Rani
2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), published 2020-03-01
DOI: 10.1109/ICCMC48092.2020.ICCMC-000193 (https://doi.org/10.1109/ICCMC48092.2020.ICCMC-000193)
Citations: 23

Abstract

Crime against women has become a major problem in our nation. Many countries are working continuously to control this offence, and its prevention is an essential task. In recent years, crimes against women have increased significantly. The Indian government is currently showing interest in addressing this problem and is giving more importance to the development of our society. Every year a large amount of data is generated from crime reporting. This data can be very useful for assessing and predicting crime, and can to some degree help us stop it. Data analysis is the process of examining, cleansing, transforming and modelling data with the goal of establishing useful information, reporting conclusions and supporting decision-making. Feature scaling is one of the most important techniques for standardizing the independent features so that the data falls in a fixed range. It is performed during data pre-processing. K-fold cross-validation is a re-sampling method used to evaluate machine learning models on a limited sample of data. It is a common strategy because it is easy to understand and usually yields an estimate of model skill that is less biased or less optimistic than other approaches, such as a simple train/test split. Machine learning plays a large part in data processing. This paper considers six machine learning algorithms, namely KNN, decision trees, Naïve Bayes, linear regression, CART (Classification and Regression Tree) and SVM, applied to crime data using similar features. The algorithms are compared on accuracy. The main objective of this research is to evaluate the efficacy and applicability of machine learning algorithms in data analytics.
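
The abstract describes feature scaling at pre-processing time followed by k-fold cross-validation of a classifier. The following is a minimal sketch of that workflow, not the authors' implementation. It uses only the Python standard library. All function names (`fit_min_max`, `apply_min_max`, `k_fold_indices`, `knn_predict`, `cross_validate`) are hypothetical, min-max scaling is assumed as the scaling method, KNN stands in for any of the six algorithms, and the data is synthetic rather than the crime data used in the paper.

```python
import random
from collections import Counter


def fit_min_max(rows):
    """Learn per-column (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, bounds):
    """Rescale features with fitted bounds; training rows land in [0, 1]."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Majority vote among the nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, k=5, seed=0):
    """Return the accuracy measured on each of the k held-out folds."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        # Fit the scaler on the training folds only, so that no information
        # from the held-out fold leaks into pre-processing.
        bounds = fit_min_max([X[j] for j in train_idx])
        train_X = apply_min_max([X[j] for j in train_idx], bounds)
        test_X = apply_min_max([X[j] for j in test_idx], bounds)
        train_y = [y[j] for j in train_idx]
        hits = sum(
            knn_predict(train_X, train_y, x) == y[j]
            for x, j in zip(test_X, test_idx)
        )
        scores.append(hits / len(test_idx))
    return scores


# Synthetic, well-separated toy data (an assumption, not the paper's dataset).
# The two features deliberately have very different ranges.
rng = random.Random(42)
X = [[rng.uniform(0, 1), rng.uniform(0, 100)] for _ in range(20)] + \
    [[rng.uniform(5, 6), rng.uniform(500, 600)] for _ in range(20)]
y = [0] * 20 + [1] * 20

scores = cross_validate(X, y, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Each sample is held out exactly once, so the mean of the fold accuracies uses all of the data for both training and testing. Comparing several algorithms as the abstract describes would amount to swapping `knn_predict` for another classifier while keeping the same folds.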