Diagnosis of Crime Rate against Women using k-fold Cross Validation through Machine Learning
P. Tamilarasi, R. Rani
2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), published 2020-03-01
DOI: 10.1109/ICCMC48092.2020.ICCMC-000193 (https://doi.org/10.1109/ICCMC48092.2020.ICCMC-000193)
Citations: 23
Abstract
Crime against women has become a major problem in our nation. Many countries are working continuously to control such offences, and prevention is an essential task. In recent years, crimes against women have increased significantly. The Indian government is currently showing interest in addressing this problem and is giving greater importance to the development of society. Every year, a huge amount of data is generated from crime reporting. This data can be very useful for assessing and predicting crime, and can help to prevent it to some degree. Data analysis is the process of examining, cleansing, transforming and modelling data with the goal of discovering useful information, reporting conclusions and supporting decision-making. Feature scaling is one of the most important techniques for standardizing the independent features so that the data fall within a fixed range; it is performed during data pre-processing. K-fold cross-validation is a re-sampling method used to evaluate machine learning models on a limited sample of data. It is a common strategy because it is easy to understand and usually yields an estimate of model skill that is less biased or less optimistic than other approaches, such as a simple train/test split. Machine learning plays a large part in data processing. This paper applies six machine learning algorithms, namely KNN, decision trees, Naïve Bayes, Linear Regression, CART (Classification and Regression Tree) and SVM, to crime data using the same features. The algorithms are compared on accuracy. The main objective of this research is to evaluate the efficacy and applicability of machine learning algorithms in data analytics.
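The two techniques named in the abstract, feature scaling and k-fold cross-validation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scaling method, the number of folds, or the dataset. The sketch assumes min-max scaling, contiguous unshuffled folds, and a small KNN classifier. All function names and the toy data are hypothetical. The scaling bounds are learned from the training fold only, a common precaution against leaking test information that the abstract does not mention.

```python
from collections import Counter


def fit_min_max(rows):
    """Learn per-feature minima and maxima from the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Rescale features with previously learned bounds.

    Rows used to learn the bounds land in [0, 1]; unseen rows may fall
    slightly outside that range. Constant features map to 0.0.
    """
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous (train, test) pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the closest training rows (squared Euclidean distance)."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(x, y, k=5, n_neighbors=3):
    """Return one accuracy score per fold, scaling inside each fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(x), k):
        lows, highs = fit_min_max([x[i] for i in train_idx])
        train_x = apply_min_max([x[i] for i in train_idx], lows, highs)
        test_x = apply_min_max([x[i] for i in test_idx], lows, highs)
        train_y = [y[i] for i in train_idx]
        hits = sum(
            knn_predict(train_x, train_y, q, n_neighbors) == y[i]
            for q, i in zip(test_x, test_idx)
        )
        scores.append(hits / len(test_idx))
    return scores


# Synthetic toy data (NOT from the paper): two features on very different
# scales, with classes interleaved so every contiguous fold has both labels.
toy_x = [
    [1.0, 200.0], [9.0, 900.0], [2.0, 250.0], [8.0, 850.0], [1.5, 300.0],
    [9.5, 950.0], [2.5, 220.0], [8.5, 800.0], [1.2, 280.0], [9.2, 880.0],
]
toy_y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

fold_scores = cross_val_accuracy(toy_x, toy_y, k=5, n_neighbors=3)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_accuracy)
```

Averaging the per-fold scores gives a single accuracy estimate that depends less on any one train/test split. With real crime records the rows would normally be shuffled before splitting, and the same folds would be reused for each of the six algorithms so that their accuracies are comparable.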