Machine Learning Techniques Applied To Bangla Crime News Classification
Authors: Nusrat Islam, Rokeya Siddiqua, S. Momen
Published in: 2022 IEEE 2nd Conference on Information Technology and Data Science (CITDS)
Publication date: 2022-05-16
DOI: 10.1109/CITDS54976.2022.9914240
URL: https://doi.org/10.1109/CITDS54976.2022.9914240
Citation count: 0
Abstract
Crime analysis and prediction is the systematic approach to detecting crime, classifying crime patterns, and predicting crime trends. Crime is inherently unpredictable and socially disruptive. As the population of Bangladesh grows, the incidence of crime is also increasing, which harms society in various ways. Crime data analysis has therefore become essential for predicting future crime types. In this paper, six machine learning algorithms were used to classify crime news. Crime news articles were collected from online Bangla newspapers and TV channels using a web scraper. To extract features (important words), two feature extractors were used, CountVectorizer and TfidfVectorizer, where the CountVectorizer came from a well-known pre-trained Python package named BnVec. The Logistic Regression and SVM models achieved accuracies of 87.69% and 86.09%, respectively. In addition, Logistic Regression produced fewer false negatives, with 86.65% recall and an 86.58% F1-score. This research has the potential to be used to prevent crime and to apprehend, investigate, and prosecute criminals.
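The link the abstract draws between false negatives, recall, and F1-score can be made concrete with a small sketch. Recall for a class is TP / (TP + FN), so fewer false negatives means higher recall, and F1 is the harmonic mean of precision and recall. The code below is only an illustration under stated assumptions: the abstract does not say how the per-class scores were averaged, so macro-averaging is assumed here, and the labels and predictions are hypothetical, not data from the paper.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f1) over classes (assumed scheme)."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical crime-type labels, for illustration only.
y_true = ["murder", "murder", "theft", "theft", "fraud", "fraud"]
y_pred = ["murder", "theft", "theft", "theft", "fraud", "murder"]

metrics = per_class_metrics(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(metrics)

for label, (p, r, f1) in metrics.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro: precision={macro_precision:.2f} "
      f"recall={macro_recall:.2f} f1={macro_f1:.2f}")
```

In this toy data the "theft" class has no false negatives, so its recall is 1.0, while "murder" and "fraud" each miss one of two articles and get a recall of 0.5.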