CrimeProfiler: crime information extraction and visualization from news media
Authors: Tirthankar Dasgupta, Abir Naskar, Rupsa Saha, Lipika Dey
Published in: Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, 2017-08-23
DOI: 10.1145/3106426.3106476 (https://doi.org/10.1145/3106426.3106476)
Citations: 16
Abstract
News articles from different sources regularly report crime incidents, including details of the crime, information about the accused, details of the investigation process, and finally details of the judgement. In this paper, we propose natural language processing techniques for extracting and curating crime-related information from digitally published news articles. We use computational-linguistics-based methods to analyse crime-related news documents and extract crime-related entities and events: the name of the criminal, the name of the victim, the nature of the crime, the geographic location, the date and time, and the action taken against the criminal. We also propose a semi-supervised learning technique to learn different categories of crime events from the news documents, which supports continuous evolution of the crime dictionaries. The proposed methods are thus not restricted to detecting known crimes but actively help maintain an up-to-date crime dictionary. We ran experiments on a collection of 3000 crime-reporting news articles. The end product of our experiments is a crime register that contains details of crimes committed across geographies and time. This register can be further used for analytical and reporting purposes.
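
The abstract does not specify how the crime dictionary is grown or how register entries are stored, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical seed dictionary (`SEED_CRIMES`), a hypothetical set of cue phrases (`CUE_PATTERN`), and a hypothetical `CrimeRecord` type, and it replaces the paper's semi-supervised learner with a single round of simple pattern-based dictionary expansion.

```python
import re
from dataclasses import dataclass

# Hypothetical seed dictionary; the paper's real dictionaries are not given in the abstract.
SEED_CRIMES = {"murder", "robbery"}

# Hypothetical cue phrases: the single lowercase word after a cue is treated as a crime term.
CUE_PATTERN = re.compile(
    r"\b(?:arrested for|charged with|accused of|convicted of)\s+([a-z]+)"
)


@dataclass
class CrimeRecord:
    """One toy register entry: the matched crime term and its source sentence."""
    crime: str
    sentence: str


def expand_dictionary(sentences, crimes):
    """Return the dictionary plus every word that directly follows a cue phrase."""
    learned = set(crimes)
    for sentence in sentences:
        for match in CUE_PATTERN.finditer(sentence.lower()):
            learned.add(match.group(1))
    return learned


def build_register(sentences, crimes):
    """Emit one record per (sentence, dictionary term) pair found in the text."""
    records = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for crime in sorted(crimes & words):
            records.append(CrimeRecord(crime=crime, sentence=sentence))
    return records


# Made-up example sentences, used only to exercise the sketch.
sentences = [
    "Police said the man was arrested for robbery on Monday.",
    "The suspect was charged with arson after the warehouse fire.",
    "Investigators linked the arson to an earlier dispute.",
]

crimes = expand_dictionary(sentences, SEED_CRIMES)
register = build_register(sentences, crimes)
```

In this toy run, "arson" is not in the seed dictionary but is picked up from the cue phrase in the second sentence, which then lets the third sentence be registered as well. A realistic system would also need to extract the other fields the abstract lists (criminal, victim, location, date and time, action taken) and to filter out false positives from the cue patterns.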