Evaluating Document Analysis with kNN Based Approaches in Judicial Offices of Bangladesh

2018 Second International Conference on Computing Methodologies and Communication (ICCMC) · Pub Date: 2018-02-01 · DOI: 10.1109/ICCMC.2018.8487847

Md. Aminul Islam, Md. Jahidul Haque

Pages: 646-650

Citations: 3

Abstract




In the current era of artificial intelligence, machine learning (ML) algorithms are receiving significant attention for textual analysis. In recent years, various corporate sectors of Bangladesh have improved their operations by digitizing process flows instead of relying on manual paper trails in offices. The judicial sector is now included in the state-wide digitization process through the archiving of judiciary records. Despite this progress, automatic categorization of documents using textual analysis is not yet used to assign the correct class to a judicial document; officers spend a great deal of time manually labeling court-related documents. In the present investigation, we present a textual analysis tool as a first step toward solving the manual categorization problem within the judicial sector of Bangladesh. Our objective is to use an ML algorithm to assign a normalized text document to a suitable class defined by case type. In addition, natural language processing (NLP) techniques provide grammatical analysis of English documents, and a TF-IDF based term weighting scheme filters the feature sets. The outcomes show that NLP techniques have an important impact on generating useful training data for the kNN classification algorithm when categorizing English documents in the Bangladeshi judiciary sector.
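The abstract names the building blocks (text normalization, TF-IDF term weighting, kNN classification by case type) but gives no implementation details. The following is a minimal sketch of how such a pipeline could fit together, not the authors' method. All function names, the regex tokenizer standing in for the NLP preprocessing, the smoothed IDF variant, the cosine similarity measure, the value k=3, and the toy "criminal"/"civil" documents are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a crude stand-in for normalization)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(tokenized_docs):
    """Smoothed inverse document frequency for every term in the training set."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(text, train_vectors, train_labels, idf, k=3):
    """Majority vote among the k training documents most similar to `text`."""
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy data for illustration; not taken from the paper's corpus.
TRAIN = [
    ("The accused was charged with theft and robbery", "criminal"),
    ("Police arrested the accused for murder", "criminal"),
    ("The accused sought bail in the robbery case", "criminal"),
    ("The plaintiff claims ownership of the land", "civil"),
    ("Dispute over a breach of the tenancy contract", "civil"),
    ("The plaintiff seeks damages for breach of contract", "civil"),
]

TOKENIZED = [tokenize(text) for text, _ in TRAIN]
IDF = fit_idf(TOKENIZED)
VECTORS = [tfidf_vector(tokens, IDF) for tokens in TOKENIZED]
LABELS = [label for _, label in TRAIN]


def classify(text, k=3):
    """Label a new document with the toy training set above."""
    return knn_predict(text, VECTORS, LABELS, IDF, k)
```

Calling `classify("The accused was arrested for robbery")` votes over the three nearest training documents and returns their majority label. In this sketch TF-IDF only down-weights terms that appear in many documents; an actual feature-filtering step, as the abstract describes, would additionally discard low-weight terms before the neighbor search.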

