Analysis of Machine-Based Learning Algorithm Used in Named Entity Recognition

Informing Science Pub Date : 2023-01-01 DOI:10.28945/5073

F. M. Kamau, Kennedy Ogada, Cheruiyot W. Kipruto

Pages: 69-84. URL: https://doi.org/10.28945/5073

Abstract

Aim/Purpose: The volume of published information has grown dramatically. Managing information that expands at this rate depends on information extraction technology that can turn unstructured data into structured data that computers can interpret and process.

Background: The primary goal of named entity recognition (NER) is to extract named entities from unstructured text and assign them to predefined semantic classes.

Methodology: We analyze various machine learning algorithms and implement k-nearest neighbors (K-NN), which is widely used in machine learning and remains one of the most popular methods for classifying data.

Contribution: To the best of the researchers' knowledge, no published study has presented named entity recognition for the Kikuyu language using a machine learning algorithm. This research fills that gap by recognizing entities in Kikuyu text.

Findings: The system was evaluated using precision, recall, and F-measure. The experimental results show that K-NN classifies entities effectively.

Recommendation for Researchers: With enough training data, researchers could run an experiment and check whether the learning curve reaches accuracy comparable to state-of-the-art NER.

Future Research: Future studies may apply unsupervised and semi-supervised learning algorithms to other resource-scarce languages.
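
The K-NN classification and the precision, recall, and F-measure evaluation mentioned in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not give the paper's feature set, label scheme, value of k, or data, so the `features` function, the `ENT`/`O` labels, `k=3`, and the toy tokens below are hypothetical.

```python
import math
from collections import Counter


def features(token, position):
    """Map a token to a small numeric vector.

    Hypothetical features chosen for illustration only; the abstract
    does not describe the features used in the paper.
    """
    return (
        1.0 if token[:1].isupper() else 0.0,          # starts with a capital
        1.0 if position == 0 else 0.0,                 # first token in sentence
        1.0 if any(c.isdigit() for c in token) else 0.0,  # contains a digit
        min(len(token), 10) / 10.0,                    # capped, scaled length
    )


def knn_predict(train, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F-measure (F1) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy (token, position in sentence, label) triples, NOT data from the paper.
train_rows = [
    ("Nairobi", 2, "ENT"), ("Kamau", 1, "ENT"), ("Nyeri", 3, "ENT"),
    ("the", 0, "O"), ("market", 4, "O"), ("went", 2, "O"),
    ("The", 0, "O"), ("to", 3, "O"),
]
test_rows = [
    ("Kenya", 3, "ENT"), ("river", 2, "O"), ("Mombasa", 1, "ENT"),
    ("She", 0, "O"), ("There", 0, "O"),
]

train = [(features(tok, pos), label) for tok, pos, label in train_rows]
gold = [label for _, _, label in test_rows]
predicted = [knn_predict(train, features(tok, pos)) for tok, pos, _ in test_rows]

precision, recall, f1 = precision_recall_f1(gold, predicted, positive="ENT")
print(predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On this toy data the sentence-initial token "There" is mislabeled as an entity because its two nearest capitalized neighbors outvote the single sentence-initial non-entity, so precision drops to 2/3 while recall stays at 1. This kind of capitalization-driven error is one reason the choice of features and training data size matters for a K-NN tagger.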

Source journal

Informing Science Social Sciences-Library and Information Sciences

CiteScore: 1.60

Self-citation rate: 0.00%

Journal description: The peer-refereed academic journal Informing Science seeks to provide an understanding of the complexities of informing clientele. Fields ranging from information systems, library science, and journalism in all its forms to education all contribute to this science. These fields, which developed independently and have been researched in separate disciplines, are evolving to form a new transdiscipline, Informing Science.