Online Incremental Learning Based on Crowdsourcing for Indonesian Ontology Relation Extraction

Inteligencia Artificial-Iberoamerican Journal of Artificial Intelligence (IF 3.4, Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE). Pub Date: 2023-09-06. DOI: 10.4114/intartif.vol26iss72pp124-136

Eunike Andriani Kardinata, Nur Aini Rakhmawati


Citations: 0

Abstract




An ontology is a structured representation of knowledge. Ontologies are widely used and developed in information retrieval because they represent knowledge in a form that both machines and humans can understand. As ontologies grow in scale and complexity, identifying extra-logical errors becomes more challenging. Ontology development methods mostly rely on machine learning, which risks missing extra-logical errors. To address this, crowdsourcing is used, i.e., a large job is divided into several small jobs that are completed by a crowd of hired workers. To take advantage of crowdsourcing, data processing that is usually done offline in batches is converted into an online, incremental process. Online incremental learning updates the model immediately after each change while ensuring that previously acquired knowledge is retained. This study built an interactive medium that presents the initial relationship between concept pairs. Crowdsourcing participants were asked to validate the relationships repeatedly until a specified accuracy value was reached. The study found that the crowdsourcing process improved the model used for relation extraction, raising its F1-score from 87.2% to 89.8%. Improvements made through crowdsourcing achieved the same result as improvements made by experts; thus, crowdsourcing can correct extra-logical errors as well as an expert can. It was also found that offline incremental learning using Random Forest yielded higher model accuracy than online incremental learning using Mondrian Forest: the Random Forest model reached a final accuracy of 90.6%, whereas the Mondrian Forest model reached 89.7%. From these results, it was concluded that online incremental learning did not provide better results than offline incremental learning for improving the meronymy relation extraction process.
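The validation-and-scoring idea in the abstract can be illustrated with a minimal sketch. It assumes that crowd answers for each concept pair are combined by simple majority vote and that the result is scored with the standard F1 formula; the abstract does not state how votes were aggregated, so the aggregation rule, the function names, and the concept pairs below are all hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common vote and the share of voters who chose it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def f1_score(gold, predicted, positive=True):
    """Standard F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: does the second concept stand in a part-of (meronymy)
# relation to the first? Each pair was judged by three crowd workers.
crowd_votes = {
    ("mobil", "roda"): [True, True, False],
    ("rumah", "pintu"): [True, True, True],
    ("kucing", "meja"): [False, False, True],
    ("pohon", "daun"): [True, False, False],
}
# Hypothetical expert (gold) judgments for the same pairs.
gold_labels = {
    ("mobil", "roda"): True,
    ("rumah", "pintu"): True,
    ("kucing", "meja"): False,
    ("pohon", "daun"): True,
}

aggregated = {pair: majority_label(v)[0] for pair, v in crowd_votes.items()}
pairs = list(gold_labels)
score = f1_score([gold_labels[p] for p in pairs], [aggregated[p] for p in pairs])
print(f"F1 of aggregated crowd labels against gold: {score:.3f}")
```

In a loop of the kind the abstract describes, labels aggregated this way would be fed back to retrain the model, and validation rounds would repeat until the score reaches the chosen threshold.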


Source journal

Inteligencia Artificial-Iberoamerican Journal of Artificial Intelligence (category: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 2.00

Self-citation rate: 0.00%

Publication volume: (not listed)

Review time: 8 weeks

About the journal: Inteligencia Artificial is a quarterly journal promoted and sponsored by the Spanish Association for Artificial Intelligence. The journal publishes high-quality original research papers reporting theoretical or applied advances in all branches of Artificial Intelligence. In particular, the journal welcomes:

- New approaches, techniques or methods to solve AI problems, which should include demonstrations of effectiveness or improvement over existing methods. These demonstrations must be reproducible.
- Integration of different technologies or approaches to solve wide problems or problems belonging to different areas.
- AI applications, which should describe in detail the problem or the scenario and the proposed solution, emphasizing its novelty, and present an evaluation of the AI techniques that are applied.

In addition to rapid publication and dissemination of unsolicited contributions, the journal is also committed to producing monographs, surveys or special issues on topics, methods or techniques of special relevance to the AI community. Inteligencia Artificial welcomes submissions written in English, Spanish or Portuguese. However, each contribution should include at least a title, summary and keywords in English.