Name entity recognition using inductive logic programming
H. T. Le, Thien Huu Nguyen
Proceedings of the 1st Symposium on Information and Communication Technology, 2010-08-27
DOI: 10.1145/1852611.1852626 (https://doi.org/10.1145/1852611.1852626)
Named entity recognition (NER) is the process of locating atomic elements in text and classifying them into predefined categories such as the names of persons, organizations, and locations, and expressions of time, quantity, and percentage. NER is useful in other natural language tasks such as question answering, text summarization, and building the semantic web. This paper presents a system, called BKIE, that uses SRV, an inductive logic programming algorithm, to extract named entities from Vietnamese text. New predicates and features are added to SRV to handle characteristics of the Vietnamese language, and several strategies are proposed to improve the efficiency of the SRV algorithm. The data set used in the experiments consists of 80 Vietnamese-language homepages of scientists, tagged manually. The best F-score obtained in the experiments is 83%, for extracting the "name" entity. This shows that SRV is an efficient NER algorithm, given its advantages of generality and flexibility. Our future work includes (i) building a larger set of training data to improve the system's performance; (ii) implementing BKIE using parallel programming to increase system efficiency; and (iii) testing BKIE on other application domains to obtain a more accurate evaluation of the system.
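For readers unfamiliar with the F-score reported above, the following is a minimal sketch of how precision, recall, and F-score are commonly computed for entity extraction. It assumes exact-match scoring over labeled spans; the abstract does not say how BKIE was scored, and the `Span` layout, the `f_score` function, and the toy data are hypothetical illustrations, not taken from the paper.

```python
from typing import Set, Tuple

# Hypothetical span layout: (doc_id, start_token, end_token, label).
Span = Tuple[str, int, int, str]


def f_score(gold: Set[Span], predicted: Set[Span],
            beta: float = 1.0) -> Tuple[float, float, float]:
    """Return (precision, recall, F_beta) under exact-match scoring.

    A predicted span counts as correct only if an identical span
    (same document, boundaries, and label) appears in the gold set.
    """
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Toy data, invented purely for illustration.
gold = {("d1", 0, 2, "name"), ("d1", 5, 6, "name"), ("d2", 1, 3, "name")}
pred = {("d1", 0, 2, "name"), ("d2", 1, 3, "name"),
        ("d2", 7, 8, "name"), ("d2", 9, 10, "name")}

precision, recall, f1 = f_score(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With `beta = 1.0` the score is the harmonic mean of precision and recall, which is the usual reading of an unqualified "F-score"; whether the paper used this weighting is an assumption here.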