Fatima Riaz, Muhammad Waqas Anwar, Humaira Muqades
Published in: 2020 International Conference on Engineering and Emerging Technologies (ICEET), 2020-02-01. DOI: 10.1109/ICEET48479.2020.9048203
Maximum Entropy based Urdu Named Entity Recognition
Urdu is widely spoken and is a national language of Pakistan. It draws on a wide variety of other languages and is therefore known as “Lashkari Zuban” (a mixture of different languages). We performed experiments on Urdu Named Entity Recognition (NER) using a model-based approach. NER is the task of identifying and classifying named entities in a given text, such as names, places, organizations, and times/dates. The task plays an important role in automated systems, information extraction, machine learning, and artificial intelligence. A lot of work has been done for European languages, but for South Asian languages the task is still in its development stage. We chose Urdu because it is our national language, yet it still poses many challenges: the language has very limited resources and is also a free-structured language. Our research was conducted with a maximum entropy model on the IJCNLP-08 dataset, which is tagged in the IOB (Inside, Outside, Beginning) scheme. Precision, recall, and F-measure are used to evaluate the accuracy of the model.
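The evaluation described above can be illustrated with a minimal sketch that turns IOB tags into entity spans and computes precision, recall, and F-measure over them. This is not the authors' code. It assumes entity-level exact matching (the abstract does not say whether scoring was per entity or per token), and the tag names `B-PER`, `I-PER`, `B-LOC`, and `B-ORG`, along with the function names `iob_to_spans` and `precision_recall_f1`, are hypothetical choices for illustration rather than the IJCNLP-08 tag set.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); layout is an assumption
Span = Tuple[str, int, int]


def iob_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one IOB-tagged sentence."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        # "B-" always opens a span; a stray "I-" is treated leniently as one.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores: a prediction counts only if type and span both match."""
    gold_spans, pred_spans = iob_to_spans(gold), iob_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Calling `precision_recall_f1` with a gold and a predicted tag sequence of the same sentence returns the three scores as a tuple. Exact span matching is the stricter of the common conventions; a token-level variant would compare the tags position by position instead.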