{"title":"Research on Improved TBL Based Japanese NER Post-Processing","authors":"Jing Wang, Dequan Zheng, T. Zhao","doi":"10.1109/ALPIT.2008.109","DOIUrl":"https://doi.org/10.1109/ALPIT.2008.109","url":null,"abstract":"This paper proposes an improved TBL-based post-processing approach for Japanese named entity recognition (NER). First, tuning rules are automatically acquired from Japanese NER results through error-driven learning. The tuning rules are then optimized according to given threshold conditions. After filtering, the rules are used to revise the Japanese NER results. Because it learns domain linguistic knowledge automatically, this approach is well suited to specialized domains, and the learnt rules do not overfit. Experimental results show that high precision can be achieved for Japanese NER.","PeriodicalId":412433,"journal":{"name":"Advanced Language Processing and Web Information Technology","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120924724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Template-Based English-Chinese Translation System Using FOPA and UAMRT","authors":"Yujian Li","doi":"10.1109/ALPIT.2007.22","DOIUrl":"https://doi.org/10.1109/ALPIT.2007.22","url":null,"abstract":"This paper presents a template-based English-Chinese translation system characterized by two key components: the Fast Optimal Parsing Algorithm (FOPA) and the Universal Algorithm of Matching and Replacing Templates (UAMRT). First, FOPA quickly parses an English sentence into an optimal parse tree, or template structure. Second, UAMRT matches each source template against the optimal structure and replaces it with the corresponding target template. The basic idea of the template-based system is to represent all translation knowledge in uniform templates, which can be expressed as context-free grammar rules; the system then translates an English sentence into Chinese with FOPA and UAMRT. System evaluation shows promising results in both quality and speed, and the system may perform better simply by adding more templates, without any other changes.","PeriodicalId":412433,"journal":{"name":"Advanced Language Processing and Web Information Technology","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121138665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subtopic-Focused Sentence Scoring in Multi-document Summarization","authors":"Sujian Li, Weiguang Qu","doi":"10.1109/ALPIT.2007.106","DOIUrl":"https://doi.org/10.1109/ALPIT.2007.106","url":null,"abstract":"In previous work on multi-document summarization, subtopics are seldom considered; summaries are typically extracted with respect to a single topic. In this paper, we propose a subtopic-focused model to score sentences in the extractive summarization task. Unlike supervised methods, it does not require costly manual work to build a training set. Multiple documents are represented as a mixture over subtopics, denoted by term distributions obtained through unsupervised learning. Our method learns the subtopic distribution over sentences via a hierarchical Bayesian model, through which sentences are scored and extracted as the summary. Experiments on the DUC 2006 data show, under ROUGE evaluation, that the proposed method reaches state-of-the-art performance.","PeriodicalId":412433,"journal":{"name":"Advanced Language Processing and Web Information Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127554460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}