Young-Shing Youn, Hye-Jeong Song, Chan-Young Park, Jong-Dae Kim, Yu-Seop Kim
International Journal of Database Theory and Application, pp. 31-42. Published 2017-08-31. DOI: 10.14257/ijdta.2017.10.8.04 (https://doi.org/10.14257/ijdta.2017.10.8.04)
Role Conversion Considering Its Context and Syntactic Property
Semantic Role Labeling (SRL) determines the relationships between predicates and their arguments in a sentence. Determining semantic roles requires a large corpus annotated with those roles. The most widely used semantic corpus today is the Proposition Bank (PropBank), which is semantically annotated over predicate-argument structure. The Korean version of PropBank, however, has not been widely used because it is limited in size and differs from the original English version in usability. To address these problems, we also use another semantically tagged corpus, built by the Sejong Plan, a nationwide Korean corpus construction project. Constructing corpora with the semantic roles defined in PropBank and Sejong is very time-consuming, and each corpus uses its own role set, so a way of converting a role in one set to the corresponding role(s) in the other is needed. In this paper, we propose a method for converting the roles automatically. First, we use the similarity between the noun argument whose new role is sought and the nouns appearing in the example sentences of the candidate roles. Second, we extract the suffix of the argument word and estimate the closeness between that suffix and the candidate roles. Finally, we use the predicate itself, calculating the closeness between the predicate and the candidate roles. The role is then chosen from among the multiple candidate roles on the basis of these three measures. In our experiment, we automatically converted 491 arguments, and about 78% of them agreed with the manually annotated arguments.
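The three-step selection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, how closeness is estimated, or how the three signals are combined, so everything here is an assumption made for illustration: cosine similarity over toy word vectors stands in for noun similarity, relative frequency stands in for closeness, and the three scores are simply summed. The function names, role labels ("agent", "theme"), suffix strings, and all data values are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def noun_score(noun, example_nouns, vectors):
    """Step 1 (assumed form): best similarity between the argument noun
    and the nouns found in a candidate role's example sentences."""
    if noun not in vectors:
        return 0.0
    sims = [cosine(vectors[noun], vectors[n]) for n in example_nouns if n in vectors]
    return max(sims, default=0.0)


def closeness(key, role, counts):
    """Steps 2 and 3 (assumed form): relative frequency of `role`
    among annotated items that share `key` (a suffix or a predicate)."""
    role_counts = counts.get(key, Counter())
    total = sum(role_counts.values())
    return role_counts[role] / total if total else 0.0


def convert_role(noun, suffix, predicate, candidates,
                 examples, vectors, suffix_counts, predicate_counts):
    """Pick the candidate role with the highest summed score.
    Summing the three scores is an assumption, not the paper's rule."""
    def total(role):
        return (noun_score(noun, examples[role], vectors)
                + closeness(suffix, role, suffix_counts)
                + closeness(predicate, role, predicate_counts))
    return max(candidates, key=total)


# Hypothetical toy data, invented for this sketch only.
VECTORS = {
    "student": (1.0, 0.1), "teacher": (0.9, 0.2),
    "book": (0.1, 1.0), "letter": (0.2, 0.9),
}
EXAMPLES = {"agent": ["teacher"], "theme": ["book"]}
SUFFIX_COUNTS = {
    "ga": Counter(agent=8, theme=2),
    "eul": Counter(agent=1, theme=9),
}
PREDICATE_COUNTS = {"read": Counter(agent=5, theme=5)}
CANDIDATES = ["agent", "theme"]

if __name__ == "__main__":
    print(convert_role("student", "ga", "read", CANDIDATES,
                       EXAMPLES, VECTORS, SUFFIX_COUNTS, PREDICATE_COUNTS))
    print(convert_role("letter", "eul", "read", CANDIDATES,
                       EXAMPLES, VECTORS, SUFFIX_COUNTS, PREDICATE_COUNTS))
```

With this toy data the first call selects "agent" and the second selects "theme". In the second case the predicate is equally close to both roles, so the noun and suffix signals decide the outcome, which suggests why combining several signals may help when a single one is ambiguous.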