Distant Supervision for Relation Extraction via Noise Filtering
Jing Chen, Zhiqiang Guo, Jie Yang
2021 13th International Conference on Machine Learning and Computing, 2021-02-26
DOI: 10.1145/3457682.3457743 (https://doi.org/10.1145/3457682.3457743)
Distant supervision is widely used in relation extraction at present, but it is affected by label noise. This noise is introduced artificially by the underlying theory of distant supervision, and it restricts performance during modeling. To address the problem at the sentence level, we model relation extraction with two components: a sentence selector and a relation extractor. The sentence selector, based on reinforcement learning, processes the corpus one entity pair at a time. The training corpus is divided into three parts: selected sentences, discarded sentences, and unlabeled sentences. We introduce intra-class attention and inter-class similarity to capture more semantic information from the training corpus. To filter noisy data more accurately, the model compares the relation extractor's predicted values for the selected and the discarded sentences within each sentence bag. The results show that WPR-RL, the redesigned reinforcement learning algorithm proposed in this study, significantly improves on the existing approach. We also carry out a number of composite tests to examine the impact of each improvement on model performance.
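The selector and extractor interaction described above can be illustrated with a minimal sketch. This is not the WPR-RL algorithm itself, whose details the abstract does not give. The function names `split_bag` and `bag_reward`, the per-sentence keep probabilities, and the reward defined as a difference of mean extractor confidences are all hypothetical assumptions chosen for illustration.

```python
import random
from typing import List, Tuple


def split_bag(keep_probs: List[float], rng: random.Random) -> Tuple[List[int], List[int]]:
    """Sample a keep/discard action for each sentence of one entity-pair bag.

    keep_probs[i] is an assumed selector output: the probability of keeping
    sentence i. Returns (selected indices, discarded indices).
    """
    selected, discarded = [], []
    for i, p in enumerate(keep_probs):
        (selected if rng.random() < p else discarded).append(i)
    return selected, discarded


def mean(values: List[float]) -> float:
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def bag_reward(label_probs: List[float], selected: List[int], discarded: List[int]) -> float:
    """Hypothetical reward for the selector on one bag.

    label_probs[i] is the relation extractor's predicted probability of the
    bag's distant-supervision label for sentence i. The reward is the mean
    confidence on selected sentences minus the mean on discarded ones, so it
    is positive when the selector keeps the sentences the extractor trusts.
    """
    kept = mean([label_probs[i] for i in selected])
    dropped = mean([label_probs[i] for i in discarded])
    return kept - dropped


if __name__ == "__main__":
    rng = random.Random(0)
    keep_probs = [0.9, 0.3, 0.8, 0.2]   # assumed selector outputs
    label_probs = [0.9, 0.2, 0.8, 0.1]  # assumed extractor outputs
    selected, discarded = split_bag(keep_probs, rng)
    print(selected, discarded, bag_reward(label_probs, selected, discarded))
```

In a policy-gradient setup, a reward of this kind would typically scale the log-probability of the sampled keep/discard actions. Whether WPR-RL uses that exact update is not stated in the abstract.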