{"title":"一种用于远程监督关系提取的简单有效的蒸馏框架","authors":"Rui Li, Cheng Yang, Tingwei Li, Sen Su","doi":"10.1145/3503917","DOIUrl":null,"url":null,"abstract":"Relation extraction (RE), an important information extraction task, faced the great challenge brought by limited annotation data. To this end, distant supervision was proposed to automatically label RE data, and thus largely increased the number of annotated instances. Unfortunately, lots of noise relation annotations brought by automatic labeling become a new obstacle. Some recent studies have shown that the teacher-student framework of knowledge distillation can alleviate the interference of noise relation annotations via label softening. Nevertheless, we find that they still suffer from two problems: propagation of inaccurate dark knowledge and constraint of a unified distillation temperature. In this article, we propose a simple and effective Multi-instance Dynamic Temperature Distillation (MiDTD) framework, which is model-agnostic and mainly involves two modules: multi-instance target fusion (MiTF) and dynamic temperature regulation (DTR). MiTF combines the teacher’s predictions for multiple sentences with the same entity pair to amend the inaccurate dark knowledge in each student’s target. DTR allocates alterable distillation temperatures to different training instances to enable the softness of most student’s targets to be regulated to a moderate range. In experiments, we construct three concrete MiDTD instantiations with BERT, PCNN, and BiLSTM-based RE models, and the distilled students significantly outperform their teachers and the state-of-the-art (SOTA) methods.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"8 1","pages":"1 - 32"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"MiDTD: A Simple and Effective Distillation Framework for Distantly Supervised Relation Extraction\",\"authors\":\"Rui Li, Cheng Yang, Tingwei Li, Sen Su\",\"doi\":\"10.1145/3503917\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Relation extraction (RE), an important information extraction task, faced the great challenge brought by limited annotation data. To this end, distant supervision was proposed to automatically label RE data, and thus largely increased the number of annotated instances. Unfortunately, lots of noise relation annotations brought by automatic labeling become a new obstacle. Some recent studies have shown that the teacher-student framework of knowledge distillation can alleviate the interference of noise relation annotations via label softening. Nevertheless, we find that they still suffer from two problems: propagation of inaccurate dark knowledge and constraint of a unified distillation temperature. In this article, we propose a simple and effective Multi-instance Dynamic Temperature Distillation (MiDTD) framework, which is model-agnostic and mainly involves two modules: multi-instance target fusion (MiTF) and dynamic temperature regulation (DTR). MiTF combines the teacher’s predictions for multiple sentences with the same entity pair to amend the inaccurate dark knowledge in each student’s target. DTR allocates alterable distillation temperatures to different training instances to enable the softness of most student’s targets to be regulated to a moderate range. 
In experiments, we construct three concrete MiDTD instantiations with BERT, PCNN, and BiLSTM-based RE models, and the distilled students significantly outperform their teachers and the state-of-the-art (SOTA) methods.\",\"PeriodicalId\":6934,\"journal\":{\"name\":\"ACM Transactions on Information Systems (TOIS)\",\"volume\":\"8 1\",\"pages\":\"1 - 32\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Information Systems (TOIS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3503917\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems (TOIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503917","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
MiDTD: A Simple and Effective Distillation Framework for Distantly Supervised Relation Extraction
Relation extraction (RE), an important information extraction task, faces the great challenge posed by limited annotated data. To this end, distant supervision was proposed to label RE data automatically, largely increasing the number of annotated instances. Unfortunately, the many noisy relation annotations introduced by automatic labeling become a new obstacle. Some recent studies have shown that the teacher-student framework of knowledge distillation can alleviate the interference of noisy relation annotations via label softening. Nevertheless, we find that these methods still suffer from two problems: the propagation of inaccurate dark knowledge and the constraint of a unified distillation temperature. In this article, we propose a simple and effective Multi-instance Dynamic Temperature Distillation (MiDTD) framework, which is model-agnostic and mainly involves two modules: multi-instance target fusion (MiTF) and dynamic temperature regulation (DTR). MiTF combines the teacher's predictions for multiple sentences with the same entity pair to amend the inaccurate dark knowledge in each student target. DTR allocates alterable distillation temperatures to different training instances so that the softness of most student targets is regulated to a moderate range. In experiments, we construct three concrete MiDTD instantiations with BERT-, PCNN-, and BiLSTM-based RE models, and the distilled students significantly outperform their teachers and state-of-the-art (SOTA) methods.
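To make the two modules concrete, below is a minimal, hypothetical PyTorch sketch of how MiTF and DTR could fit together in a distillation objective. The function names, the plain mean as the fusion rule, and treating the per-instance temperatures as given inputs are all assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of MiDTD's two modules (names and fusion rule assumed;
# not the authors' code).
import torch
import torch.nn.functional as F

def fuse_and_soften(teacher_logits: torch.Tensor,
                    temperatures: torch.Tensor) -> torch.Tensor:
    """teacher_logits: (bag_size, num_relations) teacher predictions for all
    sentences sharing one entity pair; temperatures: (bag_size,) per-instance
    distillation temperatures assigned by DTR (taken as given here)."""
    # MiTF: fuse the teacher's predictions over the bag (a simple mean stands
    # in for the paper's fusion rule) so inaccurate dark knowledge in any
    # single sentence-level target is amended by the other instances.
    fused = teacher_logits.mean(dim=0, keepdim=True)             # (1, R)
    # DTR: soften each instance's target with its own temperature rather than
    # one unified temperature, keeping most targets moderately soft.
    return F.softmax(fused / temperatures.unsqueeze(1), dim=-1)  # (bag, R)

def distillation_loss(student_logits: torch.Tensor,
                      soft_targets: torch.Tensor,
                      temperatures: torch.Tensor) -> torch.Tensor:
    # Standard KD objective: KL divergence between the temperature-scaled
    # student distribution and the fused, softened teacher targets.
    log_student = F.log_softmax(student_logits / temperatures.unsqueeze(1),
                                dim=-1)
    return F.kl_div(log_student, soft_targets, reduction="batchmean")
```

In a full training loop this soft-target loss would presumably be combined with the usual cross-entropy on the (noisy) distant labels, which is the standard knowledge-distillation setup the abstract builds on.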