Braden Hancock, Martin Bringmann, Paroma Varma, Percy Liang, Stephanie Wang, Christopher Ré
Training Classifiers with Natural Language Explanations
Proceedings of the conference. Association for Computational Linguistics. Meeting, vol. 2018, pp. 1884-1895. Published 2018-07-01.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6534135/pdf/nihms-993798.pdf
Abstract
Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is then used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
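To make the idea of a labeling function concrete, the following is a minimal Python sketch, not the BabbleLabble implementation. It assumes a spouse-style relation extraction task; the `Candidate` type, the three hand-written labeling functions (standing in for what a semantic parser might produce from explanations such as "because the words 'his wife' appear between the two people"), and the majority-vote aggregation are all hypothetical choices made for illustration. The abstract does not specify how noisy labels are combined.

```python
from typing import Callable, List, NamedTuple

# Label convention assumed for this sketch.
POSITIVE, NEGATIVE, ABSTAIN = 1, -1, 0


class Candidate(NamedTuple):
    """A hypothetical relation candidate: two person mentions and the text between them."""
    person1: str
    person2: str
    between: str


LabelingFunction = Callable[[Candidate], int]


def lf_his_wife(c: Candidate) -> int:
    # Stand-in for a parsed explanation:
    # "True, because the words 'his wife' appear between the two people."
    return POSITIVE if "his wife" in c.between.lower() else ABSTAIN


def lf_married(c: Candidate) -> int:
    # Stand-in for: "True, because 'married' appears between the two people."
    return POSITIVE if "married" in c.between.lower() else ABSTAIN


def lf_sibling(c: Candidate) -> int:
    # Stand-in for: "False, because 'brother' or 'sister' appears between them."
    text = c.between.lower()
    return NEGATIVE if ("brother" in text or "sister" in text) else ABSTAIN


def majority_vote(votes: List[int]) -> int:
    """Combine noisy votes; abstentions count as zero. A simplification assumed here."""
    total = sum(votes)
    if total > 0:
        return POSITIVE
    if total < 0:
        return NEGATIVE
    return ABSTAIN


def label_candidates(
    candidates: List[Candidate], lfs: List[LabelingFunction]
) -> List[int]:
    """Apply every labeling function to every candidate and aggregate the votes."""
    return [majority_vote([lf(c) for lf in lfs]) for c in candidates]


LFS = [lf_his_wife, lf_married, lf_sibling]

unlabeled = [
    Candidate("Tom", "Mary", "and his wife"),
    Candidate("Ann", "Bob", "and her brother"),
    Candidate("Joe", "Sue", "met"),
]

noisy_labels = label_candidates(unlabeled, LFS)
print(noisy_labels)  # [1, -1, 0]
```

Candidates that receive a non-abstain label could then serve as training data for a downstream classifier. Because each function is a cheap heuristic, it can be applied to any number of unlabeled candidates, which is the leverage the abstract describes.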