Snorkel MeTaL: Weak Supervision for Multi-Task Learning.

Proceedings of the Second Workshop on Data Management for End-to-End Machine Learning. Workshop on Data Management for End-to-End Machine Learning (2nd : 2018 : Houston, Tex.) Pub Date : 2018-06-01 DOI:10.1145/3209889.3209898

Alex Ratner, Braden Hancock, Jared Dunnmon, Roger Goldman, Christopher Ré



Abstract

Many real-world machine learning problems are challenging to tackle for two reasons: (i) they involve multiple sub-tasks at different levels of granularity; and (ii) they require large volumes of labeled training data. We propose Snorkel MeTaL, an end-to-end system for multi-task learning that leverages weak supervision provided at multiple levels of granularity by domain expert users. In MeTaL, a user specifies a problem consisting of multiple, hierarchically related sub-tasks (for example, classifying a document at multiple levels of granularity) and then provides labeling functions for each sub-task as weak supervision. MeTaL learns a re-weighted model of these labeling functions and uses the combined signal to train a hierarchical multi-task network that is automatically compiled from the structure of the sub-tasks. Using MeTaL on a radiology report triage task and a fine-grained news classification task, we achieve average gains of 11.2 accuracy points over a baseline supervised approach and 9.5 accuracy points over the predictions of the user-provided labeling functions.
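
The idea of per-sub-task labeling functions combined by a re-weighted model can be illustrated with a small, self-contained Python sketch. This is not the Snorkel MeTaL API or the authors' method: the task names, keyword rules, the `ABSTAIN` convention, and the fixed accuracy values are all hypothetical, and a simple accuracy-weighted vote stands in for the label model that MeTaL learns without ground truth. The training of the hierarchical multi-task network is not shown.

```python
import math
from collections import defaultdict

ABSTAIN = 0  # assumption: a labeling function returns 0 when it has no opinion

# Hypothetical two-level hierarchy for news classification:
#   coarse task: 1 = SPORTS, 2 = BUSINESS
#   fine task (only meaningful under SPORTS): 1 = SOCCER, 2 = TENNIS

def lf_coarse_sports(text):
    return 1 if "match" in text or "league" in text else ABSTAIN

def lf_coarse_business(text):
    return 2 if "shares" in text or "market" in text else ABSTAIN

def lf_fine_soccer(text):
    return 1 if "goal" in text else ABSTAIN

def lf_fine_tennis(text):
    return 2 if "serve" in text else ABSTAIN

# Each labeling function is paired with an assumed accuracy. In this sketch
# the values are fixed by hand; they are not estimated from data.
TASK_LFS = {
    "coarse": [(lf_coarse_sports, 0.8), (lf_coarse_business, 0.7)],
    "fine": [(lf_fine_soccer, 0.9), (lf_fine_tennis, 0.6)],
}

def weighted_vote(text, lfs):
    """Combine non-abstaining votes, weighting each by its log-odds accuracy."""
    scores = defaultdict(float)
    for lf, acc in lfs:
        label = lf(text)
        if label != ABSTAIN:
            scores[label] += math.log(acc / (1.0 - acc))
    if not scores:
        return ABSTAIN
    return max(scores, key=scores.get)

def label_document(text):
    """Produce one weak label per sub-task, respecting the task hierarchy."""
    text = text.lower()
    labels = {task: weighted_vote(text, lfs) for task, lfs in TASK_LFS.items()}
    # The fine-grained task only applies to documents labeled SPORTS.
    if labels["coarse"] != 1:
        labels["fine"] = ABSTAIN
    return labels

if __name__ == "__main__":
    print(label_document("Late goal decides the league match"))
    print(label_document("Shares fell as the market reacted to the goal"))
```

In this sketch the second document triggers the soccer rule, but its fine-grained label is masked because the coarse label is BUSINESS; that masking is one simple way to encode a hierarchical relationship between sub-tasks.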
