Classifying Documents within Multiple Hierarchical Datasets Using Multi-task Learning

2013 IEEE 25th International Conference on Tools with Artificial Intelligence. Pub Date: 2013-11-04. DOI: 10.1109/ICTAI.2013.65

Azad Naik, Anveshi Charuvaka, H. Rangwala


Citations: 8

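Abstract

Multi-task learning (MTL) is a supervised learning paradigm in which the prediction models for several related tasks are learned jointly to achieve better generalization performance. When there are only a few training examples per task, MTL considerably outperforms traditional single-task learning (STL) in terms of prediction accuracy. In this work we develop an MTL-based approach for classifying documents that are archived within two concept hierarchies, namely DMOZ and Wikipedia. We solve the multi-class classification problem by defining one-versus-rest binary classification tasks for each of the classes across the two hierarchical datasets. Instead of learning a linear discriminant for each task independently, we use an MTL approach in which relationships between tasks across the datasets are established using a non-parametric, lazy, nearest-neighbor approach. We also develop and evaluate a transfer learning (TL) approach and compare the MTL and TL methods against standard single-task learning and semi-supervised learning approaches. Our empirical results demonstrate the strength of the developed methods, which show an improvement especially when there are few training examples per classification task.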
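The abstract names three ingredients: one-versus-rest binary tasks, nearest-neighbor matching of tasks across the two datasets, and joint learning of linear discriminants. The sketch below is one possible reading of that pipeline, not the authors' formulation, which the abstract does not specify. It assumes both datasets share a feature space, matches classes by cosine similarity of class centroids, and couples two related logistic models with a squared-difference penalty. All function names and parameters (`class_centroids`, `nearest_tasks`, `train_pair`, `lam`, `coupling`) are hypothetical.

```python
import numpy as np

def one_vs_rest_labels(y, cls):
    """Binary labels (+1 / -1) for the task 'class cls versus the rest'."""
    return np.where(y == cls, 1.0, -1.0)

def class_centroids(X, y):
    """Mean feature vector per class, ordered by sorted class id."""
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def nearest_tasks(centroids_a, centroids_b, k=1):
    """For each class in dataset A, indices of the k most cosine-similar
    classes in dataset B (an assumed way to relate tasks across datasets)."""
    a = centroids_a / np.linalg.norm(centroids_a, axis=1, keepdims=True)
    b = centroids_b / np.linalg.norm(centroids_b, axis=1, keepdims=True)
    sims = a @ b.T
    return np.argsort(-sims, axis=1)[:, :k]

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss log(1 + exp(-y * X @ w))."""
    margins = y * (X @ w)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def train_pair(Xa, ya, Xb, yb, lam=1.0, coupling=1.0, lr=0.1, steps=500):
    """Jointly fit two linear models for a pair of related tasks.

    Objective (assumed): loss_a + loss_b
        + lam/2 * (|wa|^2 + |wb|^2) + coupling/2 * |wa - wb|^2
    The coupling term pulls the weights of related tasks together.
    """
    wa = np.zeros(Xa.shape[1])
    wb = np.zeros(Xb.shape[1])
    for _ in range(steps):
        ga = logistic_grad(wa, Xa, ya) + lam * wa + coupling * (wa - wb)
        gb = logistic_grad(wb, Xb, yb) + lam * wb + coupling * (wb - wa)
        wa -= lr * ga
        wb -= lr * gb
    return wa, wb
```

Setting `coupling=0` recovers independent single-task training, which gives a simple baseline for checking whether sharing helps when one of the two tasks has very few labeled examples.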


