{"title":"Transfer Learning via Cluster Correspondence Inference","authors":"Mingsheng Long, Wei-min Cheng, Xiaoming Jin, Jianmin Wang, Dou Shen","doi":"10.1109/ICDM.2010.146","DOIUrl":null,"url":null,"abstract":"Transfer learning targets to leverage knowledge from one domain for tasks in a new domain. It finds abundant applications, such as text/sentiment classification. Many previous works are based on cluster analysis, which assume some common clusters shared by both domains. They mainly focus on the one-to-one cluster correspondence to bridge different domains. However, such a correspondence scheme might be too strong for real applications where each cluster in one domain corresponds to many clusters in the other domain. In this paper, we propose a Cluster Correspondence Inference (CCI) method to iteratively infer many-to-many correspondence among clusters from different domains. Specifically, word clusters and document clusters are exploited for each domain using nonnegative matrix factorization, then the word clusters from different domains are corresponded in a many-to-many scheme, with the help of shared word space as a bridge. These two steps are run iteratively and label information is transferred from source domain to target domain through the inferred cluster correspondence. Experiments on various real data sets demonstrate that our method outperforms several state-of-the-art approaches for cross-domain text classification.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2010.146","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
Transfer learning targets to leverage knowledge from one domain for tasks in a new domain. It finds abundant applications, such as text/sentiment classification. Many previous works are based on cluster analysis, which assume some common clusters shared by both domains. They mainly focus on the one-to-one cluster correspondence to bridge different domains. However, such a correspondence scheme might be too strong for real applications where each cluster in one domain corresponds to many clusters in the other domain. In this paper, we propose a Cluster Correspondence Inference (CCI) method to iteratively infer many-to-many correspondence among clusters from different domains. Specifically, word clusters and document clusters are exploited for each domain using nonnegative matrix factorization, then the word clusters from different domains are corresponded in a many-to-many scheme, with the help of shared word space as a bridge. These two steps are run iteratively and label information is transferred from source domain to target domain through the inferred cluster correspondence. Experiments on various real data sets demonstrate that our method outperforms several state-of-the-art approaches for cross-domain text classification.