{"title":"Efficient Online Multi-Task Learning via Adaptive Kernel Selection","authors":"Peng Yang, P. Li","doi":"10.1145/3366423.3379993","DOIUrl":null,"url":null,"abstract":"Conventional multi-task model restricts the task structure to be linearly related, which may not be suitable when data is linearly nonseparable. To remedy this issue, we propose a kernel algorithm for online multi-task classification, as the large approximation space provided by reproducing kernel Hilbert spaces often contains an accurate function. Specifically, it maintains a local-global Gaussian distribution over each task model that guides the direction and scale of parameter updates. Nonetheless, optimizing over this space is computationally expensive. Moreover, most multi-task learning methods require accessing to the entire training instances, which is luxury unavailable in the large-scale streaming learning scenario. To overcome this issue, we propose a randomized kernel sampling technique across multiple tasks. Instead of requiring all inputs’ labels, the proposed algorithm determines whether to query a label or not via considering the confidence from the related tasks over label prediction. Theoretically, the algorithm trained on actively sampled labels can achieve a comparable result with one learned on all labels. Empirically, the proposed algorithm is able to achieve promising learning efficacy, while reducing the computational complexity and labeling cost simultaneously.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of The Web Conference 2020","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366423.3379993","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Conventional multi-task models restrict the task structure to be linearly related, which may not be suitable when the data are not linearly separable. To remedy this issue, we propose a kernel algorithm for online multi-task classification, since the large approximation space provided by reproducing kernel Hilbert spaces often contains an accurate function. Specifically, the algorithm maintains a local-global Gaussian distribution over each task model that guides the direction and scale of parameter updates. Nonetheless, optimizing over this space is computationally expensive. Moreover, most multi-task learning methods require access to all training instances, a luxury unavailable in large-scale streaming scenarios. To overcome these issues, we propose a randomized kernel sampling technique that operates across multiple tasks. Instead of requiring labels for all inputs, the proposed algorithm decides whether to query a label by considering the prediction confidence of the related tasks. Theoretically, the algorithm trained on actively sampled labels achieves results comparable to one trained on all labels. Empirically, the proposed algorithm achieves promising learning efficacy while reducing computational complexity and labeling cost simultaneously.
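The computational expense the abstract raises is the usual one for online kernel methods: predictions are sums over stored past instances, so per-step cost grows with the stream. The paper's remedy is its own randomized kernel sampling technique across tasks, whose details are not given in the abstract; the sketch below is a generic stand-in using random Fourier features (Rahimi & Recht, 2007), a standard way to randomize a kernel so each update costs O(D) independent of stream length. The class name and all parameters here are illustrative assumptions, not the paper's method.

```python
import numpy as np

class RandomFourierFeatures:
    """Approximate an RBF kernel k(x, z) = exp(-gamma * ||x - z||^2) with D
    random cosine features, so z(x) . z(y) is an unbiased estimate of k(x, y).
    Illustrative stand-in only; NOT the paper's multi-task sampling scheme."""

    def __init__(self, dim, D=2000, gamma=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        # For the RBF kernel above, frequencies are drawn w ~ N(0, 2*gamma*I)
        # and phases b ~ Uniform[0, 2*pi].
        self.W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(D, dim))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=D)
        self.scale = np.sqrt(2.0 / D)

    def transform(self, x):
        # Map one input to its D-dimensional random feature vector.
        return self.scale * np.cos(self.W @ x + self.b)

# Quick check: the feature inner product tracks the exact kernel value.
rff = RandomFourierFeatures(dim=5, D=2000, gamma=0.5)
x, z = np.ones(5), np.zeros(5)
approx = rff.transform(x) @ rff.transform(z)
exact = np.exp(-0.5 * np.sum((x - z) ** 2))   # about 0.082
```

With such a map, each task model can be kept as a fixed-size weight vector over the random features, which is what makes a kernel method viable in the streaming setting the abstract targets.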
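The abstract's label-querying rule conditions on the prediction confidence of related tasks; since those details are not in the abstract, the following minimal sketch instead uses the classic single-task margin-based selective-sampling rule (query a label with probability b / (b + |margin|)) to illustrate how low confidence triggers a query in an online kernel classifier. The class, the query rule, and all parameters are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # Gaussian RBF kernel; one common choice of RKHS kernel.
    return np.exp(-gamma * np.sum((x - z) ** 2))

class SelectiveKernelPerceptron:
    """Online kernel classifier that queries labels selectively.
    Illustrative single-task sketch, NOT the paper's multi-task method."""

    def __init__(self, b=1.0, gamma=1.0, rng=None):
        self.b = b              # larger b -> more frequent label queries
        self.gamma = gamma
        self.support = []       # stored instances
        self.alphas = []        # their signed coefficients
        self.rng = rng or np.random.default_rng(0)

    def margin(self, x):
        # Kernel expansion over the support set; |margin| acts as confidence.
        return sum(a * rbf_kernel(s, x, self.gamma)
                   for a, s in zip(self.alphas, self.support))

    def step(self, x, label_oracle):
        m = self.margin(x)
        y_hat = 1 if m >= 0 else -1
        # Query the label only with probability b / (b + |m|),
        # i.e. mostly when the prediction is low-confidence.
        if self.rng.random() < self.b / (self.b + abs(m)):
            y = label_oracle(x)            # pay the labeling cost
            if y != y_hat:                 # perceptron update on a mistake
                self.support.append(x)
                self.alphas.append(y)
        return y_hat

# Toy stream: the query rate drops as the model grows confident.
rng = np.random.default_rng(1)
clf = SelectiveKernelPerceptron(b=1.0, gamma=0.5, rng=rng)
oracle = lambda x: 1 if x.sum() > 0 else -1
for _ in range(200):
    clf.step(rng.normal(size=2), oracle)
```

In the paper's multi-task setting, the scalar confidence |m| would be replaced by a measure aggregated from the related tasks, which is what lets confident neighbors spare a task from querying a label.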