Dongmin Park, Hwanjun Song, Minseok Kim, Jae-Gil Lee
{"title":"TRAP: Two-level Regularized Autoencoder-based Embedding for Power-law Distributed Data","authors":"Dongmin Park, Hwanjun Song, Minseok Kim, Jae-Gil Lee","doi":"10.1145/3366423.3380233","DOIUrl":null,"url":null,"abstract":"Recently, autoencoder (AE)-based embedding approaches have achieved state-of-the-art performance in many tasks, especially in top-k recommendation with user embedding or node classification with node embedding. However, we find that many real-world data follow the power-law distribution with respect to the data object sparsity. When learning AE-based embeddings of these data, dense inputs move away from sparse inputs in an embedding space even when they are highly correlated. This phenomenon, which we call polarization, obviously distorts the embedding. In this paper, we propose TRAP that leverages two-level regularizers to effectively alleviate the polarization problem. The macroscopic regularizer generally prevents dense input objects from being distant from other sparse input objects, and the microscopic regularizer individually attracts each object to correlated neighbor objects rather than uncorrelated ones. Importantly, TRAP is a meta-algorithm that can be easily coupled with existing AE-based embedding methods with a simple modification. In extensive experiments on two representative embedding tasks using six-real world datasets, TRAP boosted the performance of the state-of-the-art algorithms by up to 31.53% and 94.99% respectively.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"6 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of The Web Conference 2020","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366423.3380233","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
Recently, autoencoder (AE)-based embedding approaches have achieved state-of-the-art performance in many tasks, especially in top-k recommendation with user embedding or node classification with node embedding. However, we find that many real-world data follow the power-law distribution with respect to the data object sparsity. When learning AE-based embeddings of these data, dense inputs move away from sparse inputs in an embedding space even when they are highly correlated. This phenomenon, which we call polarization, obviously distorts the embedding. In this paper, we propose TRAP that leverages two-level regularizers to effectively alleviate the polarization problem. The macroscopic regularizer generally prevents dense input objects from being distant from other sparse input objects, and the microscopic regularizer individually attracts each object to correlated neighbor objects rather than uncorrelated ones. Importantly, TRAP is a meta-algorithm that can be easily coupled with existing AE-based embedding methods with a simple modification. In extensive experiments on two representative embedding tasks using six-real world datasets, TRAP boosted the performance of the state-of-the-art algorithms by up to 31.53% and 94.99% respectively.