Multi-Party Federated Recommendation Based on Semi-Supervised Learning
Authors: Xin Liu; Jiuluan Lv; Feng Chen; Qingjie Wei; Hangxuan He; Ying Qian
Journal: IEEE Transactions on Big Data, vol. 10, no. 4, pp. 356-370
Publication date: 2023-11-30
DOI: 10.1109/TBDATA.2023.3338009
URL: https://ieeexplore.ieee.org/document/10336386/
Impact factor: 7.5; JCR: Q1 (Computer Science, Information Systems)
Citations: 0
Abstract
Leveraging multi-party data to provide recommendations remains a challenge, particularly when the party in need of recommendation services possesses only positive samples while the other parties have only unlabeled data. To address this UDD-PU learning problem, this paper proposes VFPU, an algorithm for Vertical Federated learning with Positive and Unlabeled data. VFPU repeatedly draws random samples from the multi-party unlabeled data and treats the sampled data as negative. It thereby forms multiple training datasets with balanced positive and negative samples, and multiple testing datasets from the unsampled data. For each training dataset, VFPU iteratively trains a base estimator adapted to the vertical federated learning framework. The trained base estimator then generates a forecast score for each sample in the corresponding testing dataset. From the sum of each unlabeled sample's scores and the number of times it occurs in the testing datasets, we calculate its probability of being positive. The samples with the highest probabilities are regarded as reliable positive samples; they are added to the positive samples and removed from the unlabeled data. This process of sampling, training, and selecting positive samples is repeated iteratively. Experimental results demonstrated that VFPU performed comparably to its non-federated counterparts and outperformed other federated semi-supervised learning methods.
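The sampling, training, scoring, and selection loop described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: there is no federation, encryption, or vertical feature split here, and the centroid-distance scorer is only a stand-in for the paper's federated base estimators. The function names, the parameters (`n_estimators`, `top_k`, `n_iterations`), and the choice to estimate the probability as the sum of scores divided by the number of appearances in testing sets are all assumptions made for illustration.

```python
import math
import random


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_base_estimator(pos, neg):
    """Stand-in base estimator (assumption): compare distances to class centroids."""
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x):
        # Value in (0, 1); higher means x lies closer to the positive centroid.
        return 1.0 / (1.0 + math.exp(math.dist(x, c_pos) - math.dist(x, c_neg)))

    return score


def pu_bagging_round(pos, unl, n_estimators, rng):
    """Estimate, for every unlabeled sample, its probability of being positive."""
    score_sum = [0.0] * len(unl)
    seen = [0] * len(unl)
    # Balanced sampling: as many pseudo-negatives as positives,
    # while leaving at least one unlabeled sample for the testing set.
    k = min(len(pos), len(unl) - 1)
    for _ in range(n_estimators):
        sampled = set(rng.sample(range(len(unl)), k))
        model = fit_base_estimator(pos, [unl[i] for i in sampled])
        for i in range(len(unl)):
            if i not in sampled:  # unsampled data forms the testing set
                score_sum[i] += model(unl[i])
                seen[i] += 1
    # Assumed combination rule: sum of scores / frequency of occurrence.
    return [s / c if c else 0.0 for s, c in zip(score_sum, seen)]


def select_reliable_positives(pos, unl, n_iterations=3, n_estimators=20,
                              top_k=2, seed=0):
    """Repeat sampling, training, and selection; return updated (pos, unl)."""
    rng = random.Random(seed)
    pos, unl = list(pos), list(unl)
    for _ in range(n_iterations):
        if len(unl) <= top_k:
            break
        probs = pu_bagging_round(pos, unl, n_estimators, rng)
        ranked = sorted(range(len(unl)), key=lambda i: probs[i], reverse=True)
        chosen = set(ranked[:top_k])
        pos += [unl[i] for i in chosen]  # add reliable positives
        unl = [x for i, x in enumerate(unl) if i not in chosen]  # and remove them
    return pos, unl
```

On toy data where a few unlabeled points sit near the known positives, one iteration with `top_k` equal to the number of such points should move them into the positive set. In a real vertical federated setting, each party would hold only its own feature columns and the base estimator would be trained through a secure protocol, which this sketch does not attempt.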
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.