Budget-Constrained Ego Network Extraction With Maximized Willingness
Authors: Bay-Yuan Hsu, Chia-Hsun Lu, Ming-Yi Chang, Chih-Ying Tseng, Chih-Ya Shen
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 7692-7707 (Journal Article)
Publication date: 2024-08-20
DOI: 10.1109/TKDE.2024.3446169
URL: https://ieeexplore.ieee.org/document/10640244/
Impact factor: 8.9; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Many large-scale machine learning approaches and graph algorithms have recently been proposed to address a variety of problems in online social networks (OSNs). To evaluate and validate these algorithms and models, data from ego-centric networks (ego networks) are widely used. Effectively extracting large-scale ego networks from OSNs has therefore become an important issue, particularly as privacy policies grow increasingly strict. In this paper, we study the problem of extracting ego network data by jointly considering user willingness, crawling cost, and network structure. We formulate a new research problem, named Structure and Willingness Aware Ego Network Extraction (SWAN), and analyze its NP-hardness. We first propose a $(1-\frac{1}{e})$-approximation algorithm, named Tristar-Optimized Ego Network Identification with Maximum Willingness (TOMW). In addition to this deterministic approximation algorithm, we propose to automatically learn an effective heuristic with machine learning, avoiding the considerable human effort needed to devise a good algorithm by hand. The learning approach is named Willingness-maximized and Structure-aware Ego Network Extraction with Reinforcement Learning (WSRL), in which we propose a novel contrastive learning strategy, named Contrastive Learning with Performance-boosting Graph Augmentation. We recruited 1,810 real-world participants and conducted an evaluation study to validate our problem formulation and proposed approaches. Moreover, experimental results on real social network datasets show that the proposed approaches significantly outperform the baselines.
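To make the three ingredients named in the abstract concrete (user willingness, crawling cost, and network structure), here is a minimal Python sketch of a budget-constrained selection around one ego node. It is not TOMW or WSRL, whose details the abstract does not give, and it carries no approximation guarantee. The function names, the per-user `willingness` and `cost` dictionaries, and the willingness-per-cost greedy rule are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def greedy_ego_extraction(
    adj: Dict[str, Set[str]],
    ego: str,
    willingness: Dict[str, float],
    cost: Dict[str, float],
    budget: float,
) -> Tuple[List[str], float, float]:
    """Pick neighbors of `ego` in order of willingness per unit cost.

    Illustrative heuristic only (hypothetical, not the paper's algorithm).
    Returns (chosen neighbors, total cost spent, total willingness).
    """
    # Highest willingness/cost ratio first; ties broken by node name.
    candidates = sorted(
        adj[ego], key=lambda v: (-willingness[v] / cost[v], v)
    )
    chosen: List[str] = []
    spent = 0.0
    total_w = 0.0
    for v in candidates:
        # Skip any neighbor whose crawling cost would exceed the budget.
        if spent + cost[v] <= budget:
            chosen.append(v)
            spent += cost[v]
            total_w += willingness[v]
    return chosen, spent, total_w


def induced_edges(
    adj: Dict[str, Set[str]], nodes: List[str]
) -> List[Tuple[str, str]]:
    """Edges of the subgraph induced by `nodes`, each undirected edge once."""
    keep = set(nodes)
    return sorted(
        (u, v) for u in keep for v in adj[u] if v in keep and u < v
    )


# Toy undirected graph: "ego" has three neighbors.
adj = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b"},
    "b": {"ego", "a", "c"},
    "c": {"ego", "b"},
}
willingness = {"a": 0.75, "b": 0.5, "c": 0.75}  # hypothetical scores
cost = {"a": 3.0, "b": 1.0, "c": 2.0}           # hypothetical crawl costs

chosen, spent, total_w = greedy_ego_extraction(
    adj, "ego", willingness, cost, budget=4.0
)
ego_net_edges = induced_edges(adj, ["ego"] + chosen)
print(chosen, spent, total_w)
print(ego_net_edges)
```

With a budget of 4.0 the sketch selects "b" and "c" and skips "a", whose cost would push spending over the budget. The structure of the extracted ego network is only reported here through the induced edges; it is not optimized, which is one of the things the SWAN formulation is described as taking into account.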
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.