Pub Date: 2024-12-09, DOI: 10.1109/TKDE.2024.3514323
Jian Chen;Hong Gao;Yuhong Shi;Junle Chen;Donghua Yang;Jianzhong Li
The Maximizing Influence (Max-Inf) query is a fundamental operation in spatial data management. It returns the site from a candidate set that maximizes influence. Existing work commonly focuses on outdoor spaces. In practice, however, people spend up to 87% of their daily life indoors, and outdoor techniques fall short there because of the complicated topology of indoor spaces. In this paper, we formulate two indoor Max-Inf queries that take probability and mobility factors into consideration: the Top-$k$ Probabilistic Influence Query (T$k$PI) and the Collective-$k$ Probabilistic Influence Query (C$k$PI). We propose a novel spatial index, the IT-tree, which uses the properties of indoor venues to facilitate indoor distance computation and then applies a trie to group trajectories with similar check-in partitions, based on their sketch information. This structure is simple but highly effective in pruning the trajectory search space. To process T$k$PI efficiently, we devise subtree pruning and progressive pruning techniques that filter out unnecessary trajectories based on probability bounds and the monotonicity of influence probability. For C$k$PI, which is a submodular NP-hard problem, we provide three approximation algorithms that differ in how they compute the marginal influence value during the search. Through extensive experiments on several real indoor venues, we demonstrate the efficiency and effectiveness of the proposed algorithms.
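To illustrate how a collective-$k$ query differs from a top-$k$ one, here is a minimal sketch of the standard greedy heuristic for monotone submodular maximization. It is not the authors' algorithm and uses neither the IT-tree nor their pruning rules. The influence model is an assumption made for illustration: `prob[s][t]` is a hypothetical probability that site `s` influences trajectory `t`, sites act independently, and the collective influence of a set is the expected number of trajectories influenced by at least one selected site.

```python
from math import prod


def collective_influence(selected, prob):
    """Expected number of trajectories influenced by at least one selected site.

    Assumes (for illustration only) that sites influence a trajectory independently.
    """
    if not selected:
        return 0.0
    n_traj = len(next(iter(prob.values())))
    return sum(
        1.0 - prod(1.0 - prob[s][t] for s in selected)
        for t in range(n_traj)
    )


def greedy_collective_k(prob, k):
    """Pick up to k sites, each time adding the site with the largest marginal gain."""
    selected = []
    remaining = set(prob)
    for _ in range(min(k, len(prob))):
        base = collective_influence(selected, prob)
        # sorted() makes tie-breaking deterministic (alphabetical order of site ids).
        best = max(
            sorted(remaining),
            key=lambda s: collective_influence(selected + [s], prob) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical influence probabilities for three sites over three trajectories.
prob = {
    "A": [0.9, 0.9, 0.0],
    "B": [0.8, 0.8, 0.1],
    "C": [0.0, 0.1, 0.9],
}
print(greedy_collective_k(prob, 2))  # ['A', 'C']
```

Ranked individually, the two strongest sites in this toy input are A and B, but they largely cover the same trajectories. The greedy selection instead pairs A with C, whose marginal gain is larger once A is chosen.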
Maximizing Influence Query Over Indoor Trajectories. IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 3, pp. 1294-1310.
Pub Date: 2024-12-09, DOI: 10.1109/TKDE.2024.3513533
Haipeng Ding;Zhewei Wei;Yuhang Ye
Graph Neural Networks (GNNs) have attracted increasing research attention for their effectiveness on graph mining tasks. However, full-batch training based on stochastic gradient descent (SGD) requires substantial resources, since every computation that needs a gradient must be stored on the acceleration device. This storage bottleneck makes it difficult to train classic GNNs on large-scale datasets within a single acceleration device. Meanwhile, message-passing based (spatial) GNN designs usually rely on the homophily hypothesis, which easily fails on heterophilous graphs. In this paper, we propose a random walk extension for message-passing based GNNs that enriches them with spectral powers. We prove that our random walk sampling with appropriate correction coefficients generates an unbiased approximation of the $K$