Optimal Algorithms for Multiway Search on Partial Orders
Shangqi Lu, W. Martens, Matthias Niewerth, Yufei Tao
Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Published: 2022-06-12
DOI: 10.1145/3517804.3524150 (https://doi.org/10.1145/3517804.3524150)
Citations: 2
Abstract
We study partial order multiway search (POMS), which is a game between an algorithm A and an oracle, played on a directed acyclic graph G known to both parties. First, the oracle picks a vertex t in G called the target. Then, A needs to figure out which vertex is t by probing reachability. Specifically, in each probe, A selects a set Q of vertices in G whose size is bounded by a (pre-agreed) limit; the oracle reveals, for each vertex q ∈ Q, whether q can reach the target in G. The objective of A is to minimize the number of probes. This problem finds use in crowdsourcing, distributed file systems, software testing, etc. We describe an algorithm to solve POMS in O(log_{1+k} n + (d/k) log_{1+d} n) probes, where n is the number of vertices in G, k is the maximum permissible |Q|, and d is the largest out-degree of the vertices in G. We further establish the algorithm's asymptotic optimality by proving a matching lower bound. We also introduce a variant of POMS in the external memory (EM) computation model, which is the key to a black-box approach for converting a class of pointer-machine structures to their I/O-efficient counterparts. In the EM version of POMS, A is allowed to pre-compute a (disk-based) structure on G and is then required to clear its memory. The oracle (as before) picks a target t. A still needs to find t by issuing probes, except that the set Q in each probe must be read from the disk. The objective of A is now to minimize the number of I/Os. We present a structure that uses O(n/B) space and guarantees discovering the target in O(log_B n + (d/B) log_{1+d} n) I/Os, where B is the block size, and n and d are as defined earlier. We establish the structure's asymptotic optimality by proving that any structure demands Ω(log_B n + (d/B) log_{1+d} n) I/Os to find the target in the worst case, regardless of the space consumption.
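
To make the rules of the game concrete, the following is a minimal Python sketch of the POMS setting described above. It is not the algorithm from the paper: `naive_search` is a hypothetical baseline that simply probes every vertex in batches of at most k, using ⌈n/k⌉ probes, far more than the stated bound on most inputs. The names `Oracle`, `reachable_from`, and `naive_search`, and the adjacency-list representation of G, are assumptions made for illustration only.

```python
from typing import Dict, List, Set

# Assumed representation: each vertex maps to its list of out-neighbors.
Graph = Dict[str, List[str]]


def reachable_from(graph: Graph, source: str) -> Set[str]:
    """Return every vertex reachable from `source`, including itself."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


class Oracle:
    """Holds the hidden target and answers reachability probes of size <= k."""

    def __init__(self, graph: Graph, target: str, k: int) -> None:
        self._graph = graph
        self._target = target
        self.k = k
        self.probes = 0  # number of probes issued so far

    def probe(self, query: List[str]) -> Dict[str, bool]:
        """For each q in `query`, report whether q can reach the target."""
        if len(query) > self.k:
            raise ValueError("probe exceeds the agreed limit k")
        self.probes += 1
        return {q: self._target in reachable_from(self._graph, q) for q in query}


def naive_search(graph: Graph, oracle: Oracle) -> str:
    """Hypothetical baseline: probe all vertices, k at a time."""
    vertices = list(graph)
    can_reach: Set[str] = set()
    for i in range(0, len(vertices), oracle.k):
        answers = oracle.probe(vertices[i:i + oracle.k])
        can_reach.update(q for q, yes in answers.items() if yes)
    # In a DAG, the target is the only vertex that reaches the target
    # and has no out-neighbor that also reaches the target.
    sinks = [v for v in can_reach
             if not any(w in can_reach for w in graph[v])]
    if len(sinks) != 1:
        raise RuntimeError("input is not a DAG containing the target")
    return sinks[0]


if __name__ == "__main__":
    g: Graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
    oracle = Oracle(g, target="d", k=2)
    print(naive_search(g, oracle), oracle.probes)
```

The baseline is useful only as a reference point: it ignores the structure of G entirely, whereas the bound in the abstract depends on n, k, and the maximum out-degree d.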