Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.): latest articles
On validation of XML streams using finite state machines
Pub Date : 2004-06-17, Pages : 85-90, DOI: 10.1145/1017074.1017096
Cristiana Chitic, D. Rosu
We study validation of streamed XML documents by means of finite state machines. Previous work has shown that validation is in principle possible with finite state automata, but the construction was prohibitively expensive, giving an exponential-size nondeterministic automaton. Instead, we want to find deterministic automata for validating streamed documents: for them, the complexity of validation is constant per tag. We show that for a reading window of size one and nonrecursive DTDs with one-unambiguous content (i.e., conforming to the current XML standard) there is an algorithm producing a deterministic automaton that validates documents with respect to that DTD. The size of the automaton is at most exponential, and we give matching lower bounds. To capture the possible advantages offered by reading windows of size k, we introduce k-unambiguity as a generalization of one-unambiguity, and study validation against DTDs with k-unambiguous content. We also consider recursive DTDs and give conditions under which documents can be validated against them using one-counter automata.
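As a rough illustration of the "constant per tag" claim in the abstract above, the sketch below validates a tag stream with a hand-built deterministic automaton. The DTD, the state names, and the `validate` function are hypothetical and chosen only for this example; the automaton is written by hand and is not produced by the paper's construction.

```python
# Hypothetical nonrecursive DTD (an assumption, not taken from the paper):
#   <!ELEMENT a (b, c*)>   <!ELEMENT b EMPTY>   <!ELEMENT c EMPTY>
# The transition table is hand-written for this DTD only.
TRANSITIONS = {
    ("start", "<a>"): "a_open",
    ("a_open", "<b>"): "in_b",
    ("in_b", "</b>"): "after_b",
    ("after_b", "<c>"): "in_c",
    ("in_c", "</c>"): "after_b",
    ("after_b", "</a>"): "done",
}
ACCEPTING = {"done"}


def validate(tags):
    """Return True if the stream of tags is valid for the DTD above.

    Each tag costs one dictionary lookup, so the work per tag is constant
    and the stream is read once, left to right, without buffering.
    """
    state = "start"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:  # no transition defined: reject immediately
            return False
    return state in ACCEPTING
```

Because `validate` accepts any iterable, it can also consume a generator that yields tags as they arrive, which matches the streaming setting.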
Content and structure in indexing and ranking XML
Pub Date : 2004-06-17, Pages : 67-72, DOI: 10.1145/1017074.1017092
Felix Weigel, H. Meuss, K. Schulz, François Bry
Rooted in electronic publishing, XML is now widely used for modelling and storing structured text documents. Especially in the WWW, retrieval of XML documents is most useful in combination with a relevance-based ranking of the query result. Index structures with ranking support are therefore needed for fast access to relevant parts of large document collections. This paper proposes a classification scheme for both XML ranking models and index structures, making it possible to determine which index suits which ranking model. An analysis reveals that ranking parameters related to both the content and structure of the data are poorly supported by most known XML indices. The IR-CADG index, owing to its tight integration of content and structure, supports various XML ranking models in a very efficient retrieval process. Experiments show that it outperforms separate content/structure indexing by more than two orders of magnitude for large corpora of several hundred MB.
Mining approximate functional dependencies and concept similarities to answer imprecise queries
Pub Date : 2004-06-17, Pages : 73-78, DOI: 10.1145/1017074.1017093
Ullas Nambiar, S. Kambhampati
Current approaches for answering queries with imprecise constraints require users to provide distance metrics and importance measures for attributes of interest. In this paper we focus on providing a domain- and end-user-independent solution for supporting imprecise queries over Web databases without affecting the underlying database. We propose a query processing framework that integrates techniques from IR and database research to efficiently determine answers for imprecise queries. We mine and use approximate functional dependencies between attributes to create precise queries having tuples relevant to the given imprecise query. An approach to automatically estimate the semantic distances between values of categorical attributes is also proposed. We provide preliminary results showing the utility of our approach.
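To make the notion of an approximate functional dependency in the abstract above more concrete, here is a minimal sketch that scores a candidate dependency with the g3 error measure commonly used in the dependency-mining literature: the smallest fraction of rows that would have to be removed for the dependency to hold exactly. Whether the paper uses this particular measure is an assumption; the function name `g3_error` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Fraction of rows to remove so that lhs -> rhs holds exactly.

    rows: list of dicts; lhs: list of attribute names; rhs: one attribute name.
    """
    if not rows:
        return 0.0
    # Group rows by their lhs values and count the rhs values in each group.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1
    # In each group, keep only the rows carrying the most frequent rhs value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return 1 - kept / len(rows)


# Hypothetical sample data: one of the five rows violates model -> make.
cars = [
    {"model": "Camry", "make": "Toyota"},
    {"model": "Camry", "make": "Toyota"},
    {"model": "Camry", "make": "Toyota"},
    {"model": "Camry", "make": "Honda"},
    {"model": "Civic", "make": "Honda"},
]
error = g3_error(cars, ["model"], "make")  # roughly 0.2
```

A dependency whose error falls below a chosen threshold would be treated as approximately holding; the threshold itself is a tuning choice that this sketch does not fix.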
Querying peer-to-peer networks using P-trees
Pub Date : 2004-06-17, Pages : 25-30, DOI: 10.1145/1017074.1017082
Adina Crainiceanu, Prakash Linga, J. Gehrke, J. Shanmugasundaram
We propose a new distributed, fault-tolerant peer-to-peer index structure called the P-tree. P-trees efficiently evaluate range queries in addition to equality queries.
Adaptive WebView Materialization
Pub Date : 2001-06-28, Pages : 85-90, DOI: 10.21236/ada439848
Alexandros Labrinidis, N. Roussopoulos
Dynamic content generation poses huge resource demands on web servers, creating a scalability problem. WebView Materialization, where web pages are cached and constantly refreshed in the background, has been shown to ameliorate the scalability problem without sacrificing data freshness. In this work we present an adaptive online algorithm that selects which WebViews to materialize and realizes the trade-off between Quality of Service and Quality of Data. Our algorithm performs very close to the static, off-line optimal algorithm and, under rapid workload changes, outperforms it.
Efficient Relational Storage and Retrieval of XML Documents
Pub Date : 2000-05-18, Pages : 137-150, DOI: 10.1007/3-540-45271-0_9
A. Schmidt, M. Kersten, Menzo Windhouwer, F. Waas
Modeling Data Entry and Operations in WebML
Pub Date : 2000-05-18, Pages : 87-92, DOI: 10.1007/3-540-45271-0_13
Aldo Bongio, S. Ceri, P. Fraternali, A. Maurino
XDuce: A Typed XML Processing Language (Preliminary Report)
Pub Date : 2000-05-18, Pages : 226-244, DOI: 10.1007/3-540-45271-0_15
H. Hosoya, B. Pierce
Active Query Caching for Database Web Servers
Pub Date : 2000-05-18, Pages : 29-34, DOI: 10.1007/3-540-45271-0_6
Qiong Luo, J. Naughton, R. Krishnamurthy, P. Cao, Yunrui Li
Quilt: An XML Query Language for Heterogeneous Data Sources
Pub Date : 2000-05-18, Pages : 53-62, DOI: 10.1007/3-540-45271-0_1
D. Chamberlin, J. Robie, D. Florescu