{"title":"Proceedings of the International Database Engineered Applications Symposium Conference, IDEAS 2023, Heraklion, Crete, Greece, May 5-7, 2023","authors":"","doi":"10.1145/3589462","DOIUrl":"https://doi.org/10.1145/3589462","url":null,"abstract":"","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87453266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method combining improved Mahalanobis distance and adversarial autoencoder to detect abnormal network traffic","authors":"Ming Li, Dezhi Han, Dun Li","doi":"10.1145/3589462.3589489","DOIUrl":"https://doi.org/10.1145/3589462.3589489","url":null,"abstract":"","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"4 1","pages":"161-169"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76258947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDEAS'22: International Database Engineered Applications Symposium, Budapest, Hungary, August 22 - 24, 2022","authors":"","doi":"10.1145/3548785","DOIUrl":"https://doi.org/10.1145/3548785","url":null,"abstract":"","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"35 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72477628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDEAS 2020: 24th International Database Engineering & Applications Symposium, Seoul, Republic of Korea, August 12-14, 2020","authors":"","doi":"10.1145/3410566","DOIUrl":"https://doi.org/10.1145/3410566","url":null,"abstract":"","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"223 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75687475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shaping immigrant and ethnic heritage in North America: ethnic organizations and the documentary heritage","authors":"Dominique Daniel","doi":"10.4000/IDEAS.1089","DOIUrl":"https://doi.org/10.4000/IDEAS.1089","url":null,"abstract":"This article will explore some of the issues that immigrant and ethnic groups have dealt with when tackling the task of archiving – gathering and preserving the documents that tell the group’s story – and that of history- or memory-building through archives, a process I will refer to, for the sake of convenience, as ethnic archiving. The paper will trace the process of ethnic archiving through the case study of three specific groups–Finnish, German and Jewish communities in the United States–in the period preceding and following the ethnic “revival” of the 1960s. These groups were chosen because they illustrate the evolution of ethnic archiving among immigrant groups that arrived in the United States before the 1920s and the adoption of restrictive immigration laws. The similarities and differences these groups display are visible in the groups’ negotiations of, and answers to, the following questions: Who should be responsible for archiving? What should be the purpose of archiving and of the transmission of migration heritage? What should be archived and transmitted? These questions have broad implications for the shaping of history and memory.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91396154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cloudy: heterogeneous middleware for in time queries processing","authors":"P. Martins, Maryam Abbasi, P. Furtado","doi":"10.1145/2513591.2513659","DOIUrl":"https://doi.org/10.1145/2513591.2513659","url":null,"abstract":"Parallel share-nothing architectures are currently used to handle large amounts of data arriving in real-time for processing. The continuous increase on data volume and organization, introduce several limitations to scalability and quality of service (QoS) due to processing requirements and joins. Parallelism may improve query performance, however some business require timely results (results not faster or slower than specified) which, even with additional parallelism and significant upgrade costs (both monetary and due to disturbance of normal operations), cannot be guaranteed. We propose a timely-aware execution architecture, Cloudy, which balances data and queries processing among an elastic set of non-dedicated and heterogeneous nodes in order to provide scale-out performance and timely results, nor faster or slower, using both Complex Event Processing (CEP) and database (DB). Data is distributed by nodes accordingly with their hardware characteristics, then a set of layered mechanisms rearrange queries in order to provide in timely results. We present experimental evaluation of Cloudy and demonstrate its ability to provide timely results.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"25 1","pages":"5-13"},"PeriodicalIF":0.0,"publicationDate":"2013-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85208324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to exploit the device diversity and database interaction to propose a generic cost model?","authors":"Ladjel Bellatreche, Salmi Cheikh, S. Breß, Amira Kerkad, Ahcène Boukorca, Jalil Boukhobza","doi":"10.1145/2513591.2513660","DOIUrl":"https://doi.org/10.1145/2513591.2513660","url":null,"abstract":"Cost models have been following the life cycle of databases. In the first generation, they have been used by query optimizers, where the cost-based optimization paradigm has been developed and supported by most of important optimizers. The spectacular development of complex decision queries amplifies the interest of the physical design phase (PhD), where cost models are used to select the relevant optimization techniques such as indexes, materialized views, etc. Most of these cost models are usually developed for one storage device (usually disk) with a well identified storage model and ignore the interaction between the different components of databases: interaction between optimization techniques, interaction between queries, interaction between devices, etc. In this paper, we propose a generic cost model for the physical design that can be instantiated for each need. We contribute an ontology describing storage devices. Furthermore, we provide an instantiation of our meta model for two interdependent problems: query scheduling and buffer management. The evaluation results show the applicability of our model as well as its effectiveness.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"2 1","pages":"142-147"},"PeriodicalIF":0.0,"publicationDate":"2013-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88124975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic bitmap index recompression through workload-based optimizations","authors":"Fredton Doan, David Chiu, Brasil Perez Lukes, Jason Sawin, Gheorghi Guzun, G. Canahuate","doi":"10.1145/2513591.2513641","DOIUrl":"https://doi.org/10.1145/2513591.2513641","url":null,"abstract":"Many large-scale read-only databases and data warehouses use bitmap indices in an effort to speed up data analysis. These indices have the dual properties of compressibility and being able to leverage fast bit-wise operations for query processing. Numerous hybrid run-length encoding compression schemes have been proposed that greatly compress the index and enable querying without the need to decompress. Typically, these schemes align their compression with the computer architecture's word size to further accelerate queries.\u0000 Previously, we introduced Variable Length Compression (VLC), which uses a general encoding that can achieve better compression than word-aligned schemes. However, VLC's querying efficiency can vary widely due to mismatched alignment of compressed columns. In this paper, we present an optimizer which recompresses the bitmap over time. Based on query history, our approach allows the VLC user to specify the priority of compression versus query efficiency, then possibly recompress the bitmap accordingly. In an empirical study using scientific data sets, we showed that our approach was able to achieve both better compression ratios and query speedup over WAH and PLWAH. On the largest data set, our VLC optimizer compressed up to 1.73x better than WAH, and 1.46x over PLWAH. We also show a slight improvement in query efficiency in most experiments, while observing lucrative (11x to 16x) speedup in special cases.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"47 1","pages":"96-105"},"PeriodicalIF":0.0,"publicationDate":"2013-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91322794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the efficiency of multiple range query processing in multidimensional data structures","authors":"P. Chovanec, M. Krátký","doi":"10.1145/2513591.2513656","DOIUrl":"https://doi.org/10.1145/2513591.2513656","url":null,"abstract":"Multidimensional data are commonly utilized in many application areas like electronic shopping, cartography and many others. These data structures support various types of queries, e.g. point or range query. The range query retrieves all tuples of a multidimensional space matched by a query rectangle. Processing range queries in a multidimensional data structure has some performance issues, especially in the case of a higher space dimension or a lower query selectivity. As result, these data are often stored in an array or one-dimensional index like B-tree and range queries are processed with a sequence scan. Many real world queries can be transformed to a multiple range query: the query including more than one query rectangle. In this article, we aim our effort to processing of this type of the range query. First, we show an algorithm processing a sequence of range queries. Second, we introduce a special type of the multiple range query, the Cartesian range query. We show optimality of these algorithms from the IO and CPU costs point of view and we compare their performance with current methods. Although we introduce these algorithms for the R-tree, we show that these algorithms are appropriate for all multidimensional data structures with nested regions.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"34 1","pages":"14-27"},"PeriodicalIF":0.0,"publicationDate":"2013-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76941647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}