Integrated Design Solution for Distributed Databases Using Genetic Algorithms
Author: Sukkyu Song
Journal: International Journal of Database Theory and Application
Volume: 44 1, pages 13-34
Publication date: 2017-06-30
Publication type: Journal Article
DOI: 10.14257/IJDTA.2017.10.6.02 (https://doi.org/10.14257/IJDTA.2017.10.6.02)
Citations: 0
Abstract
The design of distributed database systems raises many research problems. Among them, the interdependency and interaction among data fragmentation, data allocation, and distributed query optimization remain unresolved. These problems have been proven to be NP-complete or NP-hard, so most previous studies have addressed them in isolation under simplifying assumptions. However, the problems are interdependent, so solving them independently yields an inefficient overall solution. In this research, we develop an integrated distributed database design solution for three problems: partitioning data sets, allocating the partitioned data sets among the sites of a network, and allocating operations, treated as a distributed query optimization problem. We use a transaction-based approach, in which the most important transactions determine the design of the distributed database, and we consider two types of transactions, OLTP (on-line transaction processing) and DSS (decision support system), to reflect different design objectives: total time minimization, response time minimization, and minimization of a combination of both. We employ genetic algorithms to search for the best distributed database design. The integrated design solutions are determined by analyzing interactions between the problems in four stages: 1) between vertical fragmentation and operation allocation, 2) between vertical fragmentation and data allocation, 3) between data allocation and operation allocation, and 4) integration of all three problems, with the objectives of cost minimization and load balancing. Our integrated approach produced a more cost-effective distributed database design than designs that consider the problems in isolation.
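As a rough illustration of how a genetic algorithm can search for an allocation under cost-minimization and load-balancing objectives, the sketch below solves a toy data-allocation problem only. It is not the paper's method: the instance (`ACCESS`), the cost constants, the penalty weight, and the operators (tournament selection, one-point crossover, per-gene mutation, elitism) are all assumptions chosen for brevity, and fragmentation and operation allocation are left out.

```python
import random

# Hypothetical toy instance (not from the paper): 4 fragments, 3 sites.
# Each entry is (origin_site, {fragment: access frequency}) for one transaction.
N_SITES = 3
N_FRAGMENTS = 4
ACCESS = [
    (0, {0: 10, 1: 4}),
    (1, {1: 8, 2: 6}),
    (2, {2: 3, 3: 9}),
    (0, {3: 2, 0: 5}),
]
LOCAL_COST = 1        # assumed cost of one local access
REMOTE_COST = 5       # assumed cost of one remote access
BALANCE_WEIGHT = 0.5  # assumed weight of the load-imbalance penalty


def fitness(chrom):
    """Lower is better: total access cost plus a load-imbalance penalty.

    chrom[f] is the site that stores fragment f.
    """
    cost = 0
    load = [0] * N_SITES
    for origin, frags in ACCESS:
        for frag, freq in frags.items():
            site = chrom[frag]
            cost += freq * (LOCAL_COST if site == origin else REMOTE_COST)
            load[site] += freq
    return cost + BALANCE_WEIGHT * (max(load) - min(load))


def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly sampled individuals."""
    return min(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """One-point crossover of two parent chromosomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chrom, rng, rate=0.1):
    """Reassign each fragment to a random site with probability `rate`."""
    return [rng.randrange(N_SITES) if rng.random() < rate else g for g in chrom]


def run_ga(generations=50, pop_size=20, seed=0):
    """Return (best_chromosome, best_fitness) after the given generations."""
    rng = random.Random(seed)
    pop = [[rng.randrange(N_SITES) for _ in range(N_FRAGMENTS)]
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
        best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    allocation, cost = run_ga()
    print("fragment -> site:", allocation, "fitness:", cost)
```

Because the best individual is copied into each new generation, the best fitness never gets worse over time. An integrated design in the spirit of the abstract would presumably extend the chromosome to encode fragmentation and operation allocation as well, but how the paper encodes these is not stated here.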