Parallel and Distributed Intra Query Transient Fault Tolerance Model via Parity Checking
Authors: Salisu Ibrahim Yusuf, S. Junaidu
Published in: 2018 14th International Conference on Electronics Computer and Computation (ICECCO)
Publication date: 2018-11-01
DOI: 10.1109/ICECCO.2018.8634666 (https://doi.org/10.1109/ICECCO.2018.8634666)
Citation count: 0
Abstract
In distributed parallel query execution, a complex query is split into partially related simple subqueries, each executed on a different node. In a grid based on a shared-nothing architecture, the machines usually communicate by exchanging messages. Temporary or permanent interference along the communication network can compromise the integrity of these messages. Fault tolerance strategies keep the system running in the presence of faults. Traditionally this is done through query restart, replication, or checkpointing, along with variations of these approaches that aim to improve latency and restoration time and to reduce the cost of execution. These strategies involve three processes: monitoring, detection, and tolerance. Transient faults are caused by interference in the medium of exchange; they may pass undetected and yet yield an incorrect query result. Moreover, traditional fault tolerance creates a strong dependency between the nodes. In this research we propose a model of a fault tolerance strategy that allows self-detection and tolerance of transient faults with less dependency between nodes. The model will be compared with the traditional strategies in terms of detection ability, inter-node dependency, and cost of execution.
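To make the idea of self-detection through parity checking concrete, the following is a minimal sketch, not the authors' model. It assumes that a node appends a single even-parity byte to each outgoing subquery result and that the receiving node recomputes the parity locally, without consulting any other node. The function names (`parity_bit`, `encode`, `verify`, `flip_bit`) and the message layout are hypothetical and chosen only for illustration.

```python
def parity_bit(payload: bytes) -> int:
    """Even-parity bit: 1 if the payload holds an odd number of 1 bits, else 0."""
    return sum(bin(byte).count("1") for byte in payload) % 2


def encode(payload: bytes) -> bytes:
    """Sender side: append the parity bit as one trailing byte (assumed layout)."""
    return payload + bytes([parity_bit(payload)])


def verify(message: bytes) -> bool:
    """Receiver side: recompute parity locally and compare with the trailing byte."""
    if len(message) < 1:
        return False  # no parity byte present, treat as faulty
    payload, received = message[:-1], message[-1]
    return parity_bit(payload) == received


def flip_bit(message: bytes, bit_index: int) -> bytes:
    """Simulate a transient fault by flipping a single bit in transit."""
    corrupted = bytearray(message)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    message = encode(b"partial result: 42")
    print(verify(message))                             # intact message
    print(verify(flip_bit(message, 3)))                # single-bit fault
    print(verify(flip_bit(flip_bit(message, 3), 10)))  # two-bit fault
```

A single parity bit detects any odd number of flipped bits but misses an even number, so the two-bit case in the example goes unnoticed. A practical scheme would likely need a stronger code, and what the receiver does after detection (for example, requesting retransmission) is outside this sketch.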