Scientific Workflow Management Systems (S-WFMS), such as Kepler, have proven to be important tools in scientific problem solving. Interestingly, S-WFMS fault tolerance and failure recovery remain an open topic. They often rely on classic fault-tolerance mechanisms, such as alternative versions and rollback with re-runs, or on the fault-tolerance capabilities provided by subcomponents and lower layers such as schedulers, Grid and cloud resources, or the underlying operating systems. When failures occur at the underlying layers, a workflow system sees them as failed steps in the process, but frequently without additional detail. This limits the ability of an S-WFMS to recover from failures. We describe a lightweight end-to-end S-WFMS fault-tolerance framework developed to handle failure patterns that occur in some real-life scientific workflows. Capabilities and limitations of the framework are discussed and assessed using simulations. The results show that the solution considerably increases workflow reliability and execution-time stability.
{"title":"On High-Assurance Scientific Workflows","authors":"M. Vouk, Pierre Mouallem","doi":"10.1109/HASE.2011.58","DOIUrl":"https://doi.org/10.1109/HASE.2011.58","url":null,"abstract":"Scientific Workflow Management Systems (S-WFMS), such as Kepler, have proven to be an important tools in scientific problem solving. Interestingly, S-WFMS fault-tolerance and failure recovery is still an open topic. It often involves classic fault-tolerance mechanisms, such as alternative versions and rollback with re-runs, reliance on the fault-tolerance capabilities provided by subcomponents and lower layers such as schedulers, Grid and cloud resources, or the underlying operating systems. When failures occur at the underlying layers, a workflow system sees this as failed steps in the process, but frequently without additional detail. This limits S-WFMS' ability to recover from failures. We describe a light weight end-to-end S-WFMS fault-tolerance framework, developed to handle failure patterns that occur in some real-life scientific workflows. Capabilities and limitations of the framework are discussed and assessed using simulations. The results show that the solution considerably increase workflow reliability and execution time stability.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133812437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As modern software systems operate in a highly dynamic context, they have to adapt their behaviour in response to changes in their operational environment and/or requirements. Triggering adaptation depends on detecting quality-of-service (QoS) violations by comparing observed QoS values to predefined thresholds. These threshold-based adaptation approaches result in late adaptations because they wait until violations have occurred, which may lead to undesired consequences such as late responses to critical events. In this paper we introduce a statistical approach, CREQA (Control Charts for the Runtime Evaluation of QoS Attributes). This approach estimates the capability of a system at runtime, then monitors QoS values and provides early detection of any changes, allowing timely intervention to prevent undesired consequences. We validated our approach using a series of experiments and response-time datasets from real-world web services.
{"title":"Using Automated Control Charts for the Runtime Evaluation of QoS Attributes","authors":"Ayman A. Amin, A. Colman, Lars Grunske","doi":"10.1109/HASE.2011.20","DOIUrl":"https://doi.org/10.1109/HASE.2011.20","url":null,"abstract":"As modern software systems operate in a highly dynamic context, they have to adapt their behaviour in response to changes in their operational environment or/and requirements. Triggering adaptation depends on detecting quality of service (QoS) violations by comparing observed QoS values to predefined thresholds. These threshold-based adaptation approaches result in late adaptations as they wait until violations have occurred. This may lead to undesired consequences such as late response to critical events. In this paper we introduce a statistical approach CREQA - Control Charts for the Runtime Evaluation of QoS Attributes. This approach estimates at runtime capability of a system, and then it monitors and provides early detection of any changes in QoS values allowing timely intervention in order to prevent undesired consequences. We validated our approach using a series of experiments and response time datasets from real world web services.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133534508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An ensemble is a collection of independent processes, each tasked with drawing potentially differing conclusions about the same data. Using Petri nets, this paper formally describes how ensembles are organized and their behavior coordinated to effect distributed discrete event control of an ocean turbine prototype. Compositions, duals, reverses, and cliques formed over known Petri net graphs comprise the building blocks of the proposed ensemble coordination strategy. The behavior of an ensemble of controllers tasked with fault triage is subject to constraints formulated herein. The controller tasked with prognosis and health management (PHM) itself uses an ensemble of classifiers to detect faults. This ensemble is subject to constraints imposed by stream processing, which require a non-blocking form of rendezvous synchronization. Furthermore, results from each classifier must be fused in a manner that rewards that classifier's ability to predict faults. We identify two competing merit schemes -- one based on individual classifier performance and the other on the performance of the sub-ensembles in which that classifier participates. Finally, we model check these Petri nets and report the results.
{"title":"Ensemble Coordination for Discrete Event Control","authors":"J. Sloan, T. Khoshgoftaar","doi":"10.1109/HASE.2011.26","DOIUrl":"https://doi.org/10.1109/HASE.2011.26","url":null,"abstract":"An ensemble is a collection of independent processes, each tasked with drawing potentially differing conclusions about the same data. Using Petri nets, this paper formally describes how ensembles are organized and their behavior coordinated to effect distributed discrete event control of an ocean turbine prototype. Compositions, duals, reverses, and cliques formed over known Petri net graphs comprise the building blocks of the proposed ensemble coordination strategy. The behavior of an ensemble of controllers tasked with fault triage are subject to constraints formulated herein. The controller tasked with prognosis and health management (PHM) itself uses an ensemble of classifiers to detect faults. This ensemble is subject to constraints imposed by stream processing, which require a non-blocking form of rendezvous synchronization. Furthermore, results from each classifier must be fused in a manner that rewards that classifier's ability to predict faults. We identify two competing merit schemes -- one based on individual classifier performance and the other on performance of the sub-ensembles to which that classifier participates. Finally, we model check these Petri nets and report their results.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131300237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The importance of the relationship between software variability and the defect proneness of software modules is well recognized. The use of variability can make software more versatile, but as variability increases, software complexity can increase correspondingly. Most variability realization techniques are based on configuration, and the variability realization code associated with a single configuration option may scatter across many software modules, which can easily induce defects and lead to dead code. This paper analyzes the preprocessor-based realization of variability: a series of variability metrics is defined, and variability at different granularities is analyzed to verify whether high variability leads to high defect proneness. Experimental results show that software variability and defects have a statistically significant relationship.
{"title":"On the Relationship between Preprocessor-Based Software Variability and Software Defects","authors":"Kunming Nie, Li Zhang","doi":"10.1109/HASE.2011.44","DOIUrl":"https://doi.org/10.1109/HASE.2011.44","url":null,"abstract":"The importance of the relationship between the software variability and defect proneness of software modules is well recognized. The utilization of variability can strengthen the software versatile, but as the software variability increases, the software complexity can increase correspondingly. Most variability realization techniques are based on configuration, and that the variability realization code correlate with one configuration options may scatter across many software modules, which could easily induce defect and lead to dead code. This paper analyzes the preprocessor based realization of the variability, series of variability metrics are defined and the variability from different granulites is analyzed to verify whether the high variability can cause high defect. Experimental result shows that the software variability and the defect have statistically significant relationship.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"210 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116488257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a systematic approach to building reliable distributed applications. The main objective of this approach is to consider reliability from application inception to completion, adding reliability patterns along the lifecycle and in all architectural layers. We start by enumerating the possible failures of the application, considering every activity in its use cases. The identified failures are then handled by applying reliability policies. We evaluate the benefits of this approach and compare it to others.
{"title":"Enumerating Software Failures to Build Dependable Distributed Applications","authors":"Ingrid A. Buckley, E. Fernández","doi":"10.1109/HASE.2011.35","DOIUrl":"https://doi.org/10.1109/HASE.2011.35","url":null,"abstract":"We are proposing a systematic approach to building reliable distributed applications. The main objective of this approach is to consider reliability from application inception to completion, adding reliability patterns along the lifecycle and in all architectural layers. We start by enumerating the possible failures of the application, considering every activity in the use cases of the application. The identified failures are then handled by applying reliability policies. We evaluate the benefits of this approach and compare our approach to others.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122440490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The need for efficient processes for implementing security properties in systems built from networked embedded devices motivates a fresh look at how security is "added" to embedded components at late development stages. In this short paper we propose a security domain model intended to support the consideration of security aspects at early embedded-systems design stages. It consists of two components: (1) an interface to the system engineering models, from which aspects relevant to security are extracted, and (2) elements that are specific to known security solutions and realisable in collections or libraries. The paper shows that what this security domain model has in common with other software security models is the need to represent security properties, and what it has in common with other embedded-systems models is the need to represent resources. The proposed model is described and illustrated by application to a mesh-communication scenario for crisis management.
{"title":"Towards a Security Domain Model for Embedded Systems","authors":"S. Nadjm-Tehrani, Maria Vasilevskaya","doi":"10.1109/HASE.2011.19","DOIUrl":"https://doi.org/10.1109/HASE.2011.19","url":null,"abstract":"The need for efficient processes for implementing security properties in systems that are made of networked embedded devices motivates a fresh look at how security is \"added\" to embedded components at late development stages. In this short paper we propose a security domain model intended for considering security aspects at early embedded systems design stages. It consists of two components: (1) An interface to the system engineering models in which aspects that are relevant to security are extracted, and (2) Elements that are specific to known security solutions and realisable in collections or libraries. The paper shows that what this security domain model has in common with other software security models is the need for representation of security properties, and what it has in common with other embedded systems models is the need for representation of resources. The proposed model is described and illustrated by application to a mesh communication for crisis management scenario.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131272730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graphical User Interfaces (GUIs) have become an important and accepted way of interacting with today's software. Fault localization is considered one of the most expensive program debugging activities. This paper presents a fault localization technique designed for GUI software. Unlike traditional software, GUI test cases are usually event sequences, and each individual event has a unique corresponding event handler. We apply data mining techniques to the event sequences and the failure detections collected in the testing phase to rank the fault proneness of the event handlers for fault localization. Our method applies N-gram analysis to rank the event handlers of GUI programs, and data collected from case studies on four real-life GUI programs demonstrate the effectiveness of the proposed technique.
{"title":"GUI Software Fault Localization Using N-gram Analysis","authors":"Zhongxing Yu, Hai Hu, Chenggang Bai, K. Cai, W. E. Wong","doi":"10.1109/HASE.2011.29","DOIUrl":"https://doi.org/10.1109/HASE.2011.29","url":null,"abstract":"Graphical User Interfaces (GUIs) have become an important and accepted way of interacting with today's software. Fault localization is considered to be one of the most expensive program debugging activities. This paper presents a fault localization technique designed for GUI software. Unlike traditional software, GUI test cases usually are event sequences and each individual event has a unique corresponding event handler. We apply data mining techniques to the event sequences and their output data in terms of failure detections collected in the testing phase to rank the fault proneness of the event handlers for fault localization. Our method applies N-gram analysis to rank the event handlers of GUI programs and data collected from case studies on four real life GUI programs demonstrate the effectiveness of the proposed technique.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131278242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This work proposes a novel approach -- Discriminatively Fortified Computing (DFC) -- to achieve hardware-efficient reliable computing without deterministically knowing the location and occurrence time of hardware defects and design faults. The key insights behind DFC are: 1) different system components contribute differently to the overall correctness of a target application and should therefore be treated distinctively, and 2) abundant error resilience exists inherently in many practical algorithms, such as signal processing, visual perception, and artificial learning, and this resilience can be significantly improved with effective hardware support. The major contributions of this work include: 1) the development of a complete methodology to perform sensitivity and criticality analysis of hardware redundancy, 2) a novel problem formulation and an efficient heuristic methodology to discriminatively allocate hardware redundancy among a target design's key components in order to maximize its overall error resilience, and 3) an academic prototype of a DFC computing device that demonstrates a 4x improvement in error resilience for an H.264 encoder implemented on an FPGA device.
{"title":"Discriminatively Fortified Computing with Reconfigurable Digital Fabric","authors":"Mingjie Lin, Yu Bai, J. Wawrzynek","doi":"10.1109/HASE.2011.49","DOIUrl":"https://doi.org/10.1109/HASE.2011.49","url":null,"abstract":"This work proposes a novel approach -- Discriminatively Fortified Computing (DFC) -- to achievehardware-efficient reliable computing without deterministically knowing the location and occurrence time of hardware defects and design faults. The key insights behind DFC comprise:1) different system components contribute differently to the overall correctness of a target application, therefore should be treated distinctively, and 2) abundant error resilience exists inherently in many practical algorithms, such as signal processing, visual perception, and artificial learning. Such error resilience can be significantly improved with effective hardware support. The major contributions of this work include 1) the development of a complete methodology to perform sensitivity and criticality analysis of hardware redundancy, 2) a novel problem formulation and an efficient heuristic methodology to discriminatively allocate hardware redundancy among a targetdesign's key components in order to maximize its overall error resilience, 3) an academic prototype of DFC computing device that illustrates a 4 times improvement of error resilience for aH.264 encoder implemented with an FPGA device.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131351085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper discusses the importance of a Patient-Centric Health Record system. Such systems can empower patients to participate in improving health care quality, and they provide an economically viable solution to the need for better healthcare without escalating costs, by avoiding duplication. The proposed system is Web-based, so patients and healthcare providers can access it from any location. Moreover, the architecture is cloud-based, so large amounts of data can be stored without restrictions. The use of a cloud computing architecture also allows consumers to address the challenge of sharing medical data, which is overly complex and highly expensive to address with traditional technologies.
{"title":"Personal Health Record System and Integration Techniques with Various Electronic Medical Record Systems","authors":"V. Ved, V. Tyagi, Ankur Agarwal, A. Pandya","doi":"10.1109/HASE.2011.63","DOIUrl":"https://doi.org/10.1109/HASE.2011.63","url":null,"abstract":"This paper discusses the importance of a Patient Centric Health Record system. Such systems can empower patients to participate in improving health care quality. It would also provide an economically viable solution to the need for better healthcare without escalating costs by avoiding duplication. The proposed system is Web-based so patients and healthcare providers can access it from any location. Moreover the architecture is cloud-based so large amount of data can be stored without any restrictions. Also the use of cloud computing architecture will allow consumers to address the challenge of sharing medical data that is overly complex and highly expensive to address with traditional technologies.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129985292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, "rootkit" becomes a popular hacker malware on the Internet, which controls the hosts on the Internet by hiding itself, and raises a serious security threat. Existing host-based and hardware-based solutions have some disadvantages, such as hardware overhead and being discovered by root kits, where the development of virtualization technology provides a better solution to avoid those. Virtual machine monitor has the highest authority on the virtual machine, and has the right to control the activities in the virtual machine without being found by root kits in the virtual machines. We propose VM Detector based on this hardware virtualization technology, using multi-view detection mechanism, to detect hidden processes inside the virtual machine on many aspects, then to improve the virtual machine's security. Through several experiments, VM Detector carried on the process detection effectively, and introduced less than 10% performance overhead.
{"title":"VMDetector: A VMM-based Platform to Detect Hidden Process by Multi-view Comparison","authors":"Y. Wang, Chunming Hu, B. Li","doi":"10.1109/HASE.2011.41","DOIUrl":"https://doi.org/10.1109/HASE.2011.41","url":null,"abstract":"Recently, \"rootkit\" becomes a popular hacker malware on the Internet, which controls the hosts on the Internet by hiding itself, and raises a serious security threat. Existing host-based and hardware-based solutions have some disadvantages, such as hardware overhead and being discovered by root kits, where the development of virtualization technology provides a better solution to avoid those. Virtual machine monitor has the highest authority on the virtual machine, and has the right to control the activities in the virtual machine without being found by root kits in the virtual machines. We propose VM Detector based on this hardware virtualization technology, using multi-view detection mechanism, to detect hidden processes inside the virtual machine on many aspects, then to improve the virtual machine's security. Through several experiments, VM Detector carried on the process detection effectively, and introduced less than 10% performance overhead.","PeriodicalId":403140,"journal":{"name":"2011 IEEE 13th International Symposium on High-Assurance Systems Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131084572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}