
2021 10th Latin-American Symposium on Dependable Computing (LADC): Latest Publications

An Algorithm-Based Fault Tolerance Strategy for the Bitonic Sort Parallel Algorithm
Pub Date : 2021-11-01 DOI: 10.1109/ladc53747.2021.9672590
E. T. Camargo, E. P. Duarte
High Performance Computing (HPC) systems are employed to solve hard problems and rely on parallel algorithms with very long execution times, up to several days. These systems are expensive in terms of the computational resources required, including energy consumption. Thus, after failures occur, it is highly desirable to lose as little as possible of the work that has already been done. In this work we present an Algorithm-Based Fault Tolerance (ABFT) strategy that can be applied to make a robust version of any hypercube-based parallel algorithm. Note that we do not assume a physical hypercube: after nodes crash, fault-free nodes autonomously adapt themselves according to a logical topology called VCube, preserving several logarithmic properties. The proposed strategy guarantees that the algorithm does not halt even after up to (N - 1) nodes crash, in a system of N nodes. We use parallel sorting as a case study, describing how to make a fault-tolerant version of the Bitonic Sort parallel algorithm. The algorithm was implemented in MPI using ULFM to handle faults. Experimental results are presented showing the performance and robustness of the proposed solution.
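For readers unfamiliar with the baseline algorithm, the following is a minimal sketch of the classic fault-free Bitonic Sort on a logical hypercube, simulated sequentially in Python. It is not the authors' fault-tolerant strategy and uses neither MPI nor VCube. The function name `hypercube_bitonic_sort` and the one-value-per-node layout are assumptions made for illustration. The point it shows is that at each step a node exchanges data only with the neighbour whose rank differs in a single bit, which is what makes the algorithm hypercube-based.

```python
def hypercube_bitonic_sort(values):
    """Sequentially simulate Bitonic Sort on a logical hypercube.

    Each list index plays the role of a node rank holding one value.
    Illustrative sketch only: no faults, no MPI, no VCube.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of nodes must be a power of two")
    data = list(values)
    dims = n.bit_length() - 1  # hypercube dimension d, with n = 2**d

    for stage in range(dims):
        # After this stage, blocks of 2**(stage + 1) ranks are sorted,
        # alternating between ascending and descending order.
        for bit in range(stage, -1, -1):
            for rank in range(n):
                partner = rank ^ (1 << bit)  # neighbour along dimension `bit`
                if rank < partner:  # handle each pair once
                    ascending = ((rank >> (stage + 1)) & 1) == 0
                    mine, theirs = data[rank], data[partner]
                    out_of_order = mine > theirs if ascending else mine < theirs
                    if out_of_order:
                        data[rank], data[partner] = theirs, mine
    return data
```

In a real parallel run each compare-exchange would be a message exchange between two processes, so the crash of a partner stalls the step. That is the situation the abstract says the proposed strategy is designed to survive.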
Citations: 0
Safety analysis of Brazilian suborbital launch operations based on system-theoretic approach
Pub Date : 2021-11-01 DOI: 10.1109/ladc53747.2021.9672557
A. V. D. Merladet, Rodrigo De Melo Silveira, S. Fugivara, C. Lahoz
The proposed analysis consists of identifying aspects that can influence safety and mission fulfilment in Brazilian Suborbital Launch Operations through the application of System-Theoretic Process Analysis, a new hazard analysis technique capable of identifying potentially hazardous design and operational flaws, including system design errors and unsafe interactions among multiple procedures and system components. This work identifies losses, hazards, system-level safety constraints, the hierarchical control structure of the general system, unsafe control actions, loss scenarios that could occur, and related causal factors. It also detects possibilities for improvement in future launch operations of Brazilian suborbital launch vehicles by acting throughout the life cycle of the products to avoid undesired events or mitigate their consequences.
Citations: 0
STAMP: Program
Pub Date : 2021-11-01 DOI: 10.1109/ladc53747.2021.9672581
G. Olsson
Citations: 0
SAFELIFE: Program
Pub Date : 2021-11-01 DOI: 10.1109/ladc53747.2021.9672574
Citations: 0
Modelling and Analysis of Fire Sprinklers by Verifying Dynamic Fault Trees
Pub Date : 2021-11-01 DOI: 10.1109/ladc53747.2021.9672579
Shahid Khan, J. Katoen, Matthias Volk, Muhammad Ahmad Zafar, Falak Sher
We study the reliability analysis of fire sprinkler systems. We show that the characteristic features of Dugan's dynamic fault trees (DFTs), such as spare management, temporal ordering of failures, and functional dependencies, are natural and adequate mechanisms to model various relevant phenomena in realistic fire sprinklers. For DFT analysis, we employ probabilistic model checking, an automated technique to assess reliability along with correctness. This is to date the most scalable numerical DFT analysis technique. We show how standard reliability measures of fire sprinkler systems can be efficiently computed using the Storm model checker. In addition, we consider metrics beyond standard reliability, e.g., the probability of failing without going through a degradation phase and the worst-case reliability achieved after degradation. We illustrate our approach with fire sprinkler systems in shopping centers.
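To make the idea of spare management concrete, here is a small sketch of the simplest textbook case: one component backed by one cold spare. It does not use Storm and is not the model from the paper. It assumes exponentially distributed failure times with the same rate for both units, a spare that cannot fail while dormant, and perfect switching. The function names and the example rate are hypothetical. Under those assumptions the time to system failure is the sum of two exponential lifetimes, so the reliability is R(t) = e^(-λt)(1 + λt), compared with e^(-λt) for the unspared component.

```python
import math


def reliability_single(rate, t):
    """Probability that one component with exponential failure rate `rate`
    is still working at time t."""
    return math.exp(-rate * t)


def reliability_cold_spare(rate, t):
    """Probability that a primary plus one cold spare is still working at time t.

    Assumes both units share the same exponential rate, the spare cannot fail
    while dormant, and switching is perfect (illustrative assumptions).
    """
    return math.exp(-rate * t) * (1.0 + rate * t)


if __name__ == "__main__":
    rate, t = 0.1, 10.0  # hypothetical failure rate per year and mission time in years
    print(f"single component: {reliability_single(rate, t):.4f}")
    print(f"with cold spare:  {reliability_cold_spare(rate, t):.4f}")
```

Closed forms like this exist only for very small trees. Once spares are shared or failures must occur in a particular order, the state space grows quickly, which is why the abstract turns to probabilistic model checking.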
Citations: 1