Space shuttle fault tolerance: Analog and digital teamwork

H. Blair-Smith
2009 IEEE/AIAA 28th Digital Avionics Systems Conference, published 2009-12-04
DOI: 10.1109/DASC.2009.5347450 (https://doi.org/10.1109/DASC.2009.5347450)
Citations: 9

Abstract

The Space Shuttle control system (including the avionics suite) was developed during the 1970s to meet stringent survivability requirements that were then extraordinary but today may serve as a standard against which modern avionics can be measured. In 30 years of service, only two major malfunctions have occurred, both due to failures far beyond the reach of fault tolerance technology: the explosion of an external fuel tank, and the destruction of a launch-damaged wing by re-entry friction. The Space Shuttle is among the earliest systems (if not the earliest) designed to a “FO-FO-FS” criterion, meaning that it had to Fail (fully) Operational after any one failure, then Fail Operational after any second failure (even of the same kind of unit), then Fail Safe after most kinds of third failure. The computer system had to meet this criterion using a Redundant Set of 4 computers plus a backup of the same type, which was (ostensibly!) a COTS type. Quadruple redundancy was also employed in the hydraulic actuators for elevons and rudder. Sensors were installed with quadruple, triple, or dual redundancy. For still greater fault tolerance, these three redundancies (sensors, computers, actuators) were made independent of each other so that the reliability criterion applies to each category separately. The mission rule for Shuttle flights, as distinct from the design criterion, became “FO-FS,” so that a mission continues intact after any one failure, but is terminated with a safe return after any second failure of the same type. To avoid an unrecoverable flat spin during the most dynamic flight phases, the overall system had to continue safe operation within 400 msec of any failure, but the decision to shut down a computer had to be made by the crew. 
Among the interesting problems to be solved were “control slivering” and “sync holes.” The first flight test (Approach and Landing only) was the proof of the pudding: when a key wire harness solder joint was jarred loose by the Shuttle's being popped off the back of its 747 mother ship, one of the computers “went bananas” (actual quote from an IBM expert).
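
The two rules described in the abstract, the "FO-FO-FS" design criterion and the "FO-FS" mission rule, can be illustrated with a minimal sketch. This is not the Shuttle's actual logic: the function names, the category labels, and the reading of "same type" as "same redundancy category" are assumptions made here for illustration, and the sketch ignores the qualifier that only most kinds of third failure are covered.

```python
from collections import Counter

# Hypothetical labels for the three independent redundancy categories
# named in the abstract (the strings themselves are an assumption).
CATEGORIES = ("sensors", "computers", "actuators")


def design_status(failures_in_category: int) -> str:
    """Design criterion FO-FO-FS, applied to one category in isolation.

    Simplification: treats every third failure as fail-safe, although the
    abstract says this holds only for "most kinds" of third failure.
    """
    if failures_in_category <= 2:
        return "operational"   # Fail Operational after first and second failure
    if failures_in_category == 3:
        return "safe"          # Fail Safe after the third failure
    return "not covered"       # beyond the stated criterion


def mission_decision(failure_log: list[str]) -> str:
    """Mission rule FO-FS: continue after any one failure, terminate with a
    safe return after a second failure of the same type.

    Assumption: "same type" is interpreted as "same category".
    """
    counts = Counter(failure_log)
    worst = max(counts.values(), default=0)
    return "continue" if worst <= 1 else "terminate with safe return"
```

Under these assumptions, one sensor failure plus one computer failure still yields "continue", because the categories are counted separately, whereas two computer failures yield "terminate with safe return" even though the design criterion would still report the computers as "operational".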