S. Stancu, A. Al-Shabibi, S. Batraneanu, S. Ballestrero, C. Caramarcu, B. Martin, D. Savu, R. Sjoen, L. Valsan
{"title":"ATLAS TDAQ系统的网络弹性实现","authors":"S. Stancu, A. Al-Shabibi, S. Batraneanu, S. Ballestrero, C. Caramarcu, B. Martin, D. Savu, R. Sjoen, L. Valsan","doi":"10.1109/RTC.2010.5750373","DOIUrl":null,"url":null,"abstract":"The ATLAS TDAQ (Trigger and Data Acquisition) system performs the real-time selection of events produced by the detector. For this purpose approximately 2000 computers are deployed and interconnected through various high speed networks, whose architecture has already been described. This article focuses on the implementation and validation of network connectivity resiliency (previously presented at a conceptual level). Redundancy and eventually load balancing are achieved through the synergy of various protocols: link aggregation, OSPF (Open Shortest Path First), VRRP (Virtual Router Redundancy Protocol), MST (Multiple Spanning Trees). An innovative method for cost-effective redundant connectivity of high-throughput high-availability servers is presented. Furthermore, real-life examples showing how redundancy works, and more importantly how it might fail despite careful planning are presented.","PeriodicalId":345878,"journal":{"name":"2010 17th IEEE-NPSS Real Time Conference","volume":"56 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Network resiliency implementation in the ATLAS TDAQ system\",\"authors\":\"S. Stancu, A. Al-Shabibi, S. Batraneanu, S. Ballestrero, C. Caramarcu, B. Martin, D. Savu, R. Sjoen, L. Valsan\",\"doi\":\"10.1109/RTC.2010.5750373\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The ATLAS TDAQ (Trigger and Data Acquisition) system performs the real-time selection of events produced by the detector. For this purpose approximately 2000 computers are deployed and interconnected through various high speed networks, whose architecture has already been described. This article focuses on the implementation and validation of network connectivity resiliency (previously presented at a conceptual level). Redundancy and eventually load balancing are achieved through the synergy of various protocols: link aggregation, OSPF (Open Shortest Path First), VRRP (Virtual Router Redundancy Protocol), MST (Multiple Spanning Trees). An innovative method for cost-effective redundant connectivity of high-throughput high-availability servers is presented. Furthermore, real-life examples showing how redundancy works, and more importantly how it might fail despite careful planning are presented.\",\"PeriodicalId\":345878,\"journal\":{\"name\":\"2010 17th IEEE-NPSS Real Time Conference\",\"volume\":\"56 2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-05-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 17th IEEE-NPSS Real Time Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RTC.2010.5750373\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 17th IEEE-NPSS Real Time Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RTC.2010.5750373","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Network resiliency implementation in the ATLAS TDAQ system
The ATLAS TDAQ (Trigger and Data Acquisition) system performs the real-time selection of events produced by the detector. For this purpose approximately 2000 computers are deployed and interconnected through various high speed networks, whose architecture has already been described. This article focuses on the implementation and validation of network connectivity resiliency (previously presented at a conceptual level). Redundancy and eventually load balancing are achieved through the synergy of various protocols: link aggregation, OSPF (Open Shortest Path First), VRRP (Virtual Router Redundancy Protocol), MST (Multiple Spanning Trees). An innovative method for cost-effective redundant connectivity of high-throughput high-availability servers is presented. Furthermore, real-life examples showing how redundancy works, and more importantly how it might fail despite careful planning are presented.