Pub Date: 2024-11-07; DOI: 10.1109/OJCOMS.2024.3493417
David Franco;Marivi Higuero;Ane Sanz;Juanjo Unzilla;Maider Huarte
The rapid emergence of new applications and services, and their increased demand for Quality of Service (QoS), have a significant impact on the development of today’s communication networks. As a result, communication networks are constantly evolving towards new architectures, such as the 6th Generation (6G) of communication systems, currently being studied in academic and research environments. One of the most critical aspects of designing communication networks is meeting strict delay and packet loss requirements. In this context, although link failure recovery has been widely addressed in the literature, link failures remain one of the main causes of packet loss and delay in the network. The failure recovery time in currently deployed technologies is still far from the sub-millisecond delay required in 6G networks. The time required for distributed network architectures to converge to a common network state after a link failure is excessive. In contrast, centralized architectures such as Software-Defined Networking (SDN) solve this problem but still need to notify a centralized controller of the failure, which increases the recovery time. This paper proposes a very Fast Failure Recovery (vFFR) strategy that can recover from link failures in sub-millisecond timescales by reacting directly from the data plane of the network devices while maintaining a synchronized state with the centralized controller. We first analyze current failure recovery strategies and classify them according to the techniques used to optimize failure recovery time. Afterward, we describe the design of a vFFR strategy that combines three data plane recovery algorithms to reduce latency and packet loss under varying network conditions. Our vFFR strategy has been modeled in the P4 language and tested on an emulation platform to validate the three data plane recovery algorithms under different conditions.
The results show that latency varies according to the alternate path selected in the recovery algorithm, and the packet loss rate remains constant even when the background traffic reaches 90% of the link capacity. In addition, the vFFR strategy has been implemented on Intel Tofino devices, achieving a failure recovery time lower than $500~\mu s$.