Motivated by the rapid deployment of 5G, we carry out an in-depth measurement study of the performance, power consumption, and application quality-of-experience (QoE) of commercial 5G networks in the wild. We examine different 5G carriers, deployment schemes (Non-Standalone, NSA vs. Standalone, SA), radio bands (mmWave and sub 6-GHz), protocol configurations (_e.g._ Radio Resource Control state transitions), mobility patterns (stationary, walking, driving), client devices (_i.e._ User Equipment), and upper-layer applications (file download, video streaming, and web browsing). Our findings reveal key characteristics of commercial 5G in terms of throughput, latency, handover behaviors, radio state transitions, and radio power consumption under the above diverse scenarios, with detailed comparisons to 4G/LTE networks. Furthermore, our study provides key insights into how upper-layer applications should best utilize 5G by balancing the critical tradeoff between performance and energy consumption, as well as by taking into account the availability of both network and computation resources. We have released the datasets and tools of our study at https://github.com/SIGCOMM21-5G/artifact.
{"title":"A variegated look at 5G in the wild: performance, power, and QoE implications","authors":"Arvind Narayanan, Xumiao Zhang, Ruiyang Zhu, Ahmad Hassan, Shuowei Jin, Xiao Zhu, Xiaoxuan Zhang, Denis Rybkin, Zhengxuan Yang, Z. Mao, Feng Qian, Zhi-Li Zhang","doi":"10.1145/3452296.3472923","DOIUrl":"https://doi.org/10.1145/3452296.3472923","url":null,"abstract":"Motivated by the rapid deployment of 5G, we carry out an in-depth measurement study of the performance, power consumption, and application quality-of-experience (QoE) of commercial 5G networks in the wild. We examine different 5G carriers, deployment schemes (Non-Standalone, NSA vs. Standalone, SA), radio bands (mmWave and sub 6-GHz), protocol configurations (_e.g._ Radio Resource Control state transitions), mobility patterns (stationary, walking, driving), client devices (_i.e._ User Equipment), and upper-layer applications (file download, video streaming, and web browsing). Our findings reveal key characteristics of commercial 5G in terms of throughput, latency, handover behaviors, radio state transitions, and radio power consumption under the above diverse scenarios, with detailed comparisons to 4G/LTE networks. Furthermore, our study provides key insights into how upper-layer applications should best utilize 5G by balancing the critical tradeoff between performance and energy consumption, as well as by taking into account the availability of both network and computation resources. We have released the datasets and tools of our study at https://github.com/SIGCOMM21-5G/artifact.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75334375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bingchuan Tian, Jiaqi Gao, Mengqi Liu, Ennan Zhai, Yanqing Chen, Yu Zhou, Li Dai, Feng Yan, Mengjing Ma, Ming Tang, Jie Lu, Xionglie Wei, H. Liu, Ming Zhang, Chenfei Tian, Minlan Yu
This paper presents Aquila, the first practically usable verification system for Alibaba's production-scale programmable data planes. Aquila addresses four challenges in building a practically usable verification: (1) specification complexity; (2) verification scalability; (3) bug localization; and (4) verifier self validation. Specifically, first, Aquila proposes a high-level language that facilitates easy expression of specifications, reducing lines of specification codes by tenfold compared to the state-of-the-art. Second, Aquila constructs a sequential encoding algorithm to circumvent the exponential growth of states associated with the upscaling of data plane programs to production level. Third, Aquila adopts an automatic and accurate bug localization approach that can narrow down suspects based on reported violations and pinpoint the culprit by simulating a fix for each suspect. Fourth and finally, Aquila can perform self validation based on refinement proof, which involves the construction of an alternative representation and subsequent equivalence checking. To this date, Aquila has been used in the verification of our production-scale programmable edge networks for over half a year, and it has successfully prevented many potential failures resulting from data plane bugs.
{"title":"Aquila","authors":"Bingchuan Tian, Jiaqi Gao, Mengqi Liu, Ennan Zhai, Yanqing Chen, Yu Zhou, Li Dai, Feng Yan, Mengjing Ma, Ming Tang, Jie Lu, Xionglie Wei, H. Liu, Ming Zhang, Chenfei Tian, Minlan Yu","doi":"10.1145/3452296.3472937","DOIUrl":"https://doi.org/10.1145/3452296.3472937","url":null,"abstract":"This paper presents Aquila, the first practically usable verification system for Alibaba's production-scale programmable data planes. Aquila addresses four challenges in building a practically usable verification: (1) specification complexity; (2) verification scalability; (3) bug localization; and (4) verifier self validation. Specifically, first, Aquila proposes a high-level language that facilitates easy expression of specifications, reducing lines of specification codes by tenfold compared to the state-of-the-art. Second, Aquila constructs a sequential encoding algorithm to circumvent the exponential growth of states associated with the upscaling of data plane programs to production level. Third, Aquila adopts an automatic and accurate bug localization approach that can narrow down suspects based on reported violations and pinpoint the culprit by simulating a fix for each suspect. Fourth and finally, Aquila can perform self validation based on refinement proof, which involves the construction of an alternative representation and subsequent equivalence checking. To this date, Aquila has been used in the verification of our production-scale programmable edge networks for over half a year, and it has successfully prevented many potential failures resulting from data plane bugs.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72953419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LoRa has seen widespread adoption as a long range IoT technology. As the number of LoRa deployments grow, packet collisions undermine its overall network throughput. In this paper, we propose a novel interference cancellation technique -- Concurrent Interference Cancellation (CIC), that enables concurrent decoding of multiple collided LoRa packets. CIC fundamentally differs from existing approaches as it demodulates symbols by canceling out all other interfering symbols. It achieves this cancellation by carefully selecting a set of sub-symbols -- pieces of the original symbol such that no interfering symbol is common across all sub-symbols in this set. Thus, after demodulating each sub-symbol, an intersection across their spectra cancels out all the interfering symbols. Through LoRa deployments using COTS devices, we demonstrate that CIC can increase the network capacity of standard LoRa by up to 10x and up to 4x over the state-of-the-art research. While beneficial across all scenarios, CIC has even more significant benefits under low SNR conditions that are common to LoRa deployments, in which prior approaches appear to perform quite poorly.
{"title":"Concurrent interference cancellation: decoding multi-packet collisions in LoRa","authors":"Muhammad Osama Shahid, Millan Philipose, Krishna Chintalapudi, Suman Banerjee, Bhuvana Krishnaswamy","doi":"10.1145/3452296.3472931","DOIUrl":"https://doi.org/10.1145/3452296.3472931","url":null,"abstract":"LoRa has seen widespread adoption as a long range IoT technology. As the number of LoRa deployments grow, packet collisions undermine its overall network throughput. In this paper, we propose a novel interference cancellation technique -- Concurrent Interference Cancellation (CIC), that enables concurrent decoding of multiple collided LoRa packets. CIC fundamentally differs from existing approaches as it demodulates symbols by canceling out all other interfering symbols. It achieves this cancellation by carefully selecting a set of sub-symbols -- pieces of the original symbol such that no interfering symbol is common across all sub-symbols in this set. Thus, after demodulating each sub-symbol, an intersection across their spectra cancels out all the interfering symbols. Through LoRa deployments using COTS devices, we demonstrate that CIC can increase the network capacity of standard LoRa by up to 10x and up to 4x over the state-of-the-art research. While beneficial across all scenarios, CIC has even more significant benefits under low SNR conditions that are common to LoRa deployments, in which prior approaches appear to perform quite poorly.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88967069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhizhen Zhong, M. Ghobadi, Alaa Khaddaj, J. Leach, Yiting Xia, Ying Zhang
Fiber cut events reduce the capacity of wide-area networks (WANs) by several Tbps. In this paper, we revive the lost capacity by reconfiguring the wavelengths from cut fibers into healthy fibers. We highlight two challenges that made prior solutions impractical and propose a system called Arrow to address them. First, our measurements show that contrary to common belief, in most cases, the lost capacity is only partially restorable. This poses a cross-layer challenge from the Traffic Engineering (TE) perspective that has not been considered before: “Which IP links should be restored and by how much to best match the TE objective?” To address this challenge, Arrow's restoration-aware TE system takes a set of partial restoration candidates (that we call LotteryTickets) as input and proactively finds the best restoration plan. Second, prior work has not considered the reconfiguration latency of amplifiers. However, in practical settings, amplifiers add tens of minutes of reconfiguration delay. To enable fast and practical restoration, Arrow leverages optical noise loading and bypasses amplifier reconfiguration altogether. We evaluate Arrow using large-scale simulations and a testbed. Our testbed demonstrates Arrow's end-to-end restoration latency is eight seconds. Our large-scale simulations compare Arrow to the state-of-the-art TE schemes and show it can support 2.0x--2.4x more demand without compromising 99.99% availability.
{"title":"ARROW: restoration-aware traffic engineering","authors":"Zhizhen Zhong, M. Ghobadi, Alaa Khaddaj, J. Leach, Yiting Xia, Ying Zhang","doi":"10.1145/3452296.3472921","DOIUrl":"https://doi.org/10.1145/3452296.3472921","url":null,"abstract":"Fiber cut events reduce the capacity of wide-area networks (WANs) by several Tbps. In this paper, we revive the lost capacity by reconfiguring the wavelengths from cut fibers into healthy fibers. We highlight two challenges that made prior solutions impractical and propose a system called Arrow to address them. First, our measurements show that contrary to common belief, in most cases, the lost capacity is only partially restorable. This poses a cross-layer challenge from the Traffic Engineering (TE) perspective that has not been considered before: “Which IP links should be restored and by how much to best match the TE objective?” To address this challenge, Arrow's restoration-aware TE system takes a set of partial restoration candidates (that we call LotteryTickets) as input and proactively finds the best restoration plan. Second, prior work has not considered the reconfiguration latency of amplifiers. However, in practical settings, amplifiers add tens of minutes of reconfiguration delay. To enable fast and practical restoration, Arrow leverages optical noise loading and bypasses amplifier reconfiguration altogether. We evaluate Arrow using large-scale simulations and a testbed. Our testbed demonstrates Arrow's end-to-end restoration latency is eight seconds. Our large-scale simulations compare Arrow to the state-of-the-art TE schemes and show it can support 2.0x--2.4x more demand without compromising 99.99% availability.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"186 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80625890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xieyang Xu, Ryan Beckett, K. Jayaraman, Ratul Mahajan, D. Walker
Testing and verification have emerged as key tools in the battle to improve the reliability of networks and the services they provide. However, the success of even the best technology of this sort is limited by how effectively it is applied, and in today's enormously complex industrial networks, it is surprisingly easy to overlook particular interfaces, routes, or flows when creating a test suite. Moreover, network engineers, unlike their software counterparts, have no help to battle this problem—there are no metrics or systems to compute the quality of their test suites or the extent to which their networks have been verified. To address this gap, we develop a general framework to define and compute network coverage for stateless network data planes. It computes coverage for a range of network components (EG, interfaces, devices, paths) and supports many types of tests (e.g., concrete versus symbolic; local versus end-to-end; tests that check network state versus those that analyze behavior). Our framework is based on the observation that any network dataplane component can be decomposed into forwarding rules and all types of tests ultimately exercise these rules using one or more packets. We build a system called Yardstick based on this framework and deploy it in Microsoft Azure. Within the first month of its deployment inside one of the production networks, it uncovered several testing gaps and helped improve testing by covering 89% more forwarding rules and 17% more network interfaces.
{"title":"Test coverage metrics for the network","authors":"Xieyang Xu, Ryan Beckett, K. Jayaraman, Ratul Mahajan, D. Walker","doi":"10.1145/3452296.3472941","DOIUrl":"https://doi.org/10.1145/3452296.3472941","url":null,"abstract":"Testing and verification have emerged as key tools in the battle to improve the reliability of networks and the services they provide. However, the success of even the best technology of this sort is limited by how effectively it is applied, and in today's enormously complex industrial networks, it is surprisingly easy to overlook particular interfaces, routes, or flows when creating a test suite. Moreover, network engineers, unlike their software counterparts, have no help to battle this problem—there are no metrics or systems to compute the quality of their test suites or the extent to which their networks have been verified. To address this gap, we develop a general framework to define and compute network coverage for stateless network data planes. It computes coverage for a range of network components (EG, interfaces, devices, paths) and supports many types of tests (e.g., concrete versus symbolic; local versus end-to-end; tests that check network state versus those that analyze behavior). Our framework is based on the observation that any network dataplane component can be decomposed into forwarding rules and all types of tests ultimately exercise these rules using one or more packets. We build a system called Yardstick based on this framework and deploy it in Microsoft Azure. Within the first month of its deployment inside one of the production networks, it uncovered several testing gaps and helped improve testing by covering 89% more forwarding rules and 17% more network interfaces.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81994414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhihong Luo, Silvery Fu, M. Theis, Shaddi Hasan, S. Ratnasamy, S. Shenker
Markets in which competition thrives are good for both consumers and innovation but, unfortunately, competition is not thriving in the increasingly important cellular market. We propose CellBricks, a novel cellular architecture that lowers the barrier to entry for new operators by enabling users to consume access on-demand from any available cellular operator — small or large, trusted or untrusted. CellBricks achieves this by moving support for mobility and user management (authentication and billing) out of the network and into end hosts. These changes, we believe, bring valuable benefits beyond enabling competition: they lead to a cellular infrastructure that is simpler and more efficient. We design, build, and evaluate CellBricks, showing that its benefits come at little-to-no cost in performance, with application performance overhead between -1.6% to 3.1% of that achieved by current cellular infrastructure.
{"title":"Democratizing cellular access with CellBricks","authors":"Zhihong Luo, Silvery Fu, M. Theis, Shaddi Hasan, S. Ratnasamy, S. Shenker","doi":"10.1145/3452296.3473336","DOIUrl":"https://doi.org/10.1145/3452296.3473336","url":null,"abstract":"Markets in which competition thrives are good for both consumers and innovation but, unfortunately, competition is not thriving in the increasingly important cellular market. We propose CellBricks, a novel cellular architecture that lowers the barrier to entry for new operators by enabling users to consume access on-demand from any available cellular operator — small or large, trusted or untrusted. CellBricks achieves this by moving support for mobility and user management (authentication and billing) out of the network and into end hosts. These changes, we believe, bring valuable benefits beyond enabling competition: they lead to a cellular infrastructure that is simpler and more efficient. We design, build, and evaluate CellBricks, showing that its benefits come at little-to-no cost in performance, with application performance overhead between -1.6% to 3.1% of that achieved by current cellular infrastructure.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74800469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With recent advances on cellular technologies (such as 5G) that push the boundary of cellular performance, cellular reliability has become a key concern of cellular technology adoption and deployment. However, this fundamental concern has never been addressed due to the challenges of measuring cellular reliability on mobile devices and the cost of conducting large-scale measurements. This paper closes the knowledge gap by presenting the first large-scale, in-depth study on cellular reliability with more than 70 million Android phones across 34 different hardware models. Our study identifies the critical factors that affect cellular reliability and clears up misleading intuitions indicated by common wisdom. In particular, our study pinpoints that software reliability defects are among the main root causes of cellular data connection failures. Our work provides actionable insights for improving cellular reliability at scale. More importantly, we have built on our insights to develop enhancements that effectively address cellular reliability issues with remarkable real-world impact---our optimizations on Android's cellular implementations have effectively reduced 40% cellular connection failures for 5G phones and 36% failure duration across all phones.
{"title":"A nationwide study on cellular reliability: measurement, analysis, and enhancements","authors":"Yang Li, Hao Lin, Zhenhua Li, Yunhao Liu, Feng Qian, Liangyi Gong, Xianlong Xin, Tianyin Xu","doi":"10.1145/3452296.3472908","DOIUrl":"https://doi.org/10.1145/3452296.3472908","url":null,"abstract":"With recent advances on cellular technologies (such as 5G) that push the boundary of cellular performance, cellular reliability has become a key concern of cellular technology adoption and deployment. However, this fundamental concern has never been addressed due to the challenges of measuring cellular reliability on mobile devices and the cost of conducting large-scale measurements. This paper closes the knowledge gap by presenting the first large-scale, in-depth study on cellular reliability with more than 70 million Android phones across 34 different hardware models. Our study identifies the critical factors that affect cellular reliability and clears up misleading intuitions indicated by common wisdom. In particular, our study pinpoints that software reliability defects are among the main root causes of cellular data connection failures. Our work provides actionable insights for improving cellular reliability at scale. More importantly, we have built on our insights to develop enhancements that effectively address cellular reliability issues with remarkable real-world impact---our optimizations on Android's cellular implementations have effectively reduced 40% cellular connection failures for 5G phones and 36% failure duration across all phones.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74842085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Head related transfer functions (HRTF) describe how sound signals bounce, scatter, and diffract when they arrive at the head, and travel towards the ears. HRTFs produce distinct sound patterns that ultimately help the brain infer the spatial properties of the sound, such as its direction of arrival, 𝜃. If an earphone can learn the HRTF, it could apply the HRTF to any sound and make that sound appear directional to the user. For instance, a directional voice guide could help a tourist navigate a new city. While past works have estimated human HRTFs, an important gap lies in personalization. Today's HRTFs are global templates that are used in all products; since human HRTFs are unique, a global HRTF only offers a coarse-grained experience. This paper shows that by moving a smartphone around the head, combined with mobile acoustic communications between the phone and the earbuds, it is possible to estimate a user's personal HRTF. Our personalization system, UNIQ, combines techniques from channel estimation, motion tracking, and signal processing, with a focus on modeling signal diffraction on the curvature of the face. The results are promising and could open new doors into the rapidly growing space of immersive AR/VR, earables, smart hearing aids, etc.
{"title":"Personalizing head related transfer functions for earables","authors":"Zhijian Yang, Romit Roy Choudhury","doi":"10.1145/3452296.3472907","DOIUrl":"https://doi.org/10.1145/3452296.3472907","url":null,"abstract":"Head related transfer functions (HRTF) describe how sound signals bounce, scatter, and diffract when they arrive at the head, and travel towards the ears. HRTFs produce distinct sound patterns that ultimately help the brain infer the spatial properties of the sound, such as its direction of arrival, 𝜃. If an earphone can learn the HRTF, it could apply the HRTF to any sound and make that sound appear directional to the user. For instance, a directional voice guide could help a tourist navigate a new city. While past works have estimated human HRTFs, an important gap lies in personalization. Today's HRTFs are global templates that are used in all products; since human HRTFs are unique, a global HRTF only offers a coarse-grained experience. This paper shows that by moving a smartphone around the head, combined with mobile acoustic communications between the phone and the earbuds, it is possible to estimate a user's personal HRTF. Our personalization system, UNIQ, combines techniques from channel estimation, motion tracking, and signal processing, with a focus on modeling signal diffraction on the curvature of the face. The results are promising and could open new doors into the rapidly growing space of immersive AR/VR, earables, smart hearing aids, etc.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"107 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81354532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhilong Zheng, Yunfei Ma, Yanmei Liu, Furong Yang, Zhenyu Li, Yuanbo Zhang, Jiuhai Zhang, Wei Shi, Wentao Chen, Ding Li, Qing An, Hai Hong, H. Liu, Ming Zhang
We report XLINK, a multi-path QUIC video transport solution with experiments in Taobao short videos. XLINK is designed to meet two operational challenges at the same time: (1) Optimized user-perceived quality of experience (QoE) in terms of robustness, smoothness, responsiveness, and mobility and (2) Minimized cost overhead for service providers (typically CDNs). The core of XLINK is to take the opportunity of QUIC as a user-space protocol and directly capture user-perceived video QoE intent to control multi-path scheduling and management. We overcome major hurdles such as multi-path head-of-line blocking, network heterogeneity, and rapid link variations and balance cost and performance. To the best of our knowledge, XLINK is the first large-scale experimental study of multi-path QUIC video services in production environments. We present the results of over 3 million e-commerce product short-video plays from consumers who upgraded to Taobao android app with XLINK. Our study shows that compared to single-path QUIC, XLINK achieved 19 to 50% improvement in the 99-th percentile video-chunk request completion time, 32% improvement in the 99-th percentile first-video-frame latency, 23 to 67% improvement in the re-buffering rate at the expense of 2.1% redundant traffic.
{"title":"XLINK","authors":"Zhilong Zheng, Yunfei Ma, Yanmei Liu, Furong Yang, Zhenyu Li, Yuanbo Zhang, Jiuhai Zhang, Wei Shi, Wentao Chen, Ding Li, Qing An, Hai Hong, H. Liu, Ming Zhang","doi":"10.1145/3452296.3472893","DOIUrl":"https://doi.org/10.1145/3452296.3472893","url":null,"abstract":"We report XLINK, a multi-path QUIC video transport solution with experiments in Taobao short videos. XLINK is designed to meet two operational challenges at the same time: (1) Optimized user-perceived quality of experience (QoE) in terms of robustness, smoothness, responsiveness, and mobility and (2) Minimized cost overhead for service providers (typically CDNs). The core of XLINK is to take the opportunity of QUIC as a user-space protocol and directly capture user-perceived video QoE intent to control multi-path scheduling and management. We overcome major hurdles such as multi-path head-of-line blocking, network heterogeneity, and rapid link variations and balance cost and performance. To the best of our knowledge, XLINK is the first large-scale experimental study of multi-path QUIC video services in production environments. We present the results of over 3 million e-commerce product short-video plays from consumers who upgraded to Taobao android app with XLINK. Our study shows that compared to single-path QUIC, XLINK achieved 19 to 50% improvement in the 99-th percentile video-chunk request completion time, 32% improvement in the 99-th percentile first-video-frame latency, 23 to 67% improvement in the re-buffering rate at the expense of 2.1% redundant traffic.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81395265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Arun, Mina Tahmasbi Arashloo, Ahmed Saeed, Mohammad Alizadeh, H. Balakrishnan
The diversity of paths on the Internet makes it difficult for designers and operators to confidently deploy new congestion control algorithms (CCAs) without extensive real-world experiments, but such capabilities are not available to most of the networking community. And even when they are available, understanding why a CCA underperforms by trawling through massive amounts of statistical data from network connections is challenging. The history of congestion control is replete with many examples of surprising and unanticipated behaviors unseen in simulation but observed on real-world paths. In this paper, we propose initial steps toward modeling and improving our confidence in a CCA's behavior. We have developed CCAC, a tool that uses formal verification to establish certain properties of CCAs. It is able to prove hypotheses about CCAs or generate counterexamples for invalid hypotheses. With CCAC, a designer can not only gain greater confidence prior to deployment to avoid unpleasant surprises, but can also use the counterexamples to iteratively improvetheir algorithm. We have modeled additive-increase/multiplicative-decrease (AIMD), Copa, and BBR with CCAC, and describe some surprising results from the exercise.
{"title":"Toward formally verifying congestion control behavior","authors":"V. Arun, Mina Tahmasbi Arashloo, Ahmed Saeed, Mohammad Alizadeh, H. Balakrishnan","doi":"10.1145/3452296.3472912","DOIUrl":"https://doi.org/10.1145/3452296.3472912","url":null,"abstract":"The diversity of paths on the Internet makes it difficult for designers and operators to confidently deploy new congestion control algorithms (CCAs) without extensive real-world experiments, but such capabilities are not available to most of the networking community. And even when they are available, understanding why a CCA underperforms by trawling through massive amounts of statistical data from network connections is challenging. The history of congestion control is replete with many examples of surprising and unanticipated behaviors unseen in simulation but observed on real-world paths. In this paper, we propose initial steps toward modeling and improving our confidence in a CCA's behavior. We have developed CCAC, a tool that uses formal verification to establish certain properties of CCAs. It is able to prove hypotheses about CCAs or generate counterexamples for invalid hypotheses. With CCAC, a designer can not only gain greater confidence prior to deployment to avoid unpleasant surprises, but can also use the counterexamples to iteratively improvetheir algorithm. We have modeled additive-increase/multiplicative-decrease (AIMD), Copa, and BBR with CCAC, and describe some surprising results from the exercise.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81693182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}