A. Bentaleb, May Lim, Mehmet N. Akcay, A. Begen, Roger Zimmermann
In September 2020, the Consumer Technology Association (CTA) published the CTA-5004: Common Media Client Data (CMCD) specification. Using this specification, a media client can convey certain information to the content delivery network servers with object requests. This information is useful in log association/analysis, quality of service/experience monitoring and delivery enhancements. This paper is the first step toward investigating the feasibility of CMCD in addressing one of the most common problems in the streaming domain: efficient use of shared bandwidth by multiple clients. To that effect, we implemented CMCD functions on an HTTP server and built a proof-of-concept system with CMCD-aware dash.js clients. We show that even a basic bandwidth allocation scheme enabled by CMCD reduces rebuffering rate and duration without noticeably sacrificing the video quality.
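As a rough illustration of how a client might attach CMCD data to an object request and how a server might use it, the sketch below serializes a few keys into a single `CMCD` query argument and then splits a shared link among clients. The key names (`bl`, `br`, `bs`, `sid`) and the query-argument form follow our reading of CTA-5004. The helper names, the example URL, and the buffer-based weighting rule are assumptions made for illustration and are not the allocation scheme evaluated in the paper.

```python
from urllib.parse import quote, urlsplit, parse_qs

def encode_cmcd(fields):
    """Serialize key/value pairs into one URL-encoded CMCD query argument."""
    parts = []
    for key in sorted(fields):                # keys are emitted in alphabetical order
        value = fields[key]
        if value is True:
            parts.append(key)                 # a true boolean is sent as the bare key
        elif value is False:
            continue                          # a false boolean is simply omitted
        elif isinstance(value, str):
            parts.append(f'{key}="{value}"')  # strings are double-quoted
        else:
            parts.append(f"{key}={value}")    # integers are written bare
    return "CMCD=" + quote(",".join(parts), safe="")

def decode_cmcd(url):
    """Parse the CMCD query argument of a request URL back into a dict.

    Naive split on commas: assumes string values contain no commas.
    """
    raw = parse_qs(urlsplit(url).query).get("CMCD", [""])[0]
    fields = {}
    for item in filter(None, raw.split(",")):
        if "=" not in item:
            fields[item] = True
            continue
        key, value = item.split("=", 1)
        fields[key] = value.strip('"') if value.startswith('"') else int(value)
    return fields

def allocate(capacity_kbps, clients):
    """Hypothetical rule: clients with a shorter buffer (bl, in ms) get a larger share."""
    weights = {name: 1.0 / (1.0 + c.get("bl", 0) / 1000.0) for name, c in clients.items()}
    total = sum(weights.values())
    return {name: capacity_kbps * w / total for name, w in weights.items()}

# Example: two clients sharing a 6000 kbps link (all values are made up).
url = "https://cdn.example/seg_1.m4s?" + encode_cmcd({"bl": 2000, "br": 3200, "sid": "a"})
shares = allocate(6000, {"a": decode_cmcd(url), "b": {"bl": 20000, "br": 3200}})
```

The point of the sketch is only that per-request client state (here, buffer length) gives the server enough information to treat clients differently; any real policy would need to be validated against rebuffering and quality metrics.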
{"title":"Common media client data (CMCD): initial findings","authors":"A. Bentaleb, May Lim, Mehmet N. Akcay, A. Begen, Roger Zimmermann","doi":"10.1145/3458306.3461444","DOIUrl":"https://doi.org/10.1145/3458306.3461444","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
R. Farahani, F. Tashtarian, A. Erfanian, C. Timmerer, M. Ghanbari, H. Hellwagner
Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. In a purely client-based HAS adaptation scheme, the lack of coordination among clients and their lack of awareness of network conditions may lead to sub-optimal user experience and resource utilization. Software Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capabilities of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients' requests and retrieve networking information in a time-slotted manner. In each time slot, these components then solve an optimization model to serve clients' requests efficiently by selecting an optimal cache server (the one with the shortest fetch time). In case of a cache miss, a client's request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) by the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [12]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
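The serving rule described above can be sketched for a single request as follows. This is a greedy simplification under stated assumptions, not the time-slotted optimization model of ES-HAS: the function name, the data layout, and the rule "use the replacement only if it is no slower than the origin" are all hypothetical.

```python
def choose_source(requested, caches, origin_time):
    """Pick where to serve one request from.

    caches: {server_name: {quality_level: fetch_time_seconds}}; a quality level
    being present in a server's dict means that representation is cached there.
    Returns (server_name, quality_level).
    """
    # Cache hit: choose the cache server with the shortest fetch time.
    hits = [(times[requested], name) for name, times in caches.items() if requested in times]
    if hits:
        _, name = min(hits)
        return name, requested

    # Cache miss, option (i): a better quality level with minimum deviation,
    # breaking ties by fetch time.
    better = [
        (quality - requested, fetch, name, quality)
        for name, times in caches.items()
        for quality, fetch in times.items()
        if quality > requested
    ]
    if better:
        _, fetch, name, quality = min(better)
        if fetch <= origin_time:  # assumed tie-break between options (i) and (ii)
            return name, quality

    # Cache miss, option (ii): the originally requested quality from the origin.
    return "origin", requested

# Example with made-up fetch times (seconds).
caches = {"edge-a": {2: 0.8, 4: 0.5}, "edge-b": {2: 0.6, 3: 0.9}}
decision = choose_source(1, caches, origin_time=2.0)
```

In the framework itself this choice is made jointly for all requests collected in a time slot, which a per-request greedy rule cannot capture.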
{"title":"ES-HAS: an edge- and SDN-assisted framework for HTTP adaptive video streaming","authors":"R. Farahani, F. Tashtarian, A. Erfanian, C. Timmerer, M. Ghanbari, H. Hellwagner","doi":"10.1145/3458306.3460997","DOIUrl":"https://doi.org/10.1145/3458306.3460997","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
Wilmer Moina-Rivera, J. Gutiérrez-Aguado, M. García-Pineda
Today, more than 85% of Internet traffic has a multimedia component. Video streaming accounts for a large part of this percentage, mainly because this type of content is provided by the most used applications on the Internet (e.g., Twitch, TikTok, Disney+, YouTube, and Netflix). Most of these platforms use HTTP Adaptive Streaming (HAS) to send this media content to end users in order to ensure a good quality of experience (QoE). However, this QoE should be guaranteed starting from the video to be transmitted, i.e., the video should have adequate quality while its bitrate is minimized before transmission. To address this issue, we present a system capable of encoding a video in several resolutions given the desired value of an objective metric. Our system includes the objective metric in the encoding loop in order to maintain the quality in all segments. The system has been tested with three videos and five resolutions for each video. Our proposal provides improvements of more than 10% in terms of video size, with similar coding times, when compared with a fixed Constant Rate Factor (CRF) encoding. A visual comparison between our proposal and a fixed CRF encoding can be seen at: https://links.uv.es/jgutierr/multiresQ
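One way to picture "the objective metric in the encoding loop" is a per-segment search for the cheapest encoder setting that still reaches the target quality. The sketch below is an assumption-laden illustration, not the authors' system: it searches over CRF values in the 0 to 51 range used by x264-style encoders, assumes quality never increases as CRF grows, and takes a hypothetical `measure_quality(crf)` callable that would encode the segment and return its metric score.

```python
def find_crf(measure_quality, target, lo=0, hi=51):
    """Return the largest CRF in [lo, hi] whose measured quality is >= target.

    Falls back to the initial `lo` if no CRF in the range reaches the target.
    Assumes measure_quality is non-increasing in CRF, so binary search applies.
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if measure_quality(mid) >= target:
            best = mid      # quality is sufficient: try an even higher (cheaper) CRF
            lo = mid + 1
        else:
            hi = mid - 1    # quality too low: back off to lower CRF values
    return best

# Stand-in for "encode this segment at this CRF and score it"; purely synthetic.
def fake_quality(crf):
    return 100.0 - 1.5 * crf

chosen = find_crf(fake_quality, target=70.0)
```

Running the search once per segment and per resolution keeps the metric near the target everywhere, whereas a fixed CRF spends bits on easy segments that do not need them.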
{"title":"Multi-resolution quality-based video coding system for DASH scenarios","authors":"Wilmer Moina-Rivera, J. Gutiérrez-Aguado, M. García-Pineda","doi":"10.1145/3458306.3460996","DOIUrl":"https://doi.org/10.1145/3458306.3460996","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
Traffic reduction for bandwidth-hungry video streaming services, such as YouTube, benefits not only subscribers struggling to stay within their contracted data limit, but also service providers when the number of people who use video streaming services increases. Because not all stakeholders who want to reduce traffic usage are willing to perform cumbersome operations, e.g., manually setting a lower resolution, we argue here that network operators should introduce a traffic pacer that provides traffic reduction services as an optional plan for subscribers. This paper proposes NetPacer, an in-network traffic pacing system that reduces traffic usage by degrading the video quality. NetPacer has two features. The first is relative pacing, which degrades the video quality relative to the initial quality by traffic shaping, thus enabling flexible quality control. The second is timely in-network video quality identification via machine-learning-based analysis of encrypted traffic. Through experiments, we demonstrate that NetPacer successfully reduces traffic by 30.8% by degrading the resolution by one level while keeping the QoE (i.e., Mean Opinion Score (MOS)) degradation below 0.268 points on average for 50 YouTube videos.
{"title":"Data diet pills: in-network video quality control system for traffic usage reduction","authors":"Anan Sawabe, Takanori Iwai, A. Nakao","doi":"10.1145/3458306.3462255","DOIUrl":"https://doi.org/10.1145/3458306.3462255","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
Babak Taraghi, A. Bentaleb, C. Timmerer, Roger Zimmermann, H. Hellwagner
Adaptive bitrate (ABR) algorithms play a crucial role in delivering the highest possible Quality of Experience (QoE) to viewers in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS, the dominant video streaming technique on the Internet, to deliver the best QoE for their users. A viewer's satisfaction relies heavily on how well the ABR algorithm of a media player can adapt the stream's quality to the current network conditions. QoE for video streaming sessions has been assessed in many research projects to give better insight into significant quality metrics such as startup delay and stall events. The ITU Telecommunication Standardization Sector (ITU-T) P.1203 quality evaluation model allows a subjective Mean Opinion Score (MOS) to be predicted algorithmically by considering various quality metrics. Subjective evaluation is the best assessment method for examining end users' opinions of the quality experienced in a video streaming session. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper's main contribution is to investigate the correspondence of subjective and objective evaluations for well-known heuristic-based ABRs.
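A common way to quantify the correspondence between subjective and model-predicted scores is to compute their Pearson correlation and root-mean-square error per session. The sketch below shows that calculation only; whether the paper uses these exact statistics is an assumption, and the MOS values in the example are invented placeholders, not results from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def rmse(xs, ys):
    """Root-mean-square error between paired scores."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Placeholder scores on the 1-5 MOS scale, one pair per streaming session.
subjective_mos = [4.2, 3.1, 2.5, 3.8, 1.9]
predicted_mos = [4.0, 3.4, 2.2, 3.9, 2.3]
agreement = pearson(subjective_mos, predicted_mos)
error = rmse(subjective_mos, predicted_mos)
```

A high correlation with a large RMSE would indicate that the model ranks sessions like the participants do but is offset in absolute terms, which is why both numbers are usually reported together.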
{"title":"Understanding quality of experience of heuristic-based HTTP adaptive bitrate algorithms","authors":"Babak Taraghi, A. Bentaleb, C. Timmerer, Roger Zimmermann, H. Hellwagner","doi":"10.1145/3458306.3458875","DOIUrl":"https://doi.org/10.1145/3458306.3458875","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
With the growing popularity of live streaming, achieving high video quality and low latency with limited uplink bandwidth has become a significant challenge. In this study, we propose Live Super-Resolution Based Video Coding (LiveSRVC), a novel video uploading framework that improves the quality of live streaming with low latency under limited uplink bandwidth. We design a new super-resolution-based key frame coding module to improve the compression efficiency. LiveSRVC dynamically selects the bitrate and the compression ratio of key frames, mitigating the influence of uplink bandwidth capacity on live streaming quality. Trace-driven emulations verify that LiveSRVC can provide the same quality while reducing the required bandwidth by up to 50% compared to the original encoding method (H.264). LiveSRVC consumes at least 10X less GPU occupation time than the method of reconstructing all frames with super-resolution.
{"title":"Higher quality live streaming under lower uplink bandwidth: an approach of super-resolution based video coding","authors":"Ying Chen, Qing Li, Aoyang Zhang, Longhao Zou, Yong Jiang, Zhimin Xu, Junlin Li, Zhenhui Yuan","doi":"10.1145/3458306.3458874","DOIUrl":"https://doi.org/10.1145/3458306.3458874","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
In the typical transcoding pipeline for adaptive video streaming, raw videos are pre-chunked and pre-encoded on the server side according to a set of resolution-bitrate or resolution-quality pairs, often called the bitrate ladder. Unlike existing heuristics, we argue that a good bitrate ladder should be optimized by considering video content features, network capacity, and storage costs on the cloud. We propose DeepLadder, a per-chunk optimization scheme that adopts a state-of-the-art deep reinforcement learning (DRL) method to optimize the bitrate ladder with respect to the above concerns. Technically, DeepLadder selects the proper setting for each video resolution autoregressively. We use over 8,000 video chunks, measure over 1,000,000 perceptual video qualities, collect real-world network traces for more than 50 hours, and build faithful virtual environments to help train DeepLadder efficiently. Across a series of comprehensive experiments on both Constant Bitrate (CBR) and Variable Bitrate (VBR)-encoded videos, we demonstrate significant improvements in average video quality, bandwidth utilization, and storage overhead in comparison to prior work, as well as the ability to be deployed in a real-world transcoding framework.
{"title":"Deep reinforced bitrate ladders for adaptive video streaming","authors":"Tianchi Huang, Ruixiao Zhang, Lifeng Sun","doi":"10.1145/3458306.3458873","DOIUrl":"https://doi.org/10.1145/3458306.3458873","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
We present a study on the impact of packet loss on dynamic 3D point cloud streaming, encoded with the MPEG Video-based Point Cloud Compression (V-PCC) standard. We show the distortion that results when different channels of the V-PCC bitstream are lost, with the loss of occupancy and geometry data impacting the quality most significantly. Our results point to the need for better error concealment techniques. We end the paper by presenting preliminary thoughts and experimental results on two naive error concealment techniques in the point cloud domain, for attributes and geometry data, respectively, and highlight the limitations of each.
{"title":"Dynamic 3D point cloud streaming: distortion and concealment","authors":"Cheng-Hao Wu, Xiner Li, R. Rajesh, Wei Tsang Ooi, Cheng-Hsin Hsu","doi":"10.1145/3458306.3458876","DOIUrl":"https://doi.org/10.1145/3458306.3458876","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
In this paper, we propose BayesMPC, an uncertainty-aware robust adaptive bitrate (ABR) algorithm based on a Bayesian neural network (BNN) and model predictive control (MPC). Specifically, to improve the ability to learn the transition probability of the network throughput, we adopt a BNN-based predictor that is able to predict the statistical distribution of future throughput from the past throughput by not only considering the aleatoric uncertainty (e.g., noise), but also capturing the epistemic uncertainty incurred by a lack of adequate training samples. We further show that by using the negative log-likelihood loss function to train this BNN-based throughput predictor, the generalization error can be minimized with the guarantee of the PAC-Bayesian theorem. Rather than a point estimate, the learnt uncertainty contributes a confidence region for the future throughput, the lower bound of which then leads to an uncertainty-aware robust MPC strategy that maximizes the worst-case user quality-of-experience (QoE) with respect to this confidence region. Finally, experimental results on three real-world network trace datasets validate the efficiency of both the proposed BNN-based predictor and the uncertainty-aware robust MPC strategy, and demonstrate superior performance compared to other baselines, in terms of both the overall QoE performance and generalization across all ranges of heterogeneous network and user conditions.
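The idea of planning against the lower bound of a throughput confidence region can be sketched as below. This is not the BayesMPC implementation: the predictive mean and standard deviation are simply passed in (in the paper they would come from the BNN), the confidence region is assumed to be a one-sided Gaussian bound with multiplier `z`, and the QoE terms and weights are illustrative choices.

```python
from itertools import product

def robust_mpc(bitrates_kbps, mean_kbps, std_kbps, buffer_s, last_kbps,
               horizon=3, chunk_s=4.0, z=1.64, rebuf_penalty=4300.0):
    """Return the bitrate for the next chunk under worst-case throughput.

    Every bitrate plan over the horizon is simulated with the throughput fixed
    at the lower confidence bound; the first step of the best plan is returned.
    """
    lower = max(mean_kbps - z * std_kbps, 1.0)   # pessimistic throughput estimate
    best_plan, best_qoe = None, float("-inf")
    for plan in product(bitrates_kbps, repeat=horizon):
        buf, prev, qoe = buffer_s, last_kbps, 0.0
        for rate in plan:
            download_s = rate * chunk_s / lower      # time to fetch one chunk
            rebuf_s = max(download_s - buf, 0.0)     # stall if the buffer drains first
            buf = max(buf - download_s, 0.0) + chunk_s
            # Illustrative QoE: bitrate reward, stall penalty, switching penalty.
            qoe += rate - rebuf_penalty * rebuf_s - abs(rate - prev)
            prev = rate
        if qoe > best_qoe:
            best_plan, best_qoe = plan, qoe
    return best_plan[0]

ladder = [300, 750, 1200, 2850]
confident = robust_mpc(ladder, mean_kbps=3000, std_kbps=100, buffer_s=8.0, last_kbps=1200)
uncertain = robust_mpc(ladder, mean_kbps=3000, std_kbps=1500, buffer_s=8.0, last_kbps=1200)
```

With the same mean prediction, the larger standard deviation pushes the lower bound down and the controller picks a more conservative bitrate, which is the behavior an uncertainty-aware strategy is meant to produce.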
{"title":"Uncertainty-aware robust adaptive video streaming with bayesian neural network and model predictive control","authors":"Nuowen Kan, Chenglin Li, Caiyi Yang, Wenrui Dai, Junni Zou, H. Xiong","doi":"10.1145/3458306.3458872","DOIUrl":"https://doi.org/10.1145/3458306.3458872","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-16"}
We investigate a future WiFi-VLC dual connectivity streaming system for 6DOF multi-user virtual reality that enables reliable high-fidelity remote scene immersion. The system integrates an edge server that uses scalable 360° tiling to adaptively split the present 360° view of a VR user into a panoramic baseline content layer and a viewport-specific enhancement content layer. The user is then served the two content layers over complementary WiFi and VLC wireless links such that the delivered viewport quality is maximized for the given WiFi and VLC transmission resources. We formally characterize the actions of the server using rate-distortion optimization that we solve at low complexity. To account for the users' mobility as they explore different 360° viewpoints of the 6DOF remote scene content and to maintain reliable high-quality VLC connectivity, we formulate dynamic VLC transmitter steering and assignment in the system as graph bottleneck matching that aims to maximize the received VLC SNR across all users. We formulate an effective low-complexity solution to this discrete combinatorial optimization problem of high complexity. The paper also contributes a first actual 6DOF body and head movement VR navigation dataset that we collected and use to assess the performance of our system via simulation experiments. These demonstrate enhanced VLC transmission performance and a gain of up to 7 dB in viewport quality over a state-of-the-art VLC cellular system (LiFi), and a gain of up to 10 dB in viewport quality over a state-of-the-art traditional wireless streaming method, for 12K-120fps 360° 6DOF VR content. Moreover, the synergistic WiFi-VLC dual connectivity of the proposed system augments its reliability over the reference method LiFi, which comprises only VLC links. These outcomes motivate further exploration and prototype implementation of our system.
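The transmitter assignment step can be illustrated with a tiny bottleneck assignment: give each user a distinct VLC transmitter so that the worst per-user SNR is as high as possible. Reading "bottleneck matching" as maximizing the minimum SNR is our interpretation, and the exhaustive search below is only a reference for small instances, not the low-complexity solution the paper formulates. The SNR values are made up.

```python
from itertools import permutations

def bottleneck_assignment(snr_db):
    """Assign one distinct transmitter to each user, maximizing the minimum SNR.

    snr_db[u][t] is the SNR user u would receive from transmitter t.
    Requires at least as many transmitters as users.
    Returns (assignment, worst_snr) where assignment[u] is user u's transmitter.
    """
    users = len(snr_db)
    transmitters = range(len(snr_db[0]))
    best, best_min = None, float("-inf")
    for perm in permutations(transmitters, users):
        worst = min(snr_db[u][t] for u, t in enumerate(perm))
        if worst > best_min:
            best, best_min = perm, worst
    return best, best_min

# Three users, three transmitters; rows are users, columns are transmitters.
snr = [
    [20, 5, 8],
    [6, 18, 7],
    [9, 4, 3],
]
assignment, worst_snr = bottleneck_assignment(snr)
```

Note that the greedy choice (each user takes its strongest transmitter) would leave the third user with an SNR of 3 here, while the bottleneck solution raises the worst case to 8 by moving the first user off its best link.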
{"title":"Wifi-VLC dual connectivity streaming system for 6DOF multi-user virtual reality","authors":"Jacob Chakareski, M. Khan","doi":"10.1145/3458306.3460999","DOIUrl":"https://doi.org/10.1145/3458306.3460999","journal":{"name":"Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video"},"publicationDate":"2021-07-02"}