
Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video: latest papers

Common media client data (CMCD): initial findings
A. Bentaleb, May Lim, Mehmet N. Akcay, A. Begen, Roger Zimmermann
In September 2020, the Consumer Technology Association (CTA) published the CTA-5004: Common Media Client Data (CMCD) specification. Using this specification, a media client can convey certain information to the content delivery network servers with object requests. This information is useful in log association/analysis, quality of service/experience monitoring and delivery enhancements. This paper is the first step toward investigating the feasibility of CMCD in addressing one of the most common problems in the streaming domain: efficient use of shared bandwidth by multiple clients. To that effect, we implemented CMCD functions on an HTTP server and built a proof-of-concept system with CMCD-aware dash.js clients. We show that even a basic bandwidth allocation scheme enabled by CMCD reduces rebuffering rate and duration without noticeably sacrificing the video quality.
DOI: https://doi.org/10.1145/3458306.3461444 (published 2021-07-16)
Citations: 16
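As a rough illustration of the CMCD entry above, the sketch below parses a CMCD query argument on the server side and splits a shared link among clients. The keys `bl` (buffer length, ms), `br` (encoded bitrate, kbps) and `sid` (session ID) come from CTA-5004; the URL, the values, and the inverse-buffer allocation rule are assumptions made for this example and are not the scheme evaluated in the paper. The parser also assumes that string values contain no commas.

```python
from urllib.parse import urlsplit, parse_qs


def parse_cmcd(url: str) -> dict:
    """Parse the CMCD query argument of a request URL into a dict."""
    query = parse_qs(urlsplit(url).query)  # parse_qs percent-decodes values
    payload = query.get("CMCD", [""])[0]
    data = {}
    for item in payload.split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            data[key] = True  # a bare key is a boolean flag
        elif value.startswith('"'):
            data[key] = value.strip('"')  # quoted string value
        else:
            try:
                data[key] = int(value)  # numeric value
            except ValueError:
                data[key] = value  # unquoted token, e.g. ot=v
    return data


def allocate(capacity_kbps: float, clients: dict) -> dict:
    """Hypothetical rule: share capacity in inverse proportion to buffer length."""
    weights = {sid: 1.0 / max(c.get("bl", 1), 1) for sid, c in clients.items()}
    total = sum(weights.values())
    return {sid: capacity_kbps * w / total for sid, w in weights.items()}


if __name__ == "__main__":
    url = "https://cdn.example.com/seg_42.m4s?CMCD=bl%3D1000%2Cbr%3D3200%2Csid%3D%22a%22"
    print(parse_cmcd(url))
    print(allocate(4000, {"a": {"bl": 1000}, "b": {"bl": 3000}}))
```

Under this rule a client that reports a short buffer receives a larger share, which is one simple way a server could try to reduce rebuffering.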
ES-HAS: an edge- and SDN-assisted framework for HTTP adaptive video streaming
R. Farahani, F. Tashtarian, A. Erfanian, C. Timmerer, M. Ghanbari, H. Hellwagner
Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. In a purely client-based HAS adaptation scheme, the lack of coordination among clients and their lack of awareness of network conditions may lead to sub-optimal user experience and resource utilization. Software Defined Networking (SDN) has recently been considered as a way to enhance the video streaming process. In this paper, we leverage the capabilities of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients' requests and retrieve networking information in a time-slotted manner. In each time slot, these components solve an optimization model to serve clients' requests efficiently by selecting an optimal cache server (the one with the shortest fetch time). In case of a cache miss, a client's request is served either (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) by the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [12]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
DOI: https://doi.org/10.1145/3458306.3460997 (published 2021-07-16)
Citations: 16
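The serving rule described in the ES-HAS entry above can be sketched as a greedy decision. This is a simplification under stated assumptions: the paper solves an optimization model per time slot, whereas the hypothetical `serve_request` below handles one request at a time and chooses between a replacement quality and the origin by comparing fetch times. The data layout and all names are invented for illustration.

```python
def serve_request(requested: int, caches: dict, origin_fetch_time: float):
    """Return (server, quality_level) for one request.

    caches maps a cache name to {"fetch_time": seconds, "levels": set of ints}.
    """
    # Cache hit: among caches holding the requested level, take the fastest.
    hits = [(c["fetch_time"], name)
            for name, c in caches.items() if requested in c["levels"]]
    if hits:
        _, name = min(hits)
        return name, requested

    # Cache miss, option (i): only better levels, with minimum deviation.
    better = [(level - requested, c["fetch_time"], name, level)
              for name, c in caches.items()
              for level in c["levels"] if level > requested]
    if better:
        _, fetch_time, name, level = min(better)
        # Assumed tie-break: use the replacement only if it is not slower
        # than fetching the original level from the origin server.
        if fetch_time <= origin_fetch_time:
            return name, level

    # Cache miss, option (ii): original level from the origin server.
    return "origin", requested


if __name__ == "__main__":
    caches = {
        "edge-a": {"fetch_time": 0.2, "levels": {1, 2}},
        "edge-b": {"fetch_time": 0.1, "levels": {2, 4}},
    }
    for level in (2, 3, 5):
        print(level, serve_request(level, caches, origin_fetch_time=0.5))
```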
Multi-resolution quality-based video coding system for DASH scenarios
Wilmer Moina-Rivera, J. Gutiérrez-Aguado, M. García-Pineda
Today, more than 85% of Internet traffic has a multimedia component. Video streaming accounts for a large part of this percentage, mainly because this type of content is provided by the most used applications on the Internet (e.g., Twitch, TikTok, Disney+, YouTube, Netflix). Most of these platforms use HTTP Adaptive Streaming (HAS) to send this media content to end users in order to ensure a good quality of experience (QoE). However, this QoE must first be guaranteed by the video to be transmitted, i.e., the video should have adequate quality while its bitrate is minimized before transmission. To address this issue, we present a system capable of encoding a video in several resolutions given the desired value of an objective metric. Our system includes the objective metric in the encoding loop in order to maintain the quality in all segments. The system has been tested with three videos and five resolutions for each video. Our proposal provides improvements of more than 10% in terms of video size, with similar coding times, when compared with a fixed Constant Rate Factor (CRF) encoding. A visual comparison between our proposal and a fixed CRF encoding can be seen at: https://links.uv.es/jgutierr/multiresQ
DOI: https://doi.org/10.1145/3458306.3460996 (published 2021-07-16)
Citations: 3
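One way to picture "the objective metric in the encoding loop" from the entry above is a per-segment search for the highest CRF that still meets a target quality score. The sketch below is an assumption, not the authors' system: `measure_quality` is a hypothetical callable standing in for "encode the segment at this CRF and score it", quality is assumed to be non-increasing in CRF, and the 0 to 51 range is the usual 8-bit x264 scale.

```python
from typing import Callable, Optional


def highest_crf_meeting_target(
    measure_quality: Callable[[int], float],
    target: float,
    crf_min: int = 0,
    crf_max: int = 51,
) -> Optional[int]:
    """Binary-search the largest CRF whose measured quality is >= target.

    Returns None if even crf_min cannot reach the target.
    """
    best = None
    lo, hi = crf_min, crf_max
    while lo <= hi:
        mid = (lo + hi) // 2
        if measure_quality(mid) >= target:
            best = mid      # target met: try a higher CRF (smaller file)
            lo = mid + 1
        else:
            hi = mid - 1    # target missed: lower the CRF
    return best


if __name__ == "__main__":
    # Toy stand-in for an encode-and-measure step, one model per resolution.
    toy_models = {"1080p": lambda crf: 100 - crf, "720p": lambda crf: 96 - crf}
    for name, model in toy_models.items():
        print(name, highest_crf_meeting_target(model, target=77))
```

Because a higher CRF generally produces a smaller file, picking the largest CRF that satisfies the target keeps quality roughly constant across segments while avoiding wasted bits.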
Data diet pills: in-network video quality control system for traffic usage reduction
Anan Sawabe, Takanori Iwai, A. Nakao
Traffic reduction for bandwidth-hungry video streaming services, such as YouTube, benefits not only subscribers struggling to avoid exceeding their contracted data limit, but also service providers as the number of people who use video streaming services increases. Because not all stakeholders who want to reduce traffic usage are willing to perform cumbersome operations, e.g., manually setting a lower resolution, we argue here that network operators should introduce a traffic pacer that provides traffic reduction services as an optional plan for subscribers. This paper proposes NetPacer, an in-network traffic pacing system that reduces traffic usage by degrading the video quality. NetPacer has two features. The first is relative pacing, which degrades the video quality relative to the initial quality by traffic shaping, thus enabling flexible quality control. The second is timely in-network video quality identification via encrypted traffic analysis using machine learning. Through experiments, we demonstrate that NetPacer successfully reduces traffic by 30.8% by degrading the resolution by one level while keeping the QoE (i.e., Mean Opinion Score (MOS)) degradation below 0.268 points on average for 50 YouTube videos.
DOI: https://doi.org/10.1145/3458306.3462255 (published 2021-07-16)
Citations: 1
Understanding quality of experience of heuristic-based HTTP adaptive bitrate algorithms
Babak Taraghi, A. Bentaleb, C. Timmerer, Roger Zimmermann, H. Hellwagner
Adaptive bitrate (ABR) algorithms play a crucial role in delivering the highest possible viewer Quality of Experience (QoE) in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS - the dominant video streaming technique on the Internet - to deliver the best QoE for their users. A viewer's delight relies heavily on how well the ABR algorithm of a media player can adapt the stream's quality to the current network conditions. QoE for video streaming sessions has been assessed in many research projects to give better insight into significant quality metrics such as startup delay and stall events. The ITU Telecommunication Standardization Sector (ITU-T) P.1203 quality evaluation model makes it possible to algorithmically predict a subjective Mean Opinion Score (MOS) by considering various quality metrics. Subjective evaluation is the best assessment method for examining end-user opinion of the experienced quality of a video streaming session. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper's main contribution is to investigate the correspondence between subjective and objective evaluations for well-known heuristic-based ABR algorithms.
DOI: https://doi.org/10.1145/3458306.3458875 (published 2021-07-16)
Citations: 14
Higher quality live streaming under lower uplink bandwidth: an approach of super-resolution based video coding
Ying Chen, Qing Li, Aoyang Zhang, Longhao Zou, Yong Jiang, Zhimin Xu, Junlin Li, Zhenhui Yuan
With the growing popularity of live streaming, achieving high video quality and low latency with limited uplink bandwidth has become a significant challenge. In this study, we propose Live Super-Resolution Based Video Coding (LiveSRVC), a novel video uploading framework that improves the quality of live streaming with low latency under limited uplink bandwidth. We design a new super-resolution-based key frame coding module to improve the coding compression efficiency. LiveSRVC dynamically selects the bitrate and the compression ratio of key frames, mitigating the influence of uplink bandwidth capacity on live streaming quality. Trace-driven emulations verify that LiveSRVC can provide the same quality while reducing the required bandwidth by up to 50% compared to the original encoding method (H.264). LiveSRVC also consumes at least 10x less GPU time than the method of reconstructing all frames with super-resolution.
DOI: https://doi.org/10.1145/3458306.3458874 (published 2021-07-16)
Citations: 11
Deep reinforced bitrate ladders for adaptive video streaming
Tianchi Huang, Ruixiao Zhang, Lifeng Sun
In the typical transcoding pipeline for adaptive video streaming, raw videos are pre-chunked and pre-encoded on the server side according to a set of resolution-bitrate or resolution-quality pairs, often called the bitrate ladder. Unlike existing heuristics, we argue that a good bitrate ladder should be optimized by considering video content features, network capacity, and storage costs on the cloud. We propose DeepLadder, a per-chunk optimization scheme which adopts a state-of-the-art deep reinforcement learning (DRL) method to optimize the bitrate ladder w.r.t. the above concerns. Technically, DeepLadder selects the proper setting for each video resolution autoregressively. We use over 8,000 video chunks, measure over 1,000,000 perceptual video qualities, collect real-world network traces for more than 50 hours, and build faithful virtual environments to help train DeepLadder efficiently. Across a series of comprehensive experiments on both Constant Bitrate (CBR) and Variable Bitrate (VBR)-encoded videos, we demonstrate significant improvements in average video quality, bandwidth utilization, and storage overhead in comparison to prior work, as well as the ability to be deployed in a real-world transcoding framework.
DOI: https://doi.org/10.1145/3458306.3458873 (published 2021-07-16)
Citations: 12
Dynamic 3D point cloud streaming: distortion and concealment
Cheng-Hao Wu, Xiner Li, R. Rajesh, Wei Tsang Ooi, Cheng-Hsin Hsu
We present a study on the impact of packet loss on dynamic 3D point cloud streaming encoded with the MPEG Video-based Point Cloud Compression (V-PCC) standard. We show the distortion when different channels of the V-PCC bitstream are lost, with the loss of occupancy and geometry data impacting the quality most significantly. Our results point to the need for better error concealment techniques. We end the paper by presenting preliminary thoughts and experimental results on two naive error concealment techniques in the point cloud domain, for attribute and geometry data, respectively, and highlight the limitations of each.
DOI: https://doi.org/10.1145/3458306.3458876 (published 2021-07-16)
Citations: 6
Uncertainty-aware robust adaptive video streaming with Bayesian neural network and model predictive control
Nuowen Kan, Chenglin Li, Caiyi Yang, Wenrui Dai, Junni Zou, H. Xiong
In this paper, we propose BayesMPC, an uncertainty-aware robust adaptive bitrate (ABR) algorithm based on a Bayesian neural network (BNN) and model predictive control (MPC). Specifically, to improve the capacity to learn the transition probability of the network throughput, we adopt a BNN-based predictor that is able to predict the statistical distribution of future throughput from past throughput by not only considering the aleatoric uncertainty (e.g., noise), but also capturing the epistemic uncertainty incurred by a lack of adequate training samples. We further show that by using the negative log-likelihood loss function to train this BNN-based throughput predictor, the generalization error can be minimized with the guarantee of the PAC-Bayesian theorem. Rather than a point estimate, the learnt uncertainty yields a confidence region for the future throughput, the lower bound of which then leads to an uncertainty-aware robust MPC strategy that maximizes the worst-case user quality-of-experience (QoE) w.r.t. this confidence region. Finally, experimental results on three real-world network trace datasets validate the efficiency of both the proposed BNN-based predictor and the uncertainty-aware robust MPC strategy, and demonstrate superior performance compared to other baselines in terms of both overall QoE and generalization across all ranges of heterogeneous network and user conditions.
本文提出了一种基于贝叶斯神经网络(BNN)和模型预测控制(MPC)的不确定性鲁棒自适应比特率(ABR)算法BayesMPC。具体而言,为了提高网络吞吐量转移概率的学习能力,我们采用了基于bnn的预测器,该预测器不仅考虑了任意不确定性(如噪声),还捕获了由于缺乏足够的训练样本而产生的认知不确定性,能够从过去的吞吐量中预测未来吞吐量的统计分布。我们进一步证明,使用负对数似然损失函数来训练基于bnn的吞吐量预测器,可以在pac -贝叶斯定理的保证下最小化泛化误差。而不是一个点估计,学习到的不确定性可以为未来吞吐量贡献一个置信区域,其下界然后导致一个不确定性感知的鲁棒MPC策略,以最大化最坏情况下的用户体验质量(QoE)。最后,在三个真实网络跟踪数据集上的实验结果验证了所提出的基于bnn的预测器和不确定性感知的鲁棒MPC策略的效率,并且在所有异构网络和用户条件范围内的总体QoE性能和泛化方面,与其他基准相比,显示出优越的性能。
DOI: https://doi.org/10.1145/3458306.3458872
Published: 2021-07-16
Citations: 10
Wifi-VLC dual connectivity streaming system for 6DOF multi-user virtual reality
Jacob Chakareski, M. Khan
We investigate a future WiFi-VLC dual connectivity streaming system for 6DOF multi-user virtual reality that enables reliable high-fidelity remote scene immersion. The system integrates an edge server that uses scalable 360° tiling to adaptively split the present 360° view of a VR user into a panoramic baseline content layer and a viewport-specific enhancement content layer. The user is then served the two content layers over complementary WiFi and VLC wireless links such that the delivered viewport quality is maximized for the given WiFi and VLC transmission resources. We formally characterize the actions of the server using rate-distortion optimization that we solve at low complexity. To account for the users' mobility as they explore different 360° viewpoints of the 6DOF remote scene content and to maintain reliable high-quality VLC connectivity, we formulate dynamic VLC transmitter steering and assignment in the system as graph bottleneck matching that aims to maximize the received VLC SNR across all users. We formulate an effective low-complexity solution to this discrete combinatorial optimization problem of high complexity. The paper also contributes a first actual 6DOF body and head movement VR navigation dataset, which we collected and use to assess the performance of our system via simulation experiments. These experiments demonstrate enhanced VLC transmission performance and a gain of up to 7 dB in viewport quality over a state-of-the-art VLC cellular system (LiFi), and a gain of up to 10 dB in viewport quality over a state-of-the-art traditional wireless streaming method, for 12K-120fps 360° 6DOF VR content. Moreover, the synergistic WiFi-VLC dual connectivity of the proposed system augments its reliability over the reference method LiFi that comprises only VLC links. These outcomes motivate further exploration and prototype implementation of our system.
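To make the bottleneck-matching formulation concrete, here is a minimal sketch of a textbook approach: a binary search over SNR thresholds combined with augmenting-path bipartite matching, so that the weakest user's SNR is maximized. The abstract does not describe the authors' own low-complexity solution, so this is a stand-in, and the SNR matrix and function names are hypothetical.

```python
def has_full_matching(snr_db, threshold):
    """Try to match every user to a distinct transmitter using links >= threshold."""
    n_users, n_tx = len(snr_db), len(snr_db[0])
    tx_owner = [None] * n_tx  # tx_owner[t] is the user currently assigned to t

    def try_assign(user, visited):
        for tx in range(n_tx):
            if snr_db[user][tx] >= threshold and tx not in visited:
                visited.add(tx)
                # Take a free transmitter, or re-route its current owner.
                if tx_owner[tx] is None or try_assign(tx_owner[tx], visited):
                    tx_owner[tx] = user
                    return True
        return False

    ok = all(try_assign(user, set()) for user in range(n_users))
    return ok, tx_owner

def bottleneck_assignment(snr_db):
    """Return (min_snr, {user: transmitter}) maximizing the weakest user's SNR."""
    candidates = sorted({value for row in snr_db for value in row})
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        ok, tx_owner = has_full_matching(snr_db, candidates[mid])
        if ok:
            assignment = {u: tx for tx, u in enumerate(tx_owner) if u is not None}
            best = (candidates[mid], assignment)
            lo = mid + 1   # feasible: try a stricter threshold
        else:
            hi = mid - 1   # infeasible: relax the threshold
    return best

# Hypothetical received SNR in dB: rows are users, columns are VLC transmitters.
snr = [
    [20, 12, 5],
    [18, 15, 7],
    [9, 14, 11],
]
best_snr, assignment = bottleneck_assignment(snr)
```

The binary search is valid because feasibility is monotone: if every user can be matched using only links at or above some threshold, the same matching remains valid for any lower threshold.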
DOI: https://doi.org/10.1145/3458306.3460999
Published: 2021-07-02
Citations: 5
Journal
Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video