
Proceedings of the 9th ACM Multimedia Systems Conference: Latest Publications

Mimir
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208129
S. Hicks, S. Eskeland, M. Lux, T. de Lange, K. Randel, Mattis Jeppsson, Konstantin Pogorelov, P. Halvorsen, M. Riegler
Automatic detection of diseases is a growing field of interest, and machine learning in the form of deep neural networks is frequently explored as a potential tool for medical video analysis. To improve "black box" understanding and to assist with the administrative duty of writing an examination report, we release automated multimedia reporting software that dissects the neural network to expose its intermediate analysis steps, i.e., we add a new level of understanding and explainability by looking into the deep learning algorithm's decision process. The presented open-source software can be used for easy retrieval and reuse of data for automatic report generation, comparison, teaching and research. As an example, we use live colonoscopy, the gold-standard examination of the large bowel, commonly performed for clinical and screening purposes. The added information is potentially of large value, and reusing the data for automatic reporting may save doctors large amounts of time.
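The core idea of exposing a network's intermediate analysis steps for a report can be illustrated with a minimal sketch. This is not Mimir's actual implementation; the toy "layers" and their names are invented for illustration.

```python
# Hypothetical sketch of the reporting idea: run an input through a model
# while recording every intermediate result, so the trace can be reused in
# an automatically generated examination report. Layer names and functions
# here are illustrative only, not part of the released software.

def forward_with_trace(x, layers):
    """Run `x` through `layers`, keeping every intermediate output."""
    trace = []
    for name, fn in layers:
        x = fn(x)
        trace.append((name, x))  # intermediate value exposed for the report
    return x, trace

# Toy two-stage "network" whose intermediate outputs we want to explain.
layers = [
    ("edge_filter", lambda v: [abs(a - b) for a, b in zip(v, v[1:])]),
    ("polyp_score", lambda v: sum(v) / max(len(v), 1)),
]

out, trace = forward_with_trace([3, 1, 4, 1, 5], layers)
report = "\n".join(f"{name}: {val}" for name, val in trace)
```

In a real system the trace entries would be feature maps or detection overlays rather than scalars, but the retrieval-and-reuse pattern is the same.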
Citations: 3
dashc: a highly scalable client emulator for DASH video
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208135
A. Reviakin, A. Zahran, C. Sreenan
In this paper we introduce a client emulator for experimenting with DASH video. dashc is a standalone, compact, easy-to-build and easy-to-use command-line software tool. The design and implementation of dashc were motivated by the pressing need to conduct network experiments with large numbers of video clients. The highly scalable dashc has low CPU and memory usage. dashc collects the necessary statistics about video delivery performance in a convenient format, facilitating thorough post hoc analysis. The code of dashc is modular, and new video adaptation algorithms can easily be added. We compare dashc to a state-of-the-art client and demonstrate its efficacy for large-scale experiments using the Mininet virtual network.
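The statistics a DASH client emulator collects per segment can be sketched with a tiny simulated session. This is an illustrative stand-in for what dashc measures, not its code; the bandwidth values and segment sizes are made-up inputs.

```python
# Minimal sketch (not dashc itself) of per-segment delivery statistics:
# download time, throughput, and buffer level after each 4-second segment.

def emulate_session(segment_bits, bandwidth_bps, segment_sec=4.0):
    buffer_sec, stats = 0.0, []
    for i, (size, bw) in enumerate(zip(segment_bits, bandwidth_bps)):
        dl_time = size / bw                        # seconds to download
        buffer_sec = max(buffer_sec - dl_time, 0.0) + segment_sec
        stats.append({"segment": i,
                      "dl_time": dl_time,
                      "throughput_bps": size / dl_time,
                      "buffer_sec": round(buffer_sec, 3)})
    return stats

stats = emulate_session([4_000_000, 8_000_000], [2_000_000, 4_000_000])
```

Logging one such record per segment in a flat format is what makes post hoc analysis across hundreds of emulated clients straightforward.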
Citations: 12
Virtual reality conferencing: multi-user immersive VR experiences on the web
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208115
S. Gunkel, H. Stokking, Martin Prins, N. V. D. Stap, F. T. Haar, O. Niamut
Virtual Reality (VR) and 360-degree video are set to become part of the future social environment, enriching and enhancing the way we share experiences and collaborate remotely. While Social VR applications are gaining momentum, most Social VR services focus on animated avatars. In this demo, we present our efforts towards Social VR services based on photo-realistic video recordings. In this demo paper, we focus on two parts: the communication between multiple people (at most three) and the integration of new media formats that represent users as 3D point clouds. We enhance a green-screen (chroma-key) cut-out of the person with depth data, allowing point-cloud-based rendering in the client. Further, the paper presents a user study with 54 people evaluating a three-person communication use case, and a technical analysis of the move towards 3D representations of users. The demo consists of two shared virtual environments to communicate and interact with others: i) a 360-degree virtual space in which users are represented as 2D video streams (with the background removed), and ii) a 3D space in which users are represented as point clouds (based on color and depth video data).
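The "chroma-key cut-out plus depth" step can be sketched in a few lines: drop green-screen pixels and lift the remaining ones to 3D using the depth map. The green test and the unit-focal-length back-projection below are simplifying assumptions for illustration, not the paper's pipeline.

```python
# Hedged sketch of the cut-out-to-point-cloud idea: keep non-green pixels
# of a frame and turn each into a colored 3D point via its depth value.

def frame_to_points(pixels, depth, width, f=1.0):
    """pixels: flat list of (r, g, b); depth: metres per pixel."""
    points = []
    for i, ((r, g, b), z) in enumerate(zip(pixels, depth)):
        if g > r and g > b:          # crude chroma key: drop green screen
            continue
        u, v = i % width, i // width
        # naive pinhole back-projection with assumed focal length f
        points.append((u * z / f, v * z / f, z, (r, g, b)))
    return points

pixels = [(10, 200, 10), (120, 90, 80), (0, 255, 0), (200, 180, 170)]
depth  = [2.0, 1.5, 2.0, 1.4]
cloud  = frame_to_points(pixels, depth, width=2)
```

A production client would of course use calibrated intrinsics and a learned matting step rather than this color heuristic.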
Citations: 48
ImmersiaTV
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3209620
David Gómez, Juan A. Núñez, Mario Montagud, S. Fernández
ImmersiaTV is an H2020 European project that targets novel forms of TV content production, delivery and consumption, enabling customizable and immersive multi-screen TV experiences. The goal is not only to provide efficient support for multi-screen scenarios, but also to achieve a seamless integration of traditional TV content formats and consumption devices with the emerging omnidirectional ones, thus opening the door to fascinating new scenarios. This paper first provides an overview of the end-to-end platform being developed in the project. Then, the created contents and the considered pilot scenarios are briefly described. Finally, the paper details the consumption part of the ImmersiaTV platform to be showcased. In particular, it enables customizable, interactive and synchronized consumption of traditional and omnidirectional contents from an opera performance, in multi-screen scenarios composed of main TVs, tablets and head-mounted displays (HMDs).
Citations: 10
Low-latency delivery of news-based video content
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208110
Jeroen van der Hooft, Dries Pauwels, C. D. Boom, Stefano Petrangeli, T. Wauters, F. Turck
Nowadays, news-based websites and portals provide significant amounts of multimedia content to accompany news stories and articles. Within this context, HTTP Adaptive Streaming is generally used to deliver video over the best-effort Internet, allowing smooth video playback and a good Quality of Experience (QoE). To stimulate user engagement with the provided content, such as browsing and switching between videos, reducing the video's startup time has become more and more important: while the current median load time is on the order of seconds, research has shown that user waiting times must remain below two seconds to achieve an acceptable QoE. We developed a framework for low-latency delivery of news-related video content, integrating four optimizations at the server side, the client side, or the application layer. Using these optimizations, the video's startup time can be reduced significantly, allowing user interaction and fast switching between available content. In this paper, we describe a proof of concept of this framework, using a large dataset from a major Belgian news provider. A dashboard is provided that allows the user to interact with available video content and assess the gains of the proposed optimizations. In particular, we demonstrate how the proposed optimizations consistently reduce the video's startup time in different mobile network scenarios. These reductions allow the news provider to improve the user's QoE, bringing the startup time well below two seconds.
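One simple way a client can respect the two-second startup budget the abstract cites is to cap the initial quality by the estimated time to fetch the first segment. This is an illustrative rule of thumb, not one of the paper's four optimizations; all numbers are examples.

```python
# Illustrative sketch: pick the highest initial bitrate whose first-segment
# download is expected to fit within the startup budget (two seconds here).

def initial_bitrate(bitrates_bps, bandwidth_bps, segment_sec, budget_sec=2.0):
    best = min(bitrates_bps)           # fall back to lowest quality
    for b in sorted(bitrates_bps):
        if (b * segment_sec) / bandwidth_bps <= budget_sec:
            best = b                   # highest rate that still fits
    return best

rate = initial_bitrate([500_000, 1_000_000, 3_000_000],
                       bandwidth_bps=4_000_000, segment_sec=2.0)
```

With 4 Mbit/s of estimated bandwidth and 2-second segments, even the 3 Mbit/s rendition fits the budget; on a slow link the rule degrades gracefully to the lowest rendition.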
Citations: 5
Cataract-101: video dataset of 101 cataract surgeries
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208137
Klaus Schöffmann, M. Taschwer, S. Sarny, Bernd Münzer, Manfred Jürgen Primus, Doris Putzgruber
Cataract surgery is one of the most frequently performed microscopic surgeries in the field of ophthalmology. The goal of this kind of surgery is to replace the human eye lens with an artificial one, an intervention that is often required due to aging. The entire surgery is performed under a microscope, but co-mounted cameras allow the procedure to be recorded and archived. Currently, the recorded videos are used postoperatively for documentation and training. An additional benefit of recording cataract videos is that they enable video analytics (i.e., manual and/or automatic video content analysis) to investigate medically relevant research questions (e.g., the cause of complications). This, however, necessitates a medical multimedia information system trained and evaluated on existing data, which is currently not publicly available. In this work we provide a public video dataset of 101 cataract surgeries that were performed by four different surgeons over a period of 9 months. These surgeons are grouped into moderately experienced and highly experienced surgeons (assistant vs. senior physicians), providing the basis for experience-based video analytics. All videos have been annotated with quasi-standardized operation phases by a senior ophthalmic surgeon.
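Consumers of such a dataset typically start by loading the per-video phase annotations. The CSV schema below (video id, phase name, start/end seconds) is a guess for illustration only; the published dataset's actual file layout may differ.

```python
# Hypothetical loader for phase annotations; the column names and the
# sample rows are invented to illustrate the access pattern, not the
# dataset's real schema.
import csv
import io

SAMPLE = """video,phase,start,end
case01,incision,0,34
case01,phacoemulsification,34,310
"""

def load_phases(text):
    """Group (phase, start, end) tuples by video id."""
    phases = {}
    for row in csv.DictReader(io.StringIO(text)):
        phases.setdefault(row["video"], []).append(
            (row["phase"], float(row["start"]), float(row["end"])))
    return phases

phases = load_phases(SAMPLE)
```

From such a structure one can, for example, compare phase durations between the assistant and senior surgeon groups.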
Citations: 34
Implementing 360 video tiled streaming system
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208119
Jangwoo Son, Dongmin Jang, Eun‐Seok Ryu
The computing power and bandwidth of current VR systems are limited compared to what high-quality VR requires. To overcome these limits, this study proposes a new viewport-dependent streaming method that transmits 360-degree videos using High Efficiency Video Coding (HEVC) and the scalability extension of HEVC (SHVC). The proposed SHVC and HEVC encoders generate bitstreams whose tiles can be transmitted independently; the bitstream generated by the proposed encoder can therefore be extracted in units of tiles. In accordance with the standard, the proposed extractor extracts the bitstream of the tiles corresponding to the viewport. The SHVC video bitstream extracted by the proposed method consists of (i) an SHVC base layer (BL) representing the entire 360-degree area and (ii) an SHVC enhancement layer (EL) for selective streaming of viewport (region-of-interest, ROI) tiles. When the proposed HEVC encoder is used, low- and high-resolution sequences are separately encoded as the BL and EL of SHVC. By streaming the BL (low-resolution) and selective EL (high-resolution) tiles for the ROI instead of the whole high-quality 360-degree video, the proposed method reduces the network bandwidth as well as the computational complexity on the decoder side. Experimental results show more than 47% bandwidth reduction.
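The tile-selection step behind such viewport-dependent streaming reduces to a rectangle-intersection test on the panorama's tile grid. The sketch below is a generic illustration of that step (the 4x2 grid and viewport rectangle are invented), not the paper's extractor.

```python
# Sketch of viewport-dependent tile selection: request the high-quality
# enhancement layer only for tiles overlapping the viewer's region of
# interest; the base layer covers everything else.

def tiles_in_viewport(cols, rows, vp):
    """vp = (x0, y0, x1, y1) in [0, 1) panorama coordinates."""
    x0, y0, x1, y1 = vp
    selected = []
    for r in range(rows):
        for c in range(cols):
            tx0, ty0 = c / cols, r / rows
            tx1, ty1 = (c + 1) / cols, (r + 1) / rows
            # keep the tile if its rectangle overlaps the viewport
            if tx0 < x1 and tx1 > x0 and ty0 < y1 and ty1 > y0:
                selected.append((c, r))
    return selected

roi = tiles_in_viewport(4, 2, (0.30, 0.0, 0.60, 0.45))
```

On a 4x2 grid this viewport touches only two of the eight tiles, which is exactly where the bandwidth saving comes from.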
Citations: 19
Improving quality and scalability of webRTC video collaboration applications
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208109
Stefano Petrangeli, Dries Pauwels, Jeroen van der Hooft, T. Wauters, F. Turck, Jürgen Slowack
Remote collaboration is common nowadays in conferencing, tele-health and remote teaching applications. To support these interactive use cases, Real-Time Communication (RTC) solutions, such as the open-source WebRTC framework, are generally used. WebRTC is peer-to-peer by design, which entails that each sending peer needs to encode a separate, independent stream for each receiving peer in the remote session. This approach is therefore expensive in terms of the number of encoders and does not scale well to a large number of users. To overcome this issue, a WebRTC-compliant framework is proposed in this paper, in which only a limited number of encoders are used at the sender side. Consequently, each encoder can transmit to a multitude of receivers at the same time. The conference controller, a centralized Selective Forwarding Unit (SFU), dynamically forwards the most suitable stream to each of the receivers, based on their bandwidth conditions. Moreover, the controller dynamically recomputes the encoding bitrates of the sender, to follow the long-term bandwidth variations of the receivers and increase the delivered video quality. The benefits of this framework are showcased with a demo implemented using the Jitsi Videobridge software, a WebRTC SFU, for the controller and the Chrome browser for the peers. In particular, we demonstrate how our framework can improve the received video quality by up to 15% compared to an approach where the encoding bitrates are static and do not change over time.
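The SFU's forwarding rule described above, matching each receiver to the best of a small set of encoded streams, can be sketched directly. This is an illustrative model of the decision, not Jitsi Videobridge code; all bitrates and peer names are invented.

```python
# Sketch of the SFU forwarding decision: with only a few encoded bitrates
# at the sender, each receiver is forwarded the highest stream that fits
# its estimated bandwidth (or the lowest stream if nothing fits).

def forward(encoded_bps, receivers_bps):
    choice = {}
    for peer, bw in receivers_bps.items():
        fitting = [b for b in encoded_bps if b <= bw]
        choice[peer] = max(fitting) if fitting else min(encoded_bps)
    return choice

plan = forward([300_000, 900_000, 2_500_000],
               {"alice": 1_200_000, "bob": 350_000, "carol": 200_000})
```

The controller would additionally adjust the three encoded bitrates over time so that they track where the receivers' bandwidths actually cluster.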
Citations: 14
4G/LTE channel quality reference signal trace data set
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3208132
Britta Meixner, Jan Willem Kleinrouweler, Pablo César
Mobile networks, especially LTE networks, are used more and more for high-bandwidth services like multimedia or video streams. The quality of the data connection plays a major role in the perceived quality of a service. Videos may be presented in low quality or experience many stalling events when the connection is too slow to buffer the next frames for playback. So far, no publicly available data set exists that contains a larger number of LTE network traces and can be used for deeper analysis. In this data set, we provide 546 traces of 5 minutes each, with a sample rate of 100 ms. Of these, 377 traces are pure LTE data. We furthermore provide an Android app to gather further traces, as well as R scripts to clean, sort, and analyze the data.
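A typical first analysis of such 100 ms channel-quality traces is a per-trace summary. The data set ships R scripts for this; the stdlib-Python sketch below shows an equivalent computation with invented sample values, making no assumption about the traces' actual file format.

```python
# Illustrative trace summary: mean bandwidth and the share of 100 ms
# samples below a playback-relevant threshold. Sample values are invented.

def summarise(kbps_samples, threshold_kbps=1500):
    mean = sum(kbps_samples) / len(kbps_samples)
    below = sum(1 for s in kbps_samples if s < threshold_kbps)
    return {"mean_kbps": mean,
            "share_below": below / len(kbps_samples)}

summary = summarise([2000, 1800, 900, 2400])
```

Over a full 5-minute trace (3000 samples at 100 ms), `share_below` gives a quick indication of how often a stream at the threshold bitrate would risk stalling.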
Cited: 20
From theory to practice: improving bitrate adaptation in the DASH reference player
Pub Date : 2018-06-12 DOI: 10.1145/3204949.3204953
Kevin Spiteri, R. Sitaraman, D. Sparacio
Modern video streaming uses adaptive bitrate (ABR) algorithms that run inside video players and continually adjust the quality (i.e., bitrate) of the video segments that are downloaded and rendered to the user. To maximize the user's quality of experience, ABR algorithms must stream at a high bitrate with little rebuffering and few bitrate oscillations. Further, a good ABR algorithm is responsive to user and network events and can be used in demanding scenarios such as low-latency live streaming. Recent research papers provide an abundance of ABR algorithms, but fall short on many of the above real-world requirements. We developed Sabre, an open-source, publicly available simulation tool that enables fast and accurate simulation of adaptive streaming environments. We used Sabre to design and evaluate BOLA-E and DYNAMIC, two novel ABR algorithms. We also developed a FAST SWITCHING algorithm that can replace segments that have already been downloaded with higher-bitrate (and thus higher-quality) segments. The new algorithms provide higher QoE to the user in terms of higher bitrate, fewer rebuffers, and smaller bitrate oscillations. In addition, these algorithms react faster to user events such as startup and seek, and respond more quickly to network events such as improvements in throughput. Further, they perform very well for live streams that require low latency, a challenging scenario for ABR algorithms. Overall, our algorithms offer superior video QoE and responsiveness for real-life adaptive video streaming compared to the state of the art. Importantly, all three algorithms presented in this paper are now part of the official DASH reference player dash.js and are being used by video providers in production environments. While our evaluation and implementation focus on the DASH environment, our algorithms are equally applicable to other adaptive streaming formats such as Apple HLS.
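The core idea behind buffer-based ABR algorithms like BOLA can be sketched in a few lines. The following is a simplified reading of a BOLA-style decision rule, not the dash.js implementation; the parameter values `V` and `gamma_p` and the kbps-based segment-size proxy are illustrative assumptions.

```python
import math

def bola_choose(bitrates, buffer_level_s, V=0.93, gamma_p=5.0):
    """Pick the index of the bitrate that maximizes a BOLA-style score:
    (V * (utility + gamma_p) - buffer_level) / bitrate.
    Utilities are logarithmic in bitrate, as in the BOLA family."""
    utilities = [math.log(b / bitrates[0]) for b in bitrates]
    best, best_score = 0, float("-inf")
    for i, b in enumerate(bitrates):
        score = (V * (utilities[i] + gamma_p) - buffer_level_s) / b
        if score > best_score:
            best, best_score = i, score
    return best
```

The rule is conservative when the buffer is low (the negative buffer term dominates, favoring small segments) and ambitious when the buffer is high, which is the qualitative behavior the abstract describes: high bitrate with few rebuffers.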
Cited: 134
Journal
Proceedings of the 9th ACM Multimedia Systems Conference