DASHing towards Hollywood
Saba Ahsan, Stephen McQuistin, C. Perkins, J. Ott
Proceedings of the 9th ACM Multimedia Systems Conference. Published 2018-06-12. DOI: 10.1145/3204949.3204959

Adaptive streaming over HTTP has become the de facto standard for video streaming over the Internet, partly due to its ease of deployment in a heavily ossified Internet. Though performant in most on-demand scenarios, it is bound by the semantics of TCP, with reliability prioritised over timeliness, even for live video where the reverse may be desired. In this paper, we present an implementation of MPEG-DASH over TCP Hollywood, a widely deployable TCP variant for latency-sensitive applications. Out-of-order delivery in TCP Hollywood allows the client to measure, adapt, and request the next video chunk even when the current one is only partially downloaded. Furthermore, the ability to skip frames, enabled by multi-streaming and out-of-order delivery, adds resilience against stalling for any delayed messages. We observed that in high-latency and high-loss networks, TCP Hollywood significantly lowers the possibility of stall events and also supports better-quality downloads in comparison to standard TCP, with minimal changes to current adaptation algorithms.
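The early-adaptation idea described above can be sketched in a few lines: with out-of-order delivery, the client can form a throughput estimate from a partially received chunk and pick the next representation immediately, instead of waiting for in-order completion. The function names and the bitrate ladder below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: choose the next DASH representation from a
# throughput estimate taken while the current chunk is still arriving.
# The ladder and the safety margin are illustrative, not from the paper.

BITRATE_LADDER_KBPS = [250, 500, 1000, 2500, 5000]

def estimate_throughput_kbps(bytes_received: int, elapsed_s: float) -> float:
    """Instantaneous throughput from a partially downloaded chunk."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return bytes_received * 8 / 1000 / elapsed_s

def choose_next_bitrate(throughput_kbps: float, safety: float = 0.8) -> int:
    """Highest representation that fits within a safety margin of capacity."""
    budget = throughput_kbps * safety
    feasible = [b for b in BITRATE_LADDER_KBPS if b <= budget]
    return feasible[-1] if feasible else BITRATE_LADDER_KBPS[0]

# Over standard TCP the client must wait for in-order delivery before it
# can run this logic; out-of-order delivery lets it run as soon as
# enough of the current chunk has arrived to form an estimate.
```

With 500 KB received in one second (4 Mb/s), the safety-scaled budget is 3.2 Mb/s, so the sketch would select the 2500 kb/s representation.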
HTTP adaptive streaming QoE estimation with ITU-T Rec. P.1203: open databases and software
W. Robitza, Steve Goering, A. Raake, David Lindero, Gunnar Heikkilä, Jorgen Gustafsson, P. List, B. Feiten, Ulf Wüstenhagen, Marie-Neige Garcia, Kazuhisa Yamagishi, S. Broom
Published 2018-06-12. DOI: 10.1145/3204949.3208124

This paper describes an open dataset and software for ITU-T Rec. P.1203. As the first standardized Quality of Experience model for audiovisual HTTP Adaptive Streaming (HAS), it has been extensively trained and validated on over a thousand audiovisual sequences containing HAS-typical effects (such as stalling, coding artifacts, and quality switches). Our dataset comprises four of the 30 official subjective databases at a bitstream feature level. The paper also includes subjective results and the model performance. Our software for the standard has been made publicly available, and it is used for all the analyses presented. Among other previously unpublished details, we show the significant performance improvements of bitstream-based models over metadata-based ones for video quality analysis, and the robustness of combining classical models with machine-learning-based approaches for estimating user QoE.
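The "classical model combined with machine learning" idea the abstract alludes to can be illustrated as a parametric quality score plus a learned residual correction. Both component models below are toy stand-ins under assumed constants, not the P.1203 model itself.

```python
# Illustrative hybrid QoE estimator: a hand-crafted parametric score is
# corrected by a (stand-in for a) learned residual. All constants are
# assumptions for illustration, not ITU-T P.1203 coefficients.

def classical_score(bitrate_kbps: float, resolution_p: int) -> float:
    """Toy parametric audiovisual quality on a 1-5 MOS scale."""
    base = 1 + 4 * min(bitrate_kbps / 5000, 1.0)
    penalty = 0.5 if resolution_p < 720 else 0.0
    return max(1.0, base - penalty)

def ml_residual(features: dict) -> float:
    """Stand-in for a trained correction model (e.g. a random forest)."""
    return -0.3 * features.get("stall_count", 0)

def hybrid_mos(bitrate_kbps: float, resolution_p: int, features: dict) -> float:
    """Blend the parametric score with the learned correction, clamped to MOS range."""
    mos = classical_score(bitrate_kbps, resolution_p) + ml_residual(features)
    return min(5.0, max(1.0, mos))
```

The appeal of this structure, as the paper's robustness findings suggest, is that the classical part extrapolates sensibly outside the training data while the learned part absorbs effects the parametric form misses.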
MUSLIN demo: high QoE fair multi-source live streaming
Simon Da Silva, Joachim Bruneau-Queyreix, Mathias Lacaud, D. Négru, Laurent Réveillère
Published 2018-06-12. DOI: 10.1145/3204949.3208108

Delivering video content with a high and fairly shared quality of experience is a challenging task, given forecasts of drastic video traffic growth. Currently, content delivery networks provide numerous servers hosting replicas of the video content, and consuming clients are redirected to the closest server. The video content is then streamed using adaptive streaming solutions. However, some servers become overloaded, and clients may experience a poor or unfairly distributed quality of experience. In this demonstration, we showcase Muslin, a streaming solution that supports a high and fairly shared quality of experience for end users of live streaming. Muslin builds on MS-Stream, a content delivery solution in which a client can use several servers simultaneously. Muslin dynamically provisions servers, replicates content onto them, and advertises servers to clients based on real-time delivery conditions. Our demonstration shows that our approach outperforms traditional content delivery schemes, increasing fairness and quality of experience on the user side without requiring a larger underlying content delivery platform.
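The server-advertisement step described above can be sketched as ranking candidate servers by real-time delivery conditions and handing a client the best few, from which it may fetch simultaneously. The scoring weights and data layout are assumptions for illustration, not Muslin's actual policy.

```python
# Illustrative Muslin-style server advertisement: rank content servers
# by live delivery conditions (load, RTT) and advertise the top k to a
# joining client. Weights and field names are assumptions.

def rank_servers(servers):
    """servers: list of dicts with 'name', 'load' in [0, 1], and 'rtt_ms'."""
    def score(s):
        # lower is better: weight load heavily, cap RTT contribution at 200 ms
        return 0.7 * s["load"] + 0.3 * min(s["rtt_ms"] / 200.0, 1.0)
    return sorted(servers, key=score)

def advertise(servers, k=2):
    """Return the names of the k best servers for a joining client."""
    return [s["name"] for s in rank_servers(servers)[:k]]
```

Re-running this ranking as conditions change is what lets the platform steer clients away from overloading servers instead of always redirecting to the geographically closest one.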
Modeling sensory effects as first-class entities in multimedia applications
M. Josué, R. Abreu, Fábio Barreto, D. Mattos, G. Amorim, J. Santos, D. Muchaluat-Saade
Published 2018-06-12. DOI: 10.1145/3204949.3204967

Multimedia applications are usually composed of audiovisual content. Traditional multimedia conceptual models, and consequently declarative multimedia authoring languages, do not support the definition of multiple sensory effects. Multiple sensorial media (mulsemedia) applications use sensory effects that can stimulate touch, smell, and taste, in addition to hearing and sight. As a result, mulsemedia applications have usually been developed with general-purpose programming languages. To fill this gap, this paper proposes an approach for modeling sensory effects as first-class entities, enabling multimedia applications to synchronize sensorial media with interactive audiovisual content in a high-level specification. Complete descriptions of mulsemedia applications thus become possible with multimedia models and languages. To validate our ideas, an interactive mulsemedia application example is presented and specified with NCL (Nested Context Language) and Lua. Lua components translate high-level sensory effect attributes into MPEG-V SEM (Sensory Effect Metadata) files. A sensory effect simulator was developed to receive SEM files and simulate mulsemedia application rendering.
PEAT, how much am I burning?
S. Nambi, R. V. Prasad, A. R. Lua, Luis Gonzalez
Published 2018-06-12. DOI: 10.1145/3204949.3204951

Depletion of fossil fuels and the ever-increasing need for energy in residential and commercial buildings have triggered in-depth research on energy saving and energy monitoring mechanisms. Currently, users are only aware of their overall energy consumption and its cost in a shared space. Lacking information on individual energy consumption, users are unable to fine-tune their energy usage. Further, even splitting of energy costs in shared spaces does not help create awareness. With the advent of the Internet of Things (IoT) and wearable devices, the total energy consumption of a household can be apportioned to individual occupants to create awareness and consequently promote sustainable energy usage. However, providing personalized energy consumption information in real time is challenging because it requires collecting fine-grained information at various levels. In particular, identifying the user(s) operating an appliance in a shared space is a hard problem, since there are no comprehensive means of collecting accurate personalized energy consumption information. In this paper we present the Personalized Energy Apportioning Toolkit (PEAT), which accurately apportions total energy consumption to individual occupants in shared spaces. Beyond performing energy disaggregation, PEAT combines data from IoT devices such as occupants' smartphones and smartwatches to obtain fine-grained information, such as their location and activities. PEAT estimates the energy footprint of individuals by modeling the association between appliances and occupants in the household. We propose several accuracy metrics to study the performance of our toolkit. PEAT was exhaustively evaluated and validated in two multi-occupant households. It achieves 90% energy apportioning accuracy using only the occupants' location information, and around 95% when both location and activity information are available.
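The location-only apportioning mode can be illustrated with a minimal sketch: each appliance's disaggregated consumption is split evenly among the occupants located in its room while it runs. The data layout and the fallback for unattended appliances are assumptions, not PEAT's actual interface.

```python
# Illustrative location-based energy apportioning in the spirit of PEAT.
# appliance_events pairs a room with the energy (Wh) an appliance there
# consumed; occupant_rooms maps each occupant to their current room.
# Both structures are assumed for this sketch.

from collections import defaultdict

def apportion_by_location(appliance_events, occupant_rooms):
    """Split each appliance's energy evenly among occupants in its room."""
    footprint = defaultdict(float)
    for room, energy_wh in appliance_events:
        present = [o for o, r in occupant_rooms.items() if r == room]
        if not present:
            # nobody nearby: treat as a baseline load shared by everyone
            present = list(occupant_rooms)
        share = energy_wh / len(present)
        for occupant in present:
            footprint[occupant] += share
    return dict(footprint)
```

Adding activity information, as the paper does, would refine `present` from "in the room" to "plausibly operating the appliance", which is where the reported jump from 90% to 95% accuracy comes from.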
Visual object tracking in a parking garage using compressed domain analysis
Daniel Becker, Matthias Schmidt, Fernando Bombardelli da Silva, Serhan Gül, C. Hellge, Oliver Sawade, I. Radusch
Published 2018-06-12. DOI: 10.1145/3204949.3208117

Modern driver assistance systems enable a variety of use cases that rely on accurate localization of all traffic participants. Since satellite-based localization is unavailable indoors, infrastructure cameras are a promising alternative in spaces such as parking garages. This paper presents a parking management system that extends the previous work on the eValet system with low-complexity tracking on compressed video bitstreams (compressed-domain tracking). The advantages of this approach include improved robustness to partial occlusions and resource-efficient processing of compressed video bitstreams. We have separated the tasks into different modules that are integrated into a comprehensive architecture. The demonstrator setup includes a 2D visualizer illustrating the operation of the algorithms on a single camera stream and a 3D visualizer displaying the abstract object detections in a global reference frame.
The prefetch aggressiveness tradeoff in 360° video streaming
Mathias Almquist, Viktor Almquist, Vengatanathan Krishnamoorthi, Niklas Carlsson, D. Eager
Published 2018-06-12. DOI: 10.1145/3204949.3204970

With 360° video, only a limited fraction of the full view is displayed at each point in time. This has prompted the design of streaming delivery techniques that allow alternative playback qualities to be delivered for each candidate viewing direction. However, while prefetching based on the user's expected viewing direction is best done close to playback deadlines, large buffers are needed to protect against shortfalls in future available bandwidth. This results in conflicting goals and an important prefetch aggressiveness tradeoff problem regarding how far ahead in time from the current play-point prefetching should be done. This paper presents the first characterization of this tradeoff. The main contributions include an empirical characterization of head movement behavior based on data from viewing sessions of four different categories of 360° video, an optimization-based comparison of the prefetch aggressiveness tradeoffs seen for these video categories, and a data-driven discussion of further optimizations, including a novel system design that allows both tradeoff objectives to be targeted simultaneously. By qualitatively and quantitatively analyzing these tradeoffs, we provide insights into how best to design tomorrow's delivery systems for 360° videos, allowing content providers to reduce bandwidth costs and improve users' playback experiences.
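The conflicting goals above can be made concrete with a toy cost model: stall risk falls as the prefetch lead (and hence buffer) grows, while viewport misprediction rises with prediction lead time, so the total cost has an interior minimum. The two exponential cost terms and their constants are illustrative assumptions, not the paper's optimization model.

```python
# Toy model of the prefetch aggressiveness tradeoff for 360° video.
# Both cost curves and all constants are assumptions for illustration.

import math

def stall_risk(lead_s: float, k: float = 0.5) -> float:
    """Bandwidth-shortfall risk shrinks as the prefetch lead (buffer) grows."""
    return math.exp(-k * lead_s)

def misprediction_rate(lead_s: float, c: float = 0.3) -> float:
    """Chance the user has turned away grows with prediction lead time."""
    return 1 - math.exp(-c * lead_s)

def combined_cost(lead_s: float, w_stall: float = 1.0, w_waste: float = 1.0) -> float:
    return w_stall * stall_risk(lead_s) + w_waste * misprediction_rate(lead_s)

def best_lead(candidate_leads):
    """Lead time minimizing the combined cost over a candidate grid."""
    return min(candidate_leads, key=combined_cost)
```

Under these constants the optimum sits a couple of seconds ahead of the play-point: aggressive enough to absorb bandwidth dips, conservative enough that the viewport prediction still holds.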
Latency and throughput characterization of convolutional neural networks for mobile computer vision
Jussi Hanhirova, Teemu Kämäräinen, S. Seppälä, M. Siekkinen, V. Hirvisalo, Antti Ylä-Jääski
Published 2018-03-26. DOI: 10.1145/3204949.3204975

We study the performance characteristics of convolutional neural networks (CNNs) for mobile computer vision systems. CNNs have proven to be a powerful and efficient approach to implementing such systems. However, system performance depends largely on the utilization of hardware accelerators, which can speed up the underlying mathematical operations tremendously through massive parallelism. Our contribution is a performance characterization of multiple CNN-based models for object recognition and detection across several hardware platforms and software frameworks, using both local (on-device) and remote (network-side server) computation. The measurements are conducted using real workloads and real processing platforms. On the platform side, we concentrate especially on TensorFlow and TensorRT. Our measurements include embedded processors found on mobile devices and high-performance processors that can be used on the network side of mobile systems. We show that significant latency-throughput trade-offs exist, but the behavior is very complex. We demonstrate and discuss several factors that affect performance and yield this complex behavior.
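One root cause of the latency-throughput trade-off the authors characterize is batching: each inference batch pays a fixed launch/transfer overhead plus a per-item compute cost, so larger batches raise throughput while also raising per-request latency. The constants in this sketch are made up for illustration; real accelerator behavior is, as the paper shows, far more complex.

```python
# Minimal illustration of why batching creates a latency-throughput
# trade-off on accelerators. All constants are illustrative assumptions.

def batch_time_ms(batch: int, overhead_ms: float = 5.0, per_item_ms: float = 1.0) -> float:
    """Time to process one batch: fixed overhead plus per-item cost."""
    return overhead_ms + per_item_ms * batch

def latency_ms(batch: int) -> float:
    """A request waits for its whole batch to finish."""
    return batch_time_ms(batch)

def throughput_ips(batch: int) -> float:
    """Inferences per second at a given batch size."""
    return batch * 1000.0 / batch_time_ms(batch)
```

In this model, batch size 1 gives the lowest latency but amortizes the overhead over a single item, while batch 32 nearly saturates throughput at the cost of a much longer wait per request.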
Multi-codec DASH dataset
Anatoliy Zabrovskiy, Christian Feldmann, C. Timmerer
Published 2018-03-19. DOI: 10.1145/3204949.3208140

The number of bandwidth-hungry applications and services is constantly growing, and HTTP adaptive streaming of audio-visual content accounts for the majority of today's Internet traffic. Although Internet bandwidth also increases steadily, audio-visual compression remains indispensable, and services are now confronted with multiple competing video codecs. This paper proposes a multi-codec DASH dataset comprising AVC, HEVC, VP9, and AV1 in order to enable interoperability testing and streaming experiments for the efficient usage of these codecs under various conditions. We adopt state-of-the-art encoding and packaging options and also provide basic quality metrics along with the DASH segments. Additionally, we briefly introduce a multi-codec DASH scheme and possible usage scenarios. Finally, we provide a preliminary evaluation of encoding efficiency in the context of HTTP adaptive streaming services and applications.
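A multi-codec MPD is only useful if clients can negotiate which representation set to play. A plausible client-side sketch, assuming RFC 6381-style codec identifiers and a fixed efficiency preference order, looks like this; it is not the paper's scheme, just an illustration of the selection problem the dataset enables experiments on.

```python
# Hypothetical client-side codec negotiation over a multi-codec MPD:
# try codecs in (assumed) efficiency order, falling back toward AVC.
# Identifiers follow RFC 6381 sample-entry prefixes.

CODEC_PREFERENCE = ["av01", "hev1", "vp09", "avc1"]  # most to least efficient

def pick_codec(available_in_mpd, client_decoders):
    """Choose the first mutually supported codec in preference order."""
    for codec in CODEC_PREFERENCE:
        if codec in available_in_mpd and codec in client_decoders:
            return codec
    return None  # no playable representation
```

A client that can decode only AVC and VP9 would thus skip the AV1 and HEVC adaptation sets and pick VP9, trading some compression efficiency for decodability.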
Classifying flows and buffer state for YouTube's HTTP adaptive streaming service in mobile networks
D. Tsilimantos, Theodoros Karagkioules, S. Valentin
Published 2018-03-01. DOI: 10.1145/3204949.3204955

Accurate cross-layer information is very useful for optimizing mobile networks for specific applications. However, providing application-layer information to lower protocol layers has become very difficult due to the wide adoption of end-to-end encryption and the absence of cross-layer signaling standards. As an alternative, this paper presents a traffic profiling solution that passively estimates parameters of HTTP Adaptive Streaming (HAS) applications at the lower layers. By observing IP packet arrivals, our machine learning system identifies video flows and detects the state of an HAS client's playback buffer in real time. Our experiments with YouTube's mobile client show that Random Forests achieve very high accuracy even under strong variation of link quality. Since this performance is achieved at the IP level with a small, generic feature set, our approach requires no Deep Packet Inspection (DPI), comes at low complexity, and does not interfere with end-to-end encryption. Traffic profiling is thus a powerful new tool for monitoring and managing even encrypted HAS traffic in mobile networks.
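The intuition behind IP-level buffer-state detection can be shown with a toy stand-in for the paper's Random Forest: HAS clients in steady state fetch segments in on-off bursts, so long idle gaps in a high-volume downlink flow signal a full playback buffer. The features and thresholds below are hand-picked assumptions; the paper learns the decision boundaries from labeled YouTube traffic instead.

```python
# Toy threshold classifier standing in for the paper's Random Forest:
# label a flow and its client buffer state from IP-level features only.
# All thresholds are illustrative assumptions.

def flow_features(pkt_sizes, pkt_times):
    """Summarize a flow from per-packet sizes (bytes) and arrival times (s)."""
    volume = sum(pkt_sizes)
    duration = pkt_times[-1] - pkt_times[0] if len(pkt_times) > 1 else 0.0
    gaps = [b - a for a, b in zip(pkt_times, pkt_times[1:])]
    return {"volume": volume, "duration": duration, "max_gap": max(gaps, default=0.0)}

def classify(features):
    if features["volume"] < 1_000_000:      # small flows: not video
        return "non-video"
    # long idle gaps between bursts indicate an on-off download pattern,
    # i.e. the playback buffer is full enough to pause fetching
    if features["max_gap"] > 1.0:
        return "video/steady-state"
    return "video/filling"
```

Everything used here is visible despite end-to-end encryption, which is the point: packet sizes and arrival times leak enough structure to track application state without DPI.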