Digibox is a prototyping environment for IoT applications. It enables a novel scene-centric style of prototyping in which developers program an ensemble of simulated devices to capture not only their individual behaviors but also their coordinated ones, making it possible to test, debug, and evaluate an IoT application. Using Digibox, developers can download existing scenes, then customize and repurpose them for new applications, or replicate the experimental results of published research. Digibox's Kubernetes-based runtime further allows developers to scale the prototyping environment from a single laptop to a cluster, running simulated devices and scenes at a scale appropriate to the application.
{"title":"The internet of things in a laptop: rapid prototyping for IoT applications with digibox","authors":"Silvery Fu, Hong Zhang, S. Ratnasamy, I. Stoica","doi":"10.1145/3563766.3564087","DOIUrl":"https://doi.org/10.1145/3563766.3564087","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
In their unrelenting quest for lower latency, cloud providers are deploying servers closer to their customers and enterprises are adopting paid Network-as-a-Service (NaaS) offerings with performance guarantees. Unfortunately, these trends contribute to greater industry consolidation, benefiting larger companies and well-served regions while leaving little room for smaller cloud providers and enterprises to flourish. Instead, we argue that the public Internet could offer good enough performance, if only edge networks could work together to achieve better visibility and control over wide-area routing. We present Tango, a cooperative architecture where pairs of edge networks (e.g., access, enterprise, and data-center networks) collaborate to expose more wide-area paths, collect more accurate measurements, and split traffic more intelligently over the paths. Tango leverages programmable switches at the borders of the edge networks, coupled with techniques to coax BGP into exposing more paths, without requiring support from end hosts or intermediate ASes. Experiments with our preliminary Tango deployment (using IPv6 addresses and the Vultr cloud provider) show that Tango could offer much greater visibility and control over wide-area routing, allowing the public Internet to meet the needs of many modern networked applications.
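The abstract says Tango collects per-path measurements and splits traffic over the exposed paths, but it does not state the splitting policy. The following is a minimal sketch under the assumption that each path is weighted inversely to its median measured delay; `PathStats`, `split_weights`, and `best_path` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Dict, List


@dataclass
class PathStats:
    """Recent one-way delay samples (in ms) for one wide-area path."""
    name: str
    delays_ms: List[float] = field(default_factory=list)

    def record(self, delay_ms: float) -> None:
        self.delays_ms.append(delay_ms)

    def typical_delay(self) -> float:
        # The median is less sensitive to occasional delay spikes than the mean.
        return median(self.delays_ms)


def best_path(paths: List[PathStats]) -> str:
    """Name of the path with the lowest median delay."""
    return min(paths, key=lambda p: p.typical_delay()).name


def split_weights(paths: List[PathStats]) -> Dict[str, float]:
    """Assumed policy: weight each path inversely to its median delay.

    The returned weights sum to 1, so they can be read as traffic fractions.
    """
    inverse = {p.name: 1.0 / p.typical_delay() for p in paths}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


if __name__ == "__main__":
    a = PathStats("via-transit-A", [40.0, 42.0, 41.0])
    b = PathStats("via-transit-B", [20.0, 22.0, 21.0])
    print(best_path([a, b]))
    print(split_weights([a, b]))
```

A real deployment would take these decisions in the programmable border switches; the sketch only shows how measurements could feed a split.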
{"title":"It takes two to tango: cooperative edge-to-edge routing","authors":"Henry Birge-Lee, M. Apostolaki, J. Rexford","doi":"10.1145/3563766.3564107","DOIUrl":"https://doi.org/10.1145/3563766.3564107","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
Gina Yuan, David Zhang, Matthew Sotoudeh, M. Welzl, Keith Winstein
In response to ossification and privacy concerns, post-TCP transport protocols such as QUIC are designed to be "paranoid": they are opaque to meddling middleboxes because they encrypt and authenticate the header and payload, which makes it impossible for Performance-Enhancing Proxies (PEPs) to provide the same assistance as before. We propose a research agenda towards an alternate approach to PEPs: a sidecar protocol that is loosely coupled to the unchanged, opaque underlying transport protocol. The key technical challenge for sidecar protocols is how to usefully refer to the packets of the underlying connection without ossification. We have made progress on this problem by creating a tool we call a quACK (quick ACK), a concise representation of a multiset of numbers that can be used to efficiently decode the randomly encrypted packet contents a sidecar has received. We implement the quACK and discuss how to achieve several applications with this approach: alternate congestion control, ACK reduction, and PEP-to-PEP retransmission across a lossy subpath.
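The abstract describes the quACK only as a concise representation of a multiset of numbers. One construction that fits that description, assumed here for illustration, keeps power sums of packet identifiers modulo a prime: the sender and receiver each maintain the sums, and their difference determines which identifiers are missing, up to a fixed threshold. The class and function names, the modulus, and the use of plain integers as identifiers are all assumptions of this sketch.

```python
from typing import List

P = 2**31 - 1  # prime modulus (assumed); identifiers must be smaller than P


class PowerSums:
    """Running power sums x^1 .. x^t (mod P) over a multiset of identifiers."""

    def __init__(self, threshold: int) -> None:
        self.sums = [0] * threshold
        self.count = 0

    def insert(self, x: int) -> None:
        power = 1
        for i in range(len(self.sums)):
            power = power * x % P
            self.sums[i] = (self.sums[i] + power) % P
        self.count += 1


def decode_missing(sender: PowerSums, receiver: PowerSums,
                   sent_log: List[int]) -> List[int]:
    """Return the identifiers in sent_log that the receiver never saw."""
    m = sender.count - receiver.count
    if m == 0:
        return []
    if m < 0 or m > len(sender.sums):
        raise ValueError("number of missing packets exceeds the threshold")
    # Power sums of the missing multiset.
    p = [(s - r) % P for s, r in zip(sender.sums, receiver.sums)][:m]
    # Newton's identities: k * e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i.
    e = [1]
    for k in range(1, m + 1):
        acc = 0
        for i in range(1, k + 1):
            term = e[k - i] * p[i - 1] % P
            acc = (acc + term) % P if i % 2 == 1 else (acc - term) % P
        e.append(acc * pow(k, -1, P) % P)
    # Monic polynomial whose roots are the missing identifiers,
    # coefficients listed from the highest degree down.
    coeffs = [(e[j] if j % 2 == 0 else -e[j]) % P for j in range(m + 1)]
    missing = []
    for x in sent_log:
        if len(coeffs) == 1:
            break
        # Horner evaluation doubles as synthetic division by (x - root).
        quotient, acc = [], 0
        for c in coeffs:
            acc = (acc * x + c) % P
            quotient.append(acc)
        if quotient.pop() == 0:  # remainder is zero, so x is a root
            missing.append(x)
            coeffs = quotient
    return missing
```

The state exchanged is `threshold` integers plus a count, regardless of how many packets were sent, which is what makes the representation concise. Dividing out each root as it is found lets repeated identifiers be counted with their multiplicity.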
{"title":"Sidecar: in-network performance enhancements in the age of paranoid transport protocols","authors":"Gina Yuan, David Zhang, Matthew Sotoudeh, M. Welzl, Keith Winstein","doi":"10.1145/3563766.3564113","DOIUrl":"https://doi.org/10.1145/3563766.3564113","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
Ziqiang Wang, Zhuotao Liu, Xiaoliang Wang, Songtao Fu, Ke Xu
IP has made a great contribution to the development of the Internet and has become its narrow waist. However, the fixed packet processing of IP hinders the functional expansion and evolution of the Internet. To address this rigidity, our community has proposed various new L3 protocols to better support network functions at the network layer. In this paper, we propose DIP (Dynamic Internet Protocol), a novel primitive to unify these protocols. DIP builds a common network function core shared by these L3 protocols based on a new L3 function core primitive, named Field Operation (FN). With FNs, each standalone L3 protocol can be decomposed into a combination of multiple FNs, and it is also feasible to compose various FNs to realize new (derived) L3 protocols. We demonstrate the feasibility of DIP by realizing five radically different network layer protocols: the canonical IP forwarding, NDN [41], XIA [12], OPT [16], and NDN+OPT (a derived L3 protocol combining the merits of both NDN and OPT). We implement a prototype of DIP and evaluate its forwarding performance.
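The idea of decomposing an L3 protocol into Field Operations and recomposing them can be sketched as plain function composition. The operations below (`decrement_ttl`, `make_longest_prefix_match`, `make_name_lookup`) and the dictionary packet model are hypothetical stand-ins chosen for illustration; they are not the FN set defined by DIP.

```python
import ipaddress
from typing import Any, Callable, Dict, Optional

Packet = Dict[str, Any]
# A field operation reads or rewrites packet fields; None means "drop".
FieldOp = Callable[[Packet], Optional[Packet]]


def decrement_ttl(pkt: Packet) -> Optional[Packet]:
    ttl = pkt["ttl"] - 1
    return None if ttl <= 0 else {**pkt, "ttl": ttl}


def make_longest_prefix_match(routes: Dict[str, int]) -> FieldOp:
    """IP-style operation: pick the output port by longest address prefix."""
    table = sorted(
        ((ipaddress.ip_network(prefix), port) for prefix, port in routes.items()),
        key=lambda entry: entry[0].prefixlen,
        reverse=True,
    )

    def op(pkt: Packet) -> Optional[Packet]:
        dst = ipaddress.ip_address(pkt["dst"])
        for net, port in table:
            if dst in net:
                return {**pkt, "out_port": port}
        return None

    return op


def make_name_lookup(fib: Dict[str, int]) -> FieldOp:
    """NDN-like operation: pick the output port by longest content-name prefix."""

    def op(pkt: Packet) -> Optional[Packet]:
        parts = pkt["name"].split("/")
        for i in range(len(parts), 0, -1):
            prefix = "/".join(parts[:i])
            if prefix in fib:
                return {**pkt, "out_port": fib[prefix]}
        return None

    return op


def compose(*ops: FieldOp) -> FieldOp:
    """A protocol is an ordered list of field operations."""

    def protocol(pkt: Packet) -> Optional[Packet]:
        current: Optional[Packet] = pkt
        for op in ops:
            current = op(current)
            if current is None:
                return None
        return current

    return protocol


ip_forward = compose(
    decrement_ttl,
    make_longest_prefix_match({"10.0.0.0/8": 1, "10.1.0.0/16": 2}),
)
name_forward = compose(decrement_ttl, make_name_lookup({"/video": 3}))
```

Both protocols reuse `decrement_ttl` unchanged, which is the kind of sharing a common function core is meant to allow; a derived protocol would simply be another `compose(...)` over existing operations.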
{"title":"DIP: unifying network layer innovations using shared L3 core functions","authors":"Ziqiang Wang, Zhuotao Liu, Xiaoliang Wang, Songtao Fu, Ke Xu","doi":"10.1145/3563766.3564092","DOIUrl":"https://doi.org/10.1145/3563766.3564092","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
N. Galstyan, J. McCauley, H. Farid, S. Ratnasamy, S. Shenker
Common wisdom holds that once personal content such as photographs has been shared on the Internet, it will stay there forever. This paper explores how we could allow users to reclaim some degree of their privacy by "revoking" previously shared photographs, hindering (but not eliminating) any subsequent viewing or sharing by others. Our goal is not to build a system that can withstand determined efforts to subvert it, but rather to give well-intentioned users the ability to respect the privacy wishes of others. Achieving this goal at scale will eventually require the participation of large content aggregators, and they are unlikely (putting it mildly) to find our proposal compelling. We therefore propose an approach we call technology ecosystem transformation (TET). It begins with a transitional, more easily deployable (but not fully scalable) design that does not require the participation of large incumbents, but that is designed to change user and societal expectations enough that these companies would find it in their interest to adopt the approach we propose here. The intellectual challenge in this TET approach is finding transitional designs that (i) have parties willing to deploy them and (ii) once deployed, would change the incentives for the incumbents so that they would be willing to adopt the proposal.
{"title":"Global content revocation on the internet: a case study in technology ecosystem transformation","authors":"N. Galstyan, J. McCauley, H. Farid, S. Ratnasamy, S. Shenker","doi":"10.1145/3563766.3564099","DOIUrl":"https://doi.org/10.1145/3563766.3564099","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
We have developed an open-source Internet Emulator, a Python library consisting of classes for each essential element of the Internet, including autonomous system, network, host, router, BGP router, Internet exchange, etc. It also includes classes for a variety of services, including Web, DHCP, DNS, Botnet, Darknet, and Blockchain. Many other interesting network technologies can also be deployed on the emulator. Using this library, users can easily construct a miniature Internet. Although it is small, it has all the essential elements of the real Internet. The construction is compiled into Docker container files, and the emulation is executed by Docker on a single machine or on multiple cloud machines. This emulator has been used primarily for education since it was released in August 2021, but recently several research groups have started to use it for their research. In this paper, we present the design of this emulator and its applications. This work is still at an early stage, so the objective of this paper is to get feedback from the community and make the emulator more useful for research and education.
{"title":"SEED emulator: an internet emulator for research and education","authors":"Wenliang Du, Honghao Zeng, Kyungrok Won","doi":"10.1145/3563766.3564097","DOIUrl":"https://doi.org/10.1145/3563766.3564097","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
The need for high performance and custom software-based packet processing has resulted in decades of research. Most proposals bypass or replace the Linux networking stack with the unfortunate consequence of sacrificing the rich and robust functionality available within Linux and the ecosystem of management programs and control-plane software built on top of it. In this paper, we propose to rethink the design of the Linux network stack to address its shortcomings rather than creating alternative pipelines. This re-design involves (1) decomposing packet processing into a fast path and a slow path, and (2) transparently and dynamically creating a custom fast path that only implements the processing tasks currently configured. We leverage Linux's eXpress Data Path to load efficient and small fast-path modules, leaving the kernel stack to serve as the slow path. To materialize this vision, this paper introduces Transparent Network Acceleration (TNA), a prototype system that automatically generates a minimal data path based on introspection of the current networking configuration, avoiding many of the networking stack overheads in Linux while ensuring high performance and maintaining Linux's rich set of functionalities.
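The fast-path and slow-path split can be illustrated with a small model: a generator inspects the current configuration and emits a handler that covers only the configured tasks, while everything else falls through to the full stack. This is a sketch under stated assumptions; in TNA the fast path is loaded through eXpress Data Path, whereas `build_fast_path`, `slow_path`, and the configuration keys here are hypothetical Python stand-ins.

```python
from typing import Any, Callable, Dict, Tuple

Packet = Dict[str, Any]
Verdict = Tuple[str, str]  # (which path handled it, action taken)


def slow_path(pkt: Packet) -> Verdict:
    """Stand-in for the full kernel stack: it can handle any packet."""
    return ("slow", "processed:" + pkt["proto"])


def build_fast_path(config: Dict[str, Any]) -> Callable[[Packet], Verdict]:
    """Generate a handler that implements only the configured tasks."""
    handlers: Dict[str, Callable[[Packet], Verdict]] = {}

    if config.get("ipv4_forwarding"):
        routes: Dict[str, str] = config.get("routes", {})

        def forward_ipv4(pkt: Packet) -> Verdict:
            out_if = routes.get(pkt["dst"])
            if out_if is None:
                # No cached forwarding decision: let the full stack decide.
                return slow_path(pkt)
            return ("fast", "forward:" + out_if)

        handlers["ipv4"] = forward_ipv4

    def fast_path(pkt: Packet) -> Verdict:
        handler = handlers.get(pkt["proto"])
        # Anything the generated path does not implement goes to the slow path.
        return handler(pkt) if handler else slow_path(pkt)

    return fast_path


if __name__ == "__main__":
    path = build_fast_path({"ipv4_forwarding": True, "routes": {"10.0.0.2": "eth1"}})
    print(path({"proto": "ipv4", "dst": "10.0.0.2"}))
    print(path({"proto": "arp"}))
```

Because the slow path remains the fallback, regenerating the fast path after a configuration change affects only speed, not which packets can be handled.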
{"title":"Getting back what was lost in the era of high-speed software packet processing","authors":"M. Abranches, Oliver Michel, Eric Keller","doi":"10.1145/3563766.3564114","DOIUrl":"https://doi.org/10.1145/3563766.3564114","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
Acoustic sensing is a new sensing modality that senses the contexts of human targets and our surroundings using acoustic signals. It has become a hot topic in both academia and industry owing to its fine sensing granularity and the wide availability of microphones and speakers on commodity devices. While prior studies focused on addressing well-known challenges such as increasing the limited sensing range and enabling multi-target sensing, we propose a novel scheme that leverages the non-linear distortion of microphones to further boost the sensing granularity. Specifically, we observe the existence of a non-linear signal generated by the direct-path signal and the target reflection signal. We mathematically show that the non-linear chirp signal amplifies the phase variations and that this property can be utilized to improve the granularity of acoustic sensing. Experimental results show that, by properly leveraging the hardware non-linearity, the amplitude estimation error for sub-millimeter-level vibration can be reduced from 0.137 mm to 0.029 mm.
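A generic way to see how a non-linearity can amplify phase variations is the identity cos²(θ + φ) = ½ + ½·cos(2θ + 2φ): the second-order product of a tone carries twice its phase. The sketch below checks this numerically for a single tone. It is an illustration only, not the paper's derivation, which concerns chirps and the interaction of direct-path and reflected signals; the second-order coefficient 0.2 and all names are assumptions.

```python
import cmath
import math

N = 1000  # samples in the analysis window
K = 50    # the tone completes exactly K cycles in the window


def tone(phase: float) -> list:
    return [math.cos(2 * math.pi * K * n / N + phase) for n in range(N)]


def distort(samples: list) -> list:
    """Hypothetical microphone response with a small second-order term."""
    return [s + 0.2 * s * s for s in samples]


def phase_at(samples: list, cycles: int) -> float:
    """Phase of the component with `cycles` cycles per window (one DFT bin)."""
    acc = sum(s * cmath.exp(-2j * math.pi * cycles * n / N)
              for n, s in enumerate(samples))
    return cmath.phase(acc)


def observed_shift(delta: float, harmonic: int, base: float = 0.3) -> float:
    """Phase change seen at the given harmonic when the tone's phase moves by delta."""
    before = phase_at(distort(tone(base)), harmonic * K)
    after = phase_at(distort(tone(base + delta)), harmonic * K)
    return after - before


if __name__ == "__main__":
    print(observed_shift(0.01, harmonic=1))  # shift at the fundamental
    print(observed_shift(0.01, harmonic=2))  # shift at the second harmonic
```

In this toy setting a small phase change at the fundamental appears doubled at the second harmonic, so the same physical displacement produces a larger measurable phase swing.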
{"title":"Boosting the sensing granularity of acoustic signals by exploiting hardware non-linearity","authors":"Xiang Chen, Dong Li, Yiran Chen, Jie Xiong","doi":"10.1145/3563766.3564091","DOIUrl":"https://doi.org/10.1145/3563766.3564091","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
Tobias Bühler, R. Schmid, Sandro Lutz, L. Vanbever
In theory, any network operator, developer, or vendor should have access to large amounts of live network traffic for testing their solutions. In practice, though, that is not the case. Network actors instead have to use packet traces or synthetic traffic, which is highly suboptimal: today's generated traffic is unrealistic. We propose a system for generating live application traffic leveraging massive codebases such as GitHub. Our key observation is that many repositories have now become "orchestrable" thanks to the rise of container technologies. To showcase the practicality of the approach, we iterate through >293k GitHub repositories and manage to capture >74k traces containing meaningful and diverse network traffic. Based on this first success, we outline the design of a system, Dynamo, which analyzes these traces to select and orchestrate open-source projects to automatically generate live application traffic matching a user's specification.
{"title":"Generating representative, live network traffic out of millions of code repositories","authors":"Tobias Bühler, R. Schmid, Sandro Lutz, L. Vanbever","doi":"10.1145/3563766.3564084","DOIUrl":"https://doi.org/10.1145/3563766.3564084","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-14"}
Foundation models have caused a paradigm shift in the way artificial intelligence (AI) systems are built. They have had a major impact in natural language processing (NLP) and several other domains, not only reducing the amount of labeled data required, or even eliminating the need for it, but also significantly improving performance on a wide range of tasks. We argue that foundation models can have a similarly profound impact on network traffic analysis and management. More specifically, we show that network data shares several of the properties that are behind the success of foundation models in linguistics. For example, network data contains rich semantic content, and several networking tasks (e.g., traffic classification, generation of protocol implementations from specification text, anomaly detection) have counterparts in NLP (e.g., sentiment analysis, translation from natural language to code, out-of-distribution detection). However, network settings also present unique characteristics and challenges that must be overcome. Our contribution is in highlighting the opportunities and challenges at the intersection of foundation models and networking.
{"title":"Rethinking data-driven networking with foundation models: challenges and opportunities","authors":"Franck Le, M. Srivatsa, R. Ganti, V. Sekar","doi":"10.1145/3563766.3564109","DOIUrl":"https://doi.org/10.1145/3563766.3564109","journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks"},"publicationDate":"2022-11-11"}