SPINE: A Safe Programmable and Integrated Network Environment
M. Fiuczynski, R. Martin, Tsutomu Owa, B. Bershad
DOI: 10.1145/319195.319197

Abstract: The emergence of fast, cheap embedded processors presents the opportunity to execute code directly on the network interface. We are developing an extensible execution environment, called SPINE, that enables applications to compute directly on the network interface. This structure allows network-oriented applications to communicate with other applications executing on the host CPU, peer devices, and remote nodes with low latency and high efficiency.

1 Introduction

Many I/O-intensive applications, such as multimedia clients, file servers, and host-based IP routers, often move large amounts of data between devices, and therefore place high I/O demands on both the host operating system and the underlying I/O subsystem. Although technology trends point to continued increases in link bandwidth, processor speed, and disk capacity, the lagging performance and scalability of I/O busses is becoming increasingly apparent for I/O-intensive applications. This performance gap exists because recent improvements in workstation performance have not been matched by similar improvements in I/O performance. The exponential growth of processor speed relative to the rest of the I/O system, however, presents the opportunity for application-specific processing to occur directly on intelligent I/O devices. Several network interface cards, such as Myricom's LANai, Alteon's ACEnic, and I2O systems, provide the infrastructure to compute on the device itself. Given the trend toward cheap, fast embedded processors (e.g., StrongARM, PowerPC, MIPS) in intelligent network interface cards, the challenge lies not so much in the hardware design as in redesigning the software architecture to match the capabilities of the raw hardware.
We are working to move application-specific functionality directly onto the network interface, thereby reducing I/O-related data and control transfers to the host system and improving overall system performance. The resulting ensemble of host CPUs and device processors forms a potentially large distributed system. In the context of our work, we are exploring how to program such a system at two levels. At one level, we are investigating how to migrate application functionality onto the network interface. Our approach is empirical: we take a monolithic application and migrate its I/O-specific functionality into a number of device extensions. An extension is code that is logically part of the application but runs directly on the network interface. At the next level, we are defining the operating system interfaces that enable applications to compute directly on an intelligent network interface. Our operating system services rely on two technologies. First, applications and extensions communicate via a message-passing model based on Active Messages [5]. Second, the extensions run in a safe execution environment, called SPINE, that is derived from the SPIN operating system [1]. Applications that will benefit from this software architecture range from those that perform streaming I/O (e.g., multimedia clients/servers and file servers), to host-based IP routers [10], to cluster-based storage management (e.g., Petal [3]), to packet filtering (e.g., Lazy Receive Processing [2]). SPINE offers developers a software architecture with the following three features that are key to implementing I/O-intensive applications efficiently:

• Device-to-device transfers. By avoiding extra copies of data, we can significantly reduce bandwidth requirements in and out of host memory, as well as halve bandwidth over a shared bus, such as PCI.
Additionally, intelligent devices can avoid unnecessary control transfers to the host system, since they can process data before transferring it to a peer device. Techniques such as SPLICE [16] have been introduced to emulate device-to-device transfers.

• Host/device protocol partitioning. Low-level protocol support for application-specific multicast [6], packet filtering (e.g., DPF [12]), and quality of service (e.g., Lazy Receive Processing [2]) has been shown to significantly improve system performance.

• Device-level memory management. An important performance aspect of a network system is the ability to transfer data directly between the network interface and application buffers. This type of support has been investigated by various projects (e.g., UTLB [11], AMII [13], and UNET/MM [14]).

The rest of this paper is organized as follows. In Section 2 we describe the technology trends that argue for the design of smarter I/O devices. In Section 3 we describe the software architecture of SPINE. In Section 4 we describe some example applications that we have built using SPINE in the context of Windows NT. In Section 5 we discuss issues in splitting applications between the host and the network interface.
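To make the message-passing model concrete, the sketch below illustrates the general shape of Active Messages-style dispatch as a device-side runtime might use it: an extension registers a handler under a slot id, and each incoming message names the handler that should run on its payload. This is an illustrative sketch only; the class and method names (HandlerTable, register, deliver) are our own and do not reflect the actual SPINE or Active Messages APIs.

```python
from typing import Callable, Dict


class HandlerTable:
    """Hypothetical device-side table mapping handler slots to extension code."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[bytes], None]] = {}

    def register(self, slot: int, handler: Callable[[bytes], None]) -> None:
        # An extension registers its handler when it is loaded onto the device.
        if slot in self._handlers:
            raise ValueError(f"handler slot {slot} already taken")
        self._handlers[slot] = handler

    def deliver(self, slot: int, payload: bytes) -> bool:
        # Dispatch loop: run the handler named by the message to completion
        # on the payload; messages naming an unknown slot are dropped.
        handler = self._handlers.get(slot)
        if handler is None:
            return False
        handler(payload)
        return True


# Example "extension": record the packets the device-side handler has seen.
seen = []
table = HandlerTable()
table.register(3, lambda payload: seen.append(payload))
table.deliver(3, b"pkt-1")
table.deliver(3, b"pkt-2")
table.deliver(9, b"pkt-3")  # no handler registered in slot 9: dropped
print(len(seen))            # number of packets the extension handled
```

The key property this model gives the runtime is that handlers run to completion on each message, so the device can process data in place (and, as above, drop or forward it) without involving the host for every packet.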