{"title":"Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system","authors":"Benjamin Gamsa, O. Krieger, J. Appavoo, M. Stumm","doi":"10.1145/296806.296814","DOIUrl":null,"url":null,"abstract":"We describe the design and implementation of Tornado, a new operating system designed from the ground up specifically for today's shared memory multiprocessors. The need for improved locality in the operating system is growing as multiprocessor hardware evolves, increasing the costs for cache misses and sharing, and adding complications due to NUMAness. Tornado is optimized so that locality and independence in application requests for operating system services-whetherfrom multiple sequential applications or a single parallel application-are mapped onto locality and independence in the servicing of these requests in the kernel and system servers. By contrast, previous shared memory multiprocessor operating systems all evolved from designs constructed at a time when sharing costs were low, memory latency was low and uniform, and caches were small; for these systems, concurrency was the main performance concern and locality was not an important issue. Tornado achieves this locality by starting with an object-oriented structure, where every virtual and physical resource is represented by an independent object. Locality, as well as concurrency, is further enhanced with the introduction of three key innovations: (i) clustered objects that support the partitioning of contended objects across processors, (ii) a protected procedure call facility that preserves the locality and concurrency of IPC's, and (iii) a new locking strategy that allows all locking to be encapsulated within the objects being protected and greatly simplifies the overall locking protocols. As a result of these techniques, Tornado has far better performance characteristics, particularly for multithreaded applications, than existing commercial operating systems. Tornado has been fully implemented and runs both on Toronto's NUMAchine hardware and on the SimOS simulator.","PeriodicalId":90294,"journal":{"name":"Proceedings of the -- USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Symposium on Operating Systems Design and Implementation","volume":"43 1","pages":"87-100"},"PeriodicalIF":0.0000,"publicationDate":"1999-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"217","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the -- USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Symposium on Operating Systems Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/296806.296814","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 217
Abstract
We describe the design and implementation of Tornado, a new operating system designed from the ground up specifically for today's shared memory multiprocessors. The need for improved locality in the operating system is growing as multiprocessor hardware evolves, increasing the costs for cache misses and sharing, and adding complications due to NUMAness. Tornado is optimized so that locality and independence in application requests for operating system services-whetherfrom multiple sequential applications or a single parallel application-are mapped onto locality and independence in the servicing of these requests in the kernel and system servers. By contrast, previous shared memory multiprocessor operating systems all evolved from designs constructed at a time when sharing costs were low, memory latency was low and uniform, and caches were small; for these systems, concurrency was the main performance concern and locality was not an important issue. Tornado achieves this locality by starting with an object-oriented structure, where every virtual and physical resource is represented by an independent object. Locality, as well as concurrency, is further enhanced with the introduction of three key innovations: (i) clustered objects that support the partitioning of contended objects across processors, (ii) a protected procedure call facility that preserves the locality and concurrency of IPC's, and (iii) a new locking strategy that allows all locking to be encapsulated within the objects being protected and greatly simplifies the overall locking protocols. As a result of these techniques, Tornado has far better performance characteristics, particularly for multithreaded applications, than existing commercial operating systems. Tornado has been fully implemented and runs both on Toronto's NUMAchine hardware and on the SimOS simulator.