Pub Date : 1996-06-01 DOI: 10.1109/M-PDT.1996.494613
C. Wu, H. Franke, Yew-Huey Liu
C. Eric Wu, Hubertus Franke, and Yew-Huey Liu, IBM T.J. Watson Research Center
Distributed parallel processing can increase system computing power beyond the limits of current uniprocessor technology. However, programming in such a system based on the message-passing programming model is much more complex than writing sequential programs. To take advantage of the underlying hardware, understanding the communication behavior of parallel programs and system responses to user applications is extremely critical. One common way of monitoring a program’s behavior is to generate trace events while executing the program. Events generated can then be used for other purposes such as debugging and program visualization. However, as we’ll see, such a method potentially requires source code modification, increases overhead, and causes clock-synchronization problems. To meet these challenges, we developed a Unified Trace Environment (UTE) for IBM SP systems. The user-level UTE trace libraries require only relinking for generating message-passing and system events. With the UTE, users can generate message-passing events with minimum overhead, and mark specific portions of the program, such as various phases, loops, and routines, for performance analysis and visualization. Most user-level trace tools for message-passing systems require source code modification to collect message-passing events. More advanced tools such as the Paradyn system require no source code modification; they insert the code for performance instrumentation into an application program during execution. However, instrumentation daemons cause substantial overhead. Collecting system events is as important as collecting message-passing events. System and I/O events such as process dispatch and page fault can reveal crucial information on system responses to user applications. The trace facility should also easily expand to trace activities from other software layers, such as parallel I/O file systems and high-level parallel languages. Such expandability enables the same trace facility to trace multiple software systems. One of the most serious problems in trace analysis for distributed parallel systems is clock synchronization. In such a system, multiple processors generate trace records, and often multiple nodes produce separate streams independently. The logical order of events might not be guaranteed in the trace because of discrepancies among local clocks. As a result, many trace facilities must do additional work to ensure consistent time stamps, thus increasing trace overhead.
The challenges of trace analysis
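The clock-synchronization problem described in the abstract can be illustrated with a small sketch. It is not the UTE implementation: the `Event` record, the `merge_streams` and `causality_violations` helpers, and the assumption that each node's clock offset is already known are all hypothetical, chosen only to show how a clock discrepancy can place a receive before its matching send in a merged trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record; field names are illustrative only."""
    node: str
    local_time: float  # timestamp read from the node's own clock (seconds)
    kind: str          # "send" or "recv"
    msg_id: int


def merge_streams(streams, offsets=None):
    """Merge per-node streams into one list ordered by adjusted time.

    offsets maps a node name to the assumed offset of its clock relative
    to a reference clock; subtracting it yields reference time.
    """
    offsets = offsets or {}
    adjusted = [
        (e.local_time - offsets.get(e.node, 0.0), e)
        for stream in streams
        for e in stream
    ]
    return [e for _, e in sorted(adjusted, key=lambda pair: pair[0])]


def causality_violations(merged):
    """Return ids of messages whose recv precedes its send in the trace."""
    sent = set()
    bad = []
    for e in merged:
        if e.kind == "send":
            sent.add(e.msg_id)
        elif e.kind == "recv" and e.msg_id not in sent:
            bad.append(e.msg_id)
    return bad


# Assumed scenario: node B's clock runs 5 ms behind node A's clock.
node_a = [Event("A", 1.000, "send", 1)]
node_b = [Event("B", 0.997, "recv", 1)]  # really happened after the send

naive = merge_streams([node_a, node_b])
fixed = merge_streams([node_a, node_b], offsets={"B": -0.005})

print(causality_violations(naive))
print(causality_violations(fixed))
```

Merging on raw local timestamps reports message 1 as received before it was sent; applying the assumed offset restores the send-then-receive order. Estimating such offsets is the additional work the abstract refers to, and this sketch sidesteps it by taking the offsets as given.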
{"title":"A Unified Trace Environment for IBM SP systems","authors":"C. Wu, H. Franke, Yew-Huey Liu","doi":"10.1109/M-PDT.1996.494613","DOIUrl":"https://doi.org/10.1109/M-PDT.1996.494613","url":null,"abstract":"C. Eric Wu, Hubertus Franke, and Yew-Huey Liu, IBM T.J. Watson Research Center. Distributed parallel processing can increase system computing power beyond the limits of current uniprocessor technology. However, programming in such a system based on the message-passing programming model is much more complex than writing sequential programs. To take advantage of the underlying hardware, understanding the communication behavior of parallel programs and system responses to user applications is extremely critical. One common way of monitoring a program’s behavior is to generate trace events while executing the program. Events generated can then be used for other purposes such as debugging and program visualization. However, as we’ll see, such a method potentially requires source code modification, increases overhead, and causes clock-synchronization problems. To meet these challenges, we developed a Unified Trace Environment for IBM SP systems. The user-level UTE trace libraries require only relinking for generating message-passing and system events. With the UTE, users can generate message-passing events with minimum overhead, and mark specific portions of the program, such as various phases, loops, and routines, for performance analysis and visualization. Most user-level trace tools for message-passing systems require source code modification to collect message-passing events. More advanced tools such as the Paradyn system require no source code modification; they insert the code for performance instrumentation into an application program during execution. However, instrumentation daemons cause substantial overhead. Collecting system events is as important as collecting message-passing events. System and I/O events such as process dispatch and page fault can reveal crucial information on system responses to user applications. The trace facility should also easily expand to trace activities from other software layers, such as parallel I/O file systems and high-level parallel languages. Such expandability enables the same trace facility to trace multiple software systems. One of the most serious problems in trace analysis for distributed parallel systems is clock synchronization. In such a system, multiple processors generate trace records, and often multiple nodes produce separate streams independently. The logical order of events might not be guaranteed in the trace because of discrepancies among local clocks. As a result, many trace facilities must do additional work to ensure consistent time stamps, thus increasing trace overhead. The challenges of trace analysis","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125388148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/MPDT.1996.7102340
J. D. Cavin
Advances in Distributed Sensor Technology, by S.S. Iyengar, L. Prasad, and Hla Min. 273 pp., $68. Prentice Hall, Upper Saddle River, N.J., 1995. ISBN 0-13-360033-5
{"title":"Advances in distributed sensor technology","authors":"J. D. Cavin","doi":"10.1109/MPDT.1996.7102340","DOIUrl":"https://doi.org/10.1109/MPDT.1996.7102340","url":null,"abstract":"Advances in Distributed Sensor Technology by S.S. Iyengar, L. Prasad, and Hla Min 273 pp. $68 Prentice Hall Upper Saddle River, N.J. 1995 ISBN 0-13-360033-5","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128385193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/M-PDT.1996.544438
L. Choi, Hock-Beng Lim, P. Yew
Compiler-directed cache coherence can help close the gap between processor and memory speed. The authors explain the concepts underlying these techniques and survey various approaches to this strategy.
{"title":"Parallel architectures: Techniques for compiler-directed cache coherence","authors":"L. Choi, Hock-Beng Lim, P. Yew","doi":"10.1109/M-PDT.1996.544438","DOIUrl":"https://doi.org/10.1109/M-PDT.1996.544438","url":null,"abstract":"Compiler-directed cache coherence can help close the gap between processor and memory speed. The authors explain the concepts underlying techniques and survey various approaches to this strategy.","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129955903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/MPDT.1996.7102338
Everett Markowska-Scott
Integrating Personal Computers in a Distributed Client-Server Environment, edited by Raman Khanna. 662 pp., $48. Prentice Hall, Upper Saddle River, N.J., 1995. ISBN 0-13-305152-8
{"title":"Integrating personal computers in a distributed client-server environment","authors":"Everett Markowska-Scott","doi":"10.1109/MPDT.1996.7102338","DOIUrl":"https://doi.org/10.1109/MPDT.1996.7102338","url":null,"abstract":"Integrating Personal Computers in a Distributed Client-Server Environment edited by Raman Khanna 662 pp. $48 Prentice Hall Upper Saddle River, N.J. 1995 ISBN 0-13-305152-8","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114067217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/M-PDT.1996.544444
J. Zalewski, M. Pernice
Topics in Advanced Scientific Computation, by Richard E. Crandall. 340 pp., $49. Springer-Verlag, New York/Telos, Santa Clara, Calif., 1996. ISBN 0-387-94473-7
{"title":"Topics in advanced scientific computation","authors":"J. Zalewski, M. Pernice","doi":"10.1109/M-PDT.1996.544444","DOIUrl":"https://doi.org/10.1109/M-PDT.1996.544444","url":null,"abstract":"Topics in Advanced Scientific Computation by Richard E. Crandall 340 pp. $49 Springer-Verlag, New York Telos, Santa Clara, Calif. 1996 ISBN 0-387-94473-7","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123135628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/MPDT.1996.7102341
N. Jha
Fault-Tolerant Computer System Design, by Dhiraj K. Pradhan. 550 pp., $72. Prentice Hall, Upper Saddle River, N.J., 1996. ISBN 0-13-057887-8
{"title":"Fault-tolerant computer system design","authors":"N. Jha","doi":"10.1109/MPDT.1996.7102341","DOIUrl":"https://doi.org/10.1109/MPDT.1996.7102341","url":null,"abstract":"Fault-Tolerant Computer System Design by Dhiraj K. Pradhan 550 pp. $72 Prentice Hall Upper Saddle River, N.J. 1996 ISBN 0-13-057887-8","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121775313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/MPDT.1996.7102337
Albert Y. Zomaya
Parallel Processing in Cellular Arrays, by Yakov Fet. 200 pp., $74.95. John Wiley & Sons, New York, 1995. ISBN 0-471-95409-8
{"title":"Parallel processing in cellular arrays","authors":"Albert Y. Zomaya","doi":"10.1109/MPDT.1996.7102337","DOIUrl":"https://doi.org/10.1109/MPDT.1996.7102337","url":null,"abstract":"Parallel Processing in Cellular Arrays by Yakov Fet 200 pp. $74.95 John Wiley & Sons New York 1995 ISBN 0-471-95409-8","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125075238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cache memory — so effective in traditional control-flow architecture — has the potential to enhance dataflow system performance as well. The authors explore the recent trend in combining dataflow and control-flow processing, which offers new alternatives in computer architecture design, and analyze cache memory's application to the dataflow environment.
{"title":"Parallel architectures: Cache memories for dataflow systems","authors":"A. Hurson, K. Kavi, B. Shirazi, Ben Lee","doi":"10.1109/88.544436","DOIUrl":"https://doi.org/10.1109/88.544436","url":null,"abstract":"Cache memory — so effective in traditional control-flow architecture — has the potential to enhance dataflow system performance as well. The authors explore the recent trend in combining dataflow and control-flow processing, which offers new alternatives in computer architecture design, and analyze cache memory's application to the dataflow environment.","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"6 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132733129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1996-01-24 DOI: 10.1109/MPDT.1996.7102339
M. Paprzycki
Parallel Computing Works!, by G.C. Fox, R.D. Williams, and P.C. Messina. 977 pp., $69.95. Morgan Kaufmann, San Francisco, 1994. ISBN 1-55860-253-4
{"title":"Parallel computing works!","authors":"M. Paprzycki","doi":"10.1109/MPDT.1996.7102339","DOIUrl":"https://doi.org/10.1109/MPDT.1996.7102339","url":null,"abstract":"Parallel Computing Works! by G.C. Fox, R.D. Williams, and P.C. Messina 977 pp. $69.95 Morgan Kaufmann San Francisco 1994 ISBN 1-55860-253-4","PeriodicalId":325213,"journal":{"name":"IEEE Parallel & Distributed Technology: Systems & Applications","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131774431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}