The Chaos router is an adaptive, randomized message router for multicomputers. Adaptive routers are superior to oblivious routers, the state of the art, because they can bypass congestion and faults. Unlike other adaptive routers, however, the Chaos router has reduced the complexity along the critical path of the routing decision by using randomization to eliminate livelock protection. The foundational theory for Chaotic routing, proving that this approach is sound, has been previously developed [11]. In this paper we present the complete design of the router together with (simulated) performance figures. The results show that the Chaos router is competitive with the simple and fast oblivious routers for random loads and greatly superior for loads with hot spots.
{"title":"Chaos router: architecture and performance","authors":"S. Konstantinidou, L. Snyder","doi":"10.1145/115952.115974","DOIUrl":"https://doi.org/10.1145/115952.115974","url":null,"abstract":"The Chaos router is an adaptive, randomized message router for multlicomputers. Aclaptive routers are superior to oblivious routers, the state-of-the-art, because they can by-pass congestion and faults. unlike other adaptive routers, however, the Chaos router has reduced the complexity along the critical path of the routing decision by using randomization to eliminate livelock protection, The foundational theory for Chaotic routing, proving that, this approach is sound, has been previously de~-eloped [1 1]. In this paper we present, the complete design of the router together with (simulated) performance figures. The results show that, the Chaos t-outer is competitive with the simple and fast obli~ious routers for random loads and greatly superior for loads with hot spots.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122519081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
It has been suggested that non-scientific code has very little parallelism not already exploited by existing processors. In this paper we show that, contrary to this notion, there is actually a significant amount of unexploited parallelism in typical general purpose code. In order to exploit this parallelism, a combination of hardware and software techniques must be applied. We analyze three techniques: dynamic scheduling, speculative execution and basic block enlargement. We will show that indeed for narrow instruction words little is to be gained by applying these techniques. However, as the number of simultaneous operations increases, it becomes possible to achieve speedups of three to six on realistic processors.
{"title":"Exploiting fine-grained parallelism through a combination of hardware and software techniques","authors":"S. Melvin, Y. Patt","doi":"10.1145/115953.115981","DOIUrl":"https://doi.org/10.1145/115953.115981","url":null,"abstract":"It has been suggested that non-scientific code has very little parallelism not already exploited by existing vrocesso~s. In this ABSTRACT It has been suggested that non-scientific code has very little parallelism not already exploited by existing vrocesso~s. In this paper we show that &nt& to this notiOK (here is actually a significant amount of unexploited parallelism in typical general purpose code. In order to exploit this parallelism, a combination of hardware and software techniques must be applied. We analyze three techniques: dynamic scheduling, speculative execution and basic block enlargement. We will show that indeed for narrow instruction words little is tobegainedby applying these techniques. However, as the number of simultaneous operations increases, it becomes possible to achieve speedups of three to six on realistic processors.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125930551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We use mean value analysis models to compare representative hardware and software cache coherence schemes for a large-scale shared-memory system. Our goal is to identify the workloads for which either of the schemes is significantly better. Our methodology improves upon previous analytical studies and complements previous simulation studies by developing a common high-level workload model that is used to derive separate sets of low-level workload parameters for the two schemes. This approach allows an equitable comparison of the two schemes for a specific workload. Software coherence is attractive because the overhead of detecting stale data is transferred from runtime to compile time, and the design complexity is transferred from hardware to software. However, software schemes may perform poorly because compile-time analysis may need to be conservative, leading to unnecessary cache misses and main memory updates. In this paper, we use approximate Mean Value Analysis [U88] to compare the performance of a representative software scheme with a directory-based hardware scheme on a large-scale shared-memory system. Our results show that software schemes are comparable (in terms of processor efficiency) to hardware schemes for a wide class of programs; the only cases for which software schemes perform significantly worse than ... In a previous study comparing the performance of hardware and software coherence, Cheong and Veidenbaum used a parallelizing compiler to implement three different software coherence schemes [Che90]. For selected subroutines of seven programs, they show that the hit ratio of their most sophisticated software scheme (version control) ...
{"title":"Comparison of hardware and software cache coherence schemes","authors":"S. Adve, Vikram S. Adve, M. Hill, M. Vernon","doi":"10.1145/115953.115982","DOIUrl":"https://doi.org/10.1145/115953.115982","url":null,"abstract":"We use mean value analysis models to compare representative hardware and software cache coherence schemes for a large-scale shared-memory system. Our goal is to identify the workloads for which either of the schemes is significantly better. Our methodology improves upon previous analytical studies and complements previous simulation studies by developing a common high-level workload model that is used to derive separate sets of lowlevel workload parameters for the two schemes. This approach allows an equitable comparison of the two schemes for a specific workload. is attractive because the overhead of detecting stale data is transferred from runtime to compile time, and the design complexity is transferred from hardware to software. However. software schemes may perform poorly because compile-time analysis may need IO be conservative, leading to unnecessary cache misses and main memory updates. In this paper, we use approximate Mean Value Analysis [U881 to compare the performance of a representative software scheme with a directory-based hardware scheme on a large-scale shared-memory system. In a previous study comparing the performance of hardware and software coherence, Cheong and VeidenOur resuIi, show that software schemes are haum used a parallelizing compiler to implement three difable (in terms of processor efficiency) IO hardware schemes ferent Software coherence schemes [Che90]. For selccted for a wide class of programs. The only cases for which subroutines Of Seven programs, they show that the hit ratio software schemes ,,erform sienificmtlv worse than of their most sophisticated software scheme (version con, ~~~ ~~~~~~ r~","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130799567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper investigates communication in distributed memory multiprocessors to support task-level parallelism for real-time applications. It is shown that wormhole routing, used in second generation multicomputers, does not support task-level pipelining because its oblivious contention resolution leads to output inconsistency in which a constant throughput is not guaranteed. We propose scheduled routing, which guarantees constant throughputs by integrating task specifications with flow control. In this routing technique, communication processors provide explicit flow control by independently executing switching schedules computed at compile time. It is deadlock-free, contention-free, does not load the intermediate node memory, and makes use of the multiple equivalent paths between non-adjacent nodes. The resource allocation and scheduling problems resulting from such routing are formulated and related implementation issues are analyzed. A comparison with wormhole routing for various generalized hypercubes and tori shows that scheduled routing is effective in providing a constant throughput when wormhole routing does not and enables pipelining at higher input arrival rates.
{"title":"Scheduling pipelined communication in distributed memory multiprocessors for real-time applications","authors":"S. Shukla, D. Agrawal","doi":"10.1145/115952.115975","DOIUrl":"https://doi.org/10.1145/115952.115975","url":null,"abstract":"This paper investigates communication in distributed memory multiprocessors to support tasklevel parallelism for real-time applications. It is shown that wormhole routing, used in second generation multicomputers, does not support task-level pipelining because its oblivious contention resolution leads to output inconsistency in which a constant throughput is not guaranteed. We propose scheduled routing which guarantees constant throughputs by integrating task specifications with flow-control. In this routing technique, communication processors provide explicit flowcontrol by independently executing switching schedules computed at compile-time. It is deadlock-free, contention-free, does not load the intermediate node memory, and makes use of the multiple equivalent paths between non-adjacent nodes. The resource allocation and scheduling problems resulting from such routing are formulated and related implementation issues are anal yzed. A comparison with wormhole routing for various generalized hyp ercubes and tori shows that scheduled routing is effective in providing a constant throughput when wormhole routing does not and enables pipelining at higher input arrival rates.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122756124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most parallel applications (e.g. image processing, multigrid algorithms) in a Transputer network require a lot of communication between the processing nodes. For such applications the communication system TRACOS was developed to support data transfer between arbitrary Transputers in the network. To maximize the performance of the parallel system, its dynamic internal behavior has to be analyzed. For this purpose event-driven monitoring is an appropriate technique. It reduces the dynamic behavior of the system to sequences of events. These are recorded by a monitor system and stored as event traces. In this paper the communication system TRACOS and its performance evaluation based on monitored event traces are presented. First a synthetic workload was instrumented and monitored with the distributed hardware monitor ZM4. The results showed that the performance of TRACOS is poor for packets smaller than 4 Kbyte. Therefore, TRACOS itself was instrumented and monitored to get insight into the interactions and interdependencies of all TRACOS processes. Based on the monitoring results, TRACOS could be improved, which led to a performance increase of 25%.
{"title":"Performance evaluation of a communication system for transputer-networks based on monitored event traces","authors":"C. W. Oehlrich, Andreas Quick","doi":"10.1145/115952.115973","DOIUrl":"https://doi.org/10.1145/115952.115973","url":null,"abstract":"Most parallel applications (e.g. image processing, mtdtigrid algorithms) in a Transputer-network require a lot of communication between the processing nodes. For such applications the communication system TRACOS was developed to support data transfer between random Transputers in the network. To marimize the performance of the parallel system, its dynamic internal behavior has to be analyzed. For this purpose event-driven monitoring is an appropriate technique. It reduces the dynamic behavior of the system to sequences of events. They are recor&d by a monitor system and stored as event traces. In this paper the communication system TRACOS and its performance evaluation based on monitored event traces are presented. First a synthetic workload was instrumented and monitored with the distributed hardware monitor ZM4. The results showed that the performance of TRACOS is poor for packets smaller than 4 Kbyte. Therefore, TRACOS itself was instrumented and monitored to get insight into the interactions and interdependencies of all TRACOS processes. Based on the monitoring resuhs, TRACOS could be improved which led to a performance increase of 25T0.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A multiprocessor system with a large number of nodes can be built at low cost by combining the recent advances in high capacity channels available through optical fiber communication. A highly fault tolerant system is created with good performance characteristics at a reduction in system complexity. The system capitalizes on the optical self-routing characteristic of wavelength division multiple access to improve performance and reduce complexity. This paper examines typical optical multiple access channel implementations and shows that the star-coupled approach is superior due to optical power budget considerations. Star-coupled configurations which exhibit the optical self-routing characteristic are then studied. A hypercube based structure is introduced where optical multiple access channels span the dimensional axes. This severely reduces the required degree since only one I/O port is required per dimension, and performance is maintained through the high capacity characteristics of optical communication.
{"title":"High performance interprocessor communication through optical wavelength division multiple access channels","authors":"P. Dowd","doi":"10.1145/115952.115963","DOIUrl":"https://doi.org/10.1145/115952.115963","url":null,"abstract":"A multiprocessor system with a large number of nodes can be built at low cost by combining the recent advances in high capacity channels available through optical fiber communication. A highly fault tolerant system is created with good performance characteristics at a reduction in system complexity. The system capitalizes on the optical selfrouting characteristic of wavelength division multiple access to improve performance and reduce complexity. This paper examines typical optical multiple access channel implementations and shows that the star-coupled approach is superior due to optical power budget considerations. Star-coupled configurations which exhibit the optical self-routing characteristic are then studied. A hypercube based structure is introduced where optical multiple access channels span the dimensional axes. This severely reduces the required degree since only one 1/0 port is required per dimension, and performance is maintained through the high capacity characteristics of optical communication.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116105426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Butler, Tse-Yu Yeh, Y. Patt, M. Alsup, H. Scales, M. Shebanow
Recent studies have concluded that little parallelism (less than two operations per cycle) is available in single instruction streams. Since the amount of available parallelism should influence the design of the processor, it is important to verify how much parallelism really exists. In this study we model the execution of the SPEC benchmarks under differing resource constraints. We repeat the work of the previous researchers, and show that under the hardware resource constraints they imposed, we get similar results. On the other hand, when all constraints are removed except those required by the semantics of the program, we have found degrees of parallelism in excess of 17 instructions per cycle. Finally, and perhaps most important for exploiting single instruction stream parallelism now, we show that if the hardware is properly balanced, one can sustain from 2.0 to 5.8 instructions per cycle on a processor that is reasonable to design today.
{"title":"Single instruction stream parallelism is greater than two","authors":"M. Butler, Tse-Yu Yeh, Y. Patt, M. Alsup, H. Scales, M. Shebanow","doi":"10.1145/115952.115980","DOIUrl":"https://doi.org/10.1145/115952.115980","url":null,"abstract":"Recent studies have concluded that little parallelism (less than two operations per cycle) is available in single instruction streams. Since the amount of available parallelism should influence the design of the processor, it is important to verify how much parallelism really exists. In this study we model the execution of the SPEC benchmarks under differing resource constraints. We repeat the work of the previous researchers, and show that under the hardware resource constraints they imposed, we get similar results. On the other hand, when all constraints are removed except those ~equired by the semantics oft he program, we have found degrees of parallelism in excess of 17 instructions per cycle. Finally, and perhaps most important for exploiting single instruction stream parallelism now, we show that if the hardware is properly balanced, one can sustain from 2.0 to 5.8 instructions per cycle on a processor that is reasonable to design today.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125864330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trace-driven simulation is a commonly-used technique for evaluating multiprocessor memory systems. However, several open questions exist concerning the validity of multiprocessor traces. One is the extent to which tracing induced dilation affects the traces and consequently the results of the simulations. A second is whether the traces generated from multiple runs of the same program will yield the same simulation results. This study examines the variation in simulation results caused by both dilation and multiple runs of the same program on a shared-memory multiprocessor. Overall, our results validate the use of trace-driven simulation for these machines: variability due to dilation and multiple runs appears to be small. However, where small differences in simulated results are crucial to design decisions, multiple traces of parallel applications should be examined.
{"title":"On the validity of trace-driven simulation for multiprocessors","authors":"E. J. Koldinger, S. Eggers, H. Levy","doi":"10.1145/115953.115977","DOIUrl":"https://doi.org/10.1145/115953.115977","url":null,"abstract":"Trace-driven simulation is a commonly-used technique for evaluating multiprocessor memory systems. However, several open questions exist concerning the validity of multiprocessor traces. One is the extent to which tracing induced dilation affects the traces and consequently the results of the simulations. A second is whether the traces generated from multiple runs of the same program will yield the same simulation results. This study examines the variation in simulation results caused by both dilation and multiple runs of the same program on a shared-memory multiprocessor. Overall, our results validate the use of trace-driven simulation for these machines: variability due to dilation and multiple runs appears to be small. However, where small differences in simulated results are crucial to design decisions, multiple traces of parallel applications should be examined.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127020618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1991-04-01, DOI: 10.1109/ISCA.1991.1021601
B. R. Rau
Interleaved memories are often used to provide the high bandwidth needed by multiprocessors and high performance uniprocessors such as vector and VLIW processors. The manner in which memory locations are distributed across the memory modules has a significant influence on whether, and for which types of reference patterns, the full bandwidth of the memory system is achieved. The most common interleaved memory architecture is the sequentially interleaved memory in which successive memory locations are assigned to successive memory modules. Although such an architecture is the simplest to implement and provides good performance with strides that are odd integers, it can degrade badly in the face of even strides, especially strides that are a power of two. In a pseudo-randomly interleaved memory architecture, memory locations are assigned to the memory modules in some pseudo-random fashion in the hope that those sequences of references, which are likely to occur in practice, will end up being evenly distributed across the memory modules. The notion of polynomial interleaving modulo an irreducible polynomial is introduced as a way of achieving pseudo-random interleaving with certain attractive and provable properties. The theory behind this scheme is developed and the results of simulations are presented. Key words: supercomputer memory, parallel memory, interleaved memory, hashed memory, pseudo-random interleaving, memory buffering.
{"title":"Psfudo-randomly interleaved memory","authors":"B. R. Rau","doi":"10.1109/ISCA.1991.1021601","DOIUrl":"https://doi.org/10.1109/ISCA.1991.1021601","url":null,"abstract":"Interleaved memories are often used to provide the high bandwidth needed by multiprocessors and high performance uniprocessors such as vector and VLIW processors. The manner in which memory locations are distributed across the memory modules has a significant influence on whether, and for which types of reference patterns, the full bandwidth of the memory system is achieved. The most common interleaved memory architecture is the sequentially interleaved memory in which successive memory locations are assigned to successive memory modules. Although such an architecture is the simplest to implement and provides good performance with strides that are odd integers, it can degrade badly in the face of even strides, especially strides that are a power of two. In a pseudo-randomly interleaved memory architecture, memory locations are assigned to the memory modules in some pseudo-random fashion in the hope that those sequences of references, which are likely to occur in practice, will end up being evenly distributed across the memory modules. The notion of polynomial interleaving modulo an irreducible polynomial is introduced as a way of achieving pseudo-random interleaving with certain attractive and provable properties. The theory behind this scheme is developed and the results of simulations are presented. Kev words: supercomputer memory, parallel memory, interleaved memory, hashed memory, pseudo-random interleaving, memory buffering.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132163186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Semantic Network Array Processor (SNAP) is a parallel architecture for Artificial Intelligence (AI) applications. We have implemented a first-generation hardware/software prototype called SNAP-1 using Digital Signal Processor chips and overlapping groups of multiport memories. The design features 32 processing clusters with four to five functionally dedicated Digital Signal Processors in each cluster. Processors within clusters share a marker-processing memory while communication between clusters is implemented by a buffered message-passing scheme.
{"title":"The SNAP-1 parallel AI prototype","authors":"R. Demara, D. Moldovan","doi":"10.1145/115952.115954","DOIUrl":"https://doi.org/10.1145/115952.115954","url":null,"abstract":"The Semantic Network Array Processor (SNAP) is a parallel architecture for Artificial Intelligence (AI) applications. We haue implemented a first-generation hardware/soflware prototype called SNAP-1 using Digital Signal Processor chips and ouerlapping groups of multiport memories. The design features 32 processing clusters with four to five functionally dedicated Digital Signal Processors in each cluster. Processors within clusters share a marker-processing memo y while communication between clusters is implemented by a buffered messagepassing scheme.","PeriodicalId":187095,"journal":{"name":"[1991] Proceedings. The 18th Annual International Symposium on Computer Architecture","volume":"249 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134218740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}