
Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing: latest articles

Mapping decisions by fuzzy inference
A. Sodan, V. Torra
The approach presented is based on a system for mapping dynamic task tree-structures, such as occur in the relevant subfields of symbolic applications, to parallel machines. This mapping system provides multiple, and in many cases combinable, elementary strategies instead of a single universal one. The strategy configuration best matching the application characteristics, i.e. leading to optimal performance, can then be chosen. This requires establishing appropriate characteristics-oriented selection criteria that are expressive and precise enough to enable the compiler to find (close-to-)optimal configurations automatically. This paper focuses on the automatic-configuration aspect and presents the FiM system's solution to this task. FiM is implemented as a fuzzy-inference system; fuzziness allows us to capture soft classifications of application characteristics and vague certainties or degrees of adequacy about the appropriateness of strategy selections. Existing approaches to fuzzy inference had to be extended to allow fuzzy multistage reasoning. The feasibility of the fuzzy-inference approach is shown. Though developed for mapping, the FiM approach can, using the corresponding selection rules, be applied to other configuration problems in multiple-strategy systems.
DOI: 10.1109/ICAPP.1997.651509. Published: 1997-12-10.
Citations: 1
Eigenvectors-based parallelisation of nested loops with affine dependences
P. Lenders, Jingling Xue
This paper is concerned with parallelising a special class of nested loops with affine dependences. The data dependences of the program are captured in a so-called dependence matrix. Based on the eigenvalues and eigenvectors of this matrix, the proposed approach can generate a greater degree of DOALL parallelism than traditional unimodular transformations.
DOI: 10.1080/01495730108941442. Published: 1997-12-10.
Citations: 1
ILP architectures: trading hardware for software complexity
H. Corporaal
Several interesting superscalar and VLIW (very large instruction word) processors have hit the market. These processors exploit so-called instruction level parallelism (ILP); each cycle multiple operations are executed. This paper analyzes the data path complexity of ILP processors, in particular of VLIWs. It demonstrates that their complexity gets out of control when scaling to very high performance. Several methods are researched for reducing this complexity. Essentially these methods trade hardware for software complexity, i.e., performing as much as possible at compile time. This results in a new architectural approach called transport triggering. Its concept and characteristics are outlined. The application of this concept results in a number of hardware advantages, and introduces several new scheduling optimizations.
DOI: 10.1109/ICAPP.1997.651486. Published: 1997-12-10.
Citations: 3
Integrating heterogeneous databases: a distributed model
P.A. Hepner, W. Zhou
We put forward a distributed model for accessing heterogeneous database systems. Database operations requested by the user are processed in a distributed manner that takes advantage of the inherent parallelism of distributed systems, minimises network traffic and uses almost any general purpose computer on the network. Processing is not confined to DBMS sites but is provided as a distributed service. The design is modular and provides the mechanisms that may be arranged in a variety of ways providing a range of integration paradigms from loosely coupled integrations to more tightly coupled integrations.
DOI: 10.1109/ICAPP.1997.651535. Published: 1997-12-10.
Citations: 1
Parallel algorithm and architectures for two-step division-free Gaussian elimination
S. Peng, S. Sedukhin
The design of optimal array processors for solving linear systems using the two-step division-free Gaussian elimination method is considered. The two-step method improves on the one-step method in terms of numerical stability. In spite of the rather complicated computations needed at each iteration of the two-step method, we develop an innovative parallel algorithm whose data dependency graph meets the requirements for regularity and locality. Then we derive two-dimensional array processors by adopting a systematic approach to investigate the set of all admissible solutions and obtain the optimal array processors under linear time-space scheduling. The array processors are optimal in terms of the number of processing elements used.
DOI: 10.1109/ICAPP.1997.651516. Published: 1997-12-10.
Citations: 6
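As a rough illustration of the division-free elimination named in the Peng and Sedukhin abstract above, the sketch below shows the simpler one-step variant, in which each update is a cross-multiplication instead of a division. It is not the paper's two-step algorithm and says nothing about the array-processor mapping; the function name `division_free_eliminate` and the no-pivoting assumption are ours.

```python
def division_free_eliminate(a):
    """Reduce a matrix to upper-triangular form without any division.

    One-step update: m[i][j] <- m[k][k] * m[i][j] - m[i][k] * m[k][j].
    Assumption: every pivot m[k][k] is nonzero (this sketch does no pivoting).
    """
    n = len(a)
    m = [row[:] for row in a]  # work on a copy, leave the input untouched
    for k in range(n - 1):
        if m[k][k] == 0:
            raise ValueError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            # m[i][k] is read for every j and only cleared afterwards
            for j in range(k + 1, len(m[i])):
                m[i][j] = m[k][k] * m[i][j] - m[i][k] * m[k][j]
            m[i][k] = 0
    return m


if __name__ == "__main__":
    for row in division_free_eliminate([[2, 1, 1], [4, 3, 3], [8, 7, 9]]):
        print(row)
```

With integer input every intermediate value stays an integer, which is the point of avoiding division; the price is that entries can grow quickly in magnitude.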
Parallel algorithms for spatial data partition and join processing
Yanchun Zhang, Jitian Xiao, A. Roberts
The spatial join operations combine two sets of spatial data by their spatial relationships. They are among the most important, yet most time-consuming operations in spatial databases. We consider the problem of binary polygon intersection joins based on the filter-and-refine strategy. Our objective is to minimize the I/O cost and the response time for the refinement step. First, a graph model is proposed to formalize the refinement cost and matrix-based sequential data partition algorithms are introduced. Then a parallel data partitioning algorithm is developed with a detailed complexity analysis. Based on the data partition results, a distribution algorithm is also proposed for scheduling parallel spatial join processing.
DOI: 10.1109/ICAPP.1997.651536. Published: 1997-12-10.
Citations: 1
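The filter-and-refine strategy mentioned in the Zhang, Xiao and Roberts abstract above can be sketched sequentially as follows. This is a minimal sketch under our own assumptions: polygons are simple, given as lists of (x, y) vertices, the filter uses minimum bounding rectangles (MBRs), and the refinement is a brute-force edge test. It does not reproduce the paper's graph model, partitioning, or scheduling algorithms, and all function names are hypothetical.

```python
def mbr(poly):
    """Minimum bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def mbrs_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _on_segment(p, q, r):
    """True if r, already known to be collinear with pq, lies on segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(p1, q1, p2, q2):
    o1, o2 = _orient(p1, q1, p2), _orient(p1, q1, q2)
    o3, o4 = _orient(p2, q2, p1), _orient(p2, q2, q1)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and _on_segment(p1, q1, p2))
            or (o2 == 0 and _on_segment(p1, q1, q2))
            or (o3 == 0 and _on_segment(p2, q2, p1))
            or (o4 == 0 and _on_segment(p2, q2, q1)))


def point_in_polygon(pt, poly):
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def polygons_intersect(a, b):
    edges_a = [(a[i], a[(i + 1) % len(a)]) for i in range(len(a))]
    edges_b = [(b[i], b[(i + 1) % len(b)]) for i in range(len(b))]
    if any(segments_intersect(p, q, r, s) for p, q in edges_a for r, s in edges_b):
        return True
    # No edges cross: one polygon may still lie entirely inside the other.
    return point_in_polygon(a[0], b) or point_in_polygon(b[0], a)


def spatial_join(r_polys, s_polys):
    """Return (candidate pairs after the filter step, pairs confirmed by refinement)."""
    r_boxes = [mbr(p) for p in r_polys]
    s_boxes = [mbr(p) for p in s_polys]
    candidates = [(i, j) for i, rb in enumerate(r_boxes)
                  for j, sb in enumerate(s_boxes) if mbrs_overlap(rb, sb)]
    confirmed = [(i, j) for i, j in candidates
                 if polygons_intersect(r_polys[i], s_polys[j])]
    return candidates, confirmed
```

The gap between the candidate list and the confirmed list is the set of false hits that the refinement step has to pay for, which is the cost the abstract sets out to minimize.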
Page-mapping techniques for CC-NUMA multiprocessors
J. Huang, G. Jin, Zhiyuan Li
Careful page mapping has been shown in the past to be effective for reducing cache conflicts on both uniprocessor and Uniform Memory Access (UMA) multiprocessors. This paper extends previous page-mapping schemes to the more recent Cache-Coherent Non-Uniform Memory Access (CC-NUMA) multiprocessors. These extensions maintain the program's data-task affinity, which is important to CC-NUMA, while reducing cache set conflicts by carefully selecting the page frames. Using an execution-driven simulator that simulates a CC-NUMA machine with a 4-MB secondary cache and a 16-KB primary cache on each of the 4-issue super-scalar processors, we find that, when non-coherence cache misses are relatively heavy, it is quite important for page mapping to preserve the compiler-generated memory module ID (MID), which determines data distribution among the processors. We also find that straight application of page-coloring performs worse than bin-hopping by 10-45%, while by hashing the page color with part of the MID, page-coloring can perform close to bin-hopping.
DOI: 10.1109/ICAPP.1997.651482. Published: 1997-12-10.
Citations: 0
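The three page-frame selection policies contrasted in the Huang, Jin and Li abstract above can be sketched in a few lines. This is an illustration only: the 4-KB page size, the direct-mapped cache, the XOR hash, and the `mid_bits` parameter are our assumptions, since the abstract does not give the page size, the associativity, or the exact hash function.

```python
def num_colors(cache_bytes, page_bytes, associativity):
    """Number of page colors: distinct page-sized regions of cache sets."""
    return cache_bytes // (page_bytes * associativity)


def page_coloring(vpn, colors):
    """Classic page coloring: the color follows the virtual page number."""
    return vpn % colors


class BinHopping:
    """Bin hopping: colors are handed out round-robin in page-fault order."""

    def __init__(self, colors):
        self.colors = colors
        self._next = 0

    def color_for_fault(self):
        color = self._next
        self._next = (self._next + 1) % self.colors
        return color


def hashed_coloring(vpn, mid, colors, mid_bits=2):
    """Page coloring with the color hashed by the low bits of the memory module ID.

    The XOR hash and the number of MID bits used are hypothetical choices.
    """
    low_mid = mid & ((1 << mid_bits) - 1)
    return (vpn ^ low_mid) % colors


if __name__ == "__main__":
    colors = num_colors(4 * 2**20, 4096, 1)  # 4-MB cache, assumed 4-KB pages
    faults = [0, colors, 2 * colors]         # virtual pages that alias under coloring
    print([page_coloring(v, colors) for v in faults])
    hopper = BinHopping(colors)
    print([hopper.color_for_fault() for _ in faults])
```

Under plain page coloring the three faulting pages above all land on the same color, while bin hopping spreads them over consecutive colors, which is the kind of conflict the abstract's comparison is about.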
Active Expressions: a framework for concurrency
M. De Simone, Ashutosh Kumar Singh
Active Expressions (Ae) is a language-based model for the instantiation of type-safe concurrent applications. Using facilities included in modern object-oriented languages, Ae allows the definition of communication and synchronization patterns that, when combined with user-provided functionality through well-defined interfaces, instantiate complete concurrent applications. The approach has two unique characteristics: First, it shows that common patterns of concurrency can be expressed using language-provided facilities. Second, the model can be implemented without requiring any complex user interfaces, preprocessing stages or language extensions. It also shows that the pattern-based approach has the potential to reduce the complexity of developing concurrent applications.
DOI: 10.1109/ICAPP.1997.651510. Published: 1997-12-10.
Citations: 1
A parallel rendering approach to the adaptive supersampling method
Sam Lin, Rynson W. H. Lau, X. Lin, P. Cheung
The original z-buffer method is a very efficient method for image generation. Its limitation is that it introduces aliasing into the output image. Although many methods have been proposed to address this problem, most of them require a large memory space, demand high computational power, or have some other limitations. Recently, we presented a simple anti-aliasing method based on the supersampling method. Instead of supersampling every pixel, we supersample edge pixels only. In this paper, we discuss various approaches for parallelizing the method and their effects on memory usage and performance.
DOI: 10.1109/ICAPP.1997.651518. Published: 1997-12-10.
Citations: 0
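The idea of supersampling only edge pixels, as described in the Lin, Lau, Lin and Cheung abstract above, can be sketched as a two-pass procedure. This is a simplified, sequential sketch under our own assumptions: the scene is a hypothetical `shade(x, y)` function instead of a z-buffer, an edge pixel is one whose single-sample value differs from a 4-neighbour, and edge pixels are resampled on a regular k-by-k grid. It is not the authors' implementation or their parallelization.

```python
def render_adaptive(shade, width, height, k=4):
    """Render with one sample per pixel, then supersample edge pixels only.

    Returns (image, edge_pixels) where image[y][x] is the final intensity.
    """
    # Pass 1: one sample at each pixel centre.
    base = [[shade(x + 0.5, y + 0.5) for x in range(width)] for y in range(height)]

    def is_edge(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and base[ny][nx] != base[y][x]:
                return True
        return False

    edges = [(x, y) for y in range(height) for x in range(width) if is_edge(x, y)]

    # Pass 2: average a k-by-k grid of sub-samples, but only for edge pixels.
    image = [row[:] for row in base]
    for x, y in edges:
        total = 0.0
        for sy in range(k):
            for sx in range(k):
                total += shade(x + (sx + 0.5) / k, y + (sy + 0.5) / k)
        image[y][x] = total / (k * k)
    return image, edges


if __name__ == "__main__":
    # Hypothetical scene: everything left of x = 2.25 is lit.
    image, edges = render_adaptive(lambda x, y: 1.0 if x < 2.25 else 0.0, 4, 1)
    print(image, edges)
```

Only the pixels flagged in the first pass pay the k*k sampling cost, so the extra work scales with the length of the edges in the image, not with its area.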
The economics of large-memory computations
C. Thomborson
We propose and justify an economic theory to guide memory system design, operation and analysis. Our theory treats memory random-access latency, and its cost per installed megabyte, as fundamentals. We introduce incentives in our economic theory, and side-constraints in our analytic model of hierarchical memory to ensure sufficient memory bandwidth and processor speed in any "well-formed" system of a given latency and size. We suggest, on the basis of our theory, that computer users should be charged a "rental" cost, proportional to their use of the total capacity in a hierarchical memory system. Finally, we use our theory to compare the cost/performance of various large-memory organisations such as PoPCs (piles of PCs), NOWs (networks of workstations), SMPs (shared memory multiprocessors), MPPs (massively parallel processors), and even Cray-class vector supercomputers.
DOI: 10.1109/ICAPP.1997.651524. Published: 1997-12-10.
Citations: 3