
Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing: latest articles

A fast parallel sorting algorithm on the k-dimensional reconfigurable mesh
Ju-wook Jang, Kichul Kim
We present a new parallel sorting algorithm on the k-dimensional reconfigurable mesh, a generalization of the well-studied two-dimensional reconfigurable mesh. We introduce a new mapping technique that combines the enlarged bandwidth of the multidimensional mesh with the features of the reconfigurable mesh. Using this mapping technique, we show that $N^k$ numbers can be sorted in $O(4^k)$ time (constant time for small k) on a (k+1)-dimensional reconfigurable mesh of size $N \times N \times \cdots \times N$ (k+1 factors). In addition, it is shown that the number of 1's in a 0/1 array of size $N \times N \times \cdots \times N$ (k factors) can be computed in $O(\log^* N + \log k)$ time on a reconfigurable mesh of size $N \times N \times \cdots \times N$ (k factors).
DOI: https://doi.org/10.1109/ICAPP.1997.651519 | Published: 1997-12-10
Citations: 2
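The O(log* N + log k) counting bound in the sorting abstract above depends on the iterated logarithm, which grows extremely slowly. The following is a minimal sketch, assuming base-2 logarithms (the abstract does not state the base); the helper name `log_star` is hypothetical and is not taken from the paper.

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm (assumed base 2): how many times log2 must be
    applied to n before the result drops to 1 or below."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


# Even a mesh side of N = 65536 contributes only a small additive term.
for side in (2, 16, 65536):
    print(side, log_star(side))
```

For N = 65536 the chain is 65536, 16, 4, 2, 1, so the helper returns 4.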
Parallelization of the H.261 video coding algorithm on the IBM SP2(R) multiprocessor system
N. Yung, K. Leung
This paper describes the parallelization of the H.261 video coding algorithm on the IBM SP2 multiprocessor system. Using domain decomposition as a framework, data partitioning, data dependencies and communication issues are carefully assessed. From these, two parallel algorithms were developed: the first maximizes processor utilization and the second minimizes communication. Our analysis shows that the first algorithm exhibits poor scalability and high communication overhead, whereas the second exhibits good scalability and low communication overhead. A best median speedup of 13.72, or 11 frames/sec, was achieved on 24 processors.
DOI: https://doi.org/10.1109/ICAPP.1997.651523 | Published: 1997-12-10
Citations: 9
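The speedup reported in the H.261 abstract above can be turned into a parallel efficiency figure. This is a small worked calculation, not a result from the paper; it assumes that the 11 frames/sec figure is the throughput measured at the quoted speedup of 13.72, which the abstract implies but does not state outright. The function name is hypothetical.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Efficiency = speedup divided by the number of processors."""
    return speedup / processors


speedup = 13.72   # best median speedup quoted in the abstract
processors = 24   # processor count quoted in the abstract
parallel_fps = 11.0  # assumed to be the frame rate at that speedup

efficiency = parallel_efficiency(speedup, processors)
# Single-processor frame rate implied by the two quoted figures (assumption).
implied_serial_fps = parallel_fps / speedup

print(f"efficiency: {efficiency:.3f}")
print(f"implied serial rate: {implied_serial_fps:.2f} frames/sec")
```

Under these assumptions each processor is doing useful work a little over half the time, and the single-processor rate would be below one frame per second.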
Parallelization of IP-packet filter rules
Takeshi Miei, M. Maruyama, T. Ogura, N. Takahashi
A compiler for parallelizing IP-packet filter rules is presented, which will improve network security and reduce packet-forwarding performance degradation. It analyzes the interdependence of packet-filtering rules specified by a network administrator and translates them into an intermediate program whose instructions can be executed in parallel. Three types of compiler operations are introduced: division, which divides the rules into parallel expressions; simplification, which simplifies redundant rules; and deletion, which deletes infeasible rules.
DOI: https://doi.org/10.1109/ICAPP.1997.651506 | Published: 1997-12-10
Citations: 3
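The packet-filter abstract above mentions a deletion operation that removes infeasible rules. The abstract does not define "infeasible", so the sketch below is only one plausible reading: under first-match semantics, a rule whose match range is entirely covered by a single earlier rule can never fire. The `Rule` type, the single port-range field and the function names are all hypothetical simplifications, not the paper's intermediate representation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    lo: int      # lowest matching port (hypothetical single-field rule)
    hi: int      # highest matching port, inclusive
    action: str  # e.g. "allow" or "deny"


def covers(earlier: Rule, later: Rule) -> bool:
    """True if every packet matched by `later` is also matched by `earlier`."""
    return earlier.lo <= later.lo and later.hi <= earlier.hi


def delete_shadowed(rules: List[Rule]) -> List[Rule]:
    """Drop rules fully covered by one earlier rule (first-match semantics).

    This does not detect rules covered only by a union of earlier rules.
    """
    kept: List[Rule] = []
    for rule in rules:
        if not any(covers(prev, rule) for prev in kept):
            kept.append(rule)
    return kept


rules = [
    Rule(0, 1023, "deny"),
    Rule(80, 80, "allow"),      # never reached: ports 0-1023 already denied
    Rule(2000, 3000, "allow"),
]
kept = delete_shadowed(rules)
```

Here the second rule is removed because the first already matches port 80; the other two rules are kept in their original order.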
A fibre channel-based architecture for Internet multimedia server clusters
Shenze Chen, M. Thapar
In this paper, we present a cluster architecture for Internet multimedia servers, which uses the Fibre Channel (FC) technology to overcome some of the shortcomings of existing architectures. We also explore the design issues of an FC-based multimedia server cluster. A significant advantage of the FC-based cluster is that it allows physical storage attachment to the interconnect. Because of this feature, FC-based clusters will change the fundamental data-sharing paradigm of existing clusters by eliminating remote data accesses in a cluster. Many aspects of this architecture are critical to real-time multimedia applications, such as audio and video services.
DOI: https://doi.org/10.1109/ICAPP.1997.651512 | Published: 1997-12-10
Citations: 12
An enhanced 2D buddy strategy for submesh allocation in mesh networks
T. Juang, Y. Tseng, Yuh-Shyan Chen
Efficient allocation plays an important role in partitionable multiprocessor systems. It is critical to the performance of parallel computers, especially large-scale ones. In this paper, we propose a new enhanced two-dimensional buddy system (E2DBS) strategy that overcomes the drawbacks of the previous two-dimensional buddy system (2DBS) strategy: for example, four non-buddy submeshes can be allocated, and neither the requesting tasks nor the system need be square. In E2DBS, we propose an adaptive data structure, called the free sub-mesh matrix (FSM), to maintain the free submeshes, so that processors can be allocated and deallocated easily. Simulation results indicate that our strategy outperforms the previous ones, i.e. the 2DBS strategy and the best-fit strategy, in terms of system processor utilization and average waiting time under various system loads, for rectangular task requests whose side lengths are powers of 2.
DOI: https://doi.org/10.1109/ICAPP.1997.651503 | Published: 1997-12-10
Citations: 1
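To illustrate the square-only restriction that the buddy-strategy abstract above lists as a drawback of plain 2DBS, here is a minimal sketch of the classic square buddy scheme as it is commonly described: a request is rounded up to a square whose side is a power of two, and a free square is split into four equal "buddy" quadrants. This is not the E2DBS algorithm or its free sub-mesh matrix; the function names and the `(x, y, side)` tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

Submesh = Tuple[int, int, int]  # (x origin, y origin, side length)


def buddy_side(width: int, height: int) -> int:
    """Smallest power-of-two side that fits a width x height request."""
    return 1 << (max(width, height) - 1).bit_length()


def split_into_buddies(x: int, y: int, side: int) -> List[Submesh]:
    """Split a square submesh into its four equal quadrants (buddies)."""
    half = side // 2
    return [
        (x, y, half),
        (x + half, y, half),
        (x, y + half, half),
        (x + half, y + half, half),
    ]


side = buddy_side(3, 5)   # a 3 x 5 request must take an 8 x 8 square
wasted = side * side - 3 * 5
quadrants = split_into_buddies(0, 0, 8)
```

A 3 x 5 request needs only 15 processors but occupies an 8 x 8 square of 64 under this scheme, which is the kind of internal fragmentation that allowing non-square requests is meant to reduce.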
An FPGA-based on-line neural system in photon counting intensified imagers for space applications
M. Alderighi, S. D'Angelo, G. Sechi, F. d'Ovidio
A computational system based on a synchronous feedback neural network for the online event processing of a photon counting intensified CCD detector is presented. The hardware prototype, implemented by means of FPGA technology, consists of 5×5 and is able to distinguish photon events from spurious and/or noise events. It shows a high level of flexibility, which is essential in the characterization phase of the detector. It allows different kinds of neurons, with different output functions and internal architectures, to be implemented, and it can run actual as well as virtual networks of neurons.
DOI: https://doi.org/10.1109/ICAPP.1997.651530 | Published: 1997-12-10
Citations: 1
HiPAR-DSP: a parallel VLIW RISC processor for real time image processing applications
J. Wittenburg, M. Ohmacht, J. Kneip, W. Hinrichs, P. Pirsch
A parallel RISC architecture has been developed, derived from a thorough analysis of the properties of a wide class of image processing algorithms. The architecture gains performance from data-level parallelism as well as from instruction-level parallelism. From the beginning of the concept phase, high-level programming capabilities have been one of the major design goals. Thus, there has been a steady interaction between the design of the software development toolkit (optimizing assembler and C++ compiler) and the architecture itself. The RISC-typical register files are among the most critical elements, affecting die size and clock frequency as well as the assembler's ability to perform VLIW scheduling. Running at 100 MHz (200 mm², 0.35 µm CMOS), the processor reaches a sustained performance of more than 2 GOPS for a wide range of image processing algorithms.
DOI: https://doi.org/10.1109/ICAPP.1997.651487 | Published: 1997-12-10
Citations: 12
Virtual parallel processors
C. Dick, F. Harris
The introduction of SRAM-based field programmable gate arrays (FPGAs) has opened up a new dimension to parallel computing architectures. This paper describes an alternative approach to parallel computing: reconfigurable, or virtual, parallel processing (VPP). Rather than mapping an application onto a given parallel machine, the VPP approach synthesizes the type and number of processing elements, as well as the interconnection topology, that are optimal for the application. For each application, configuration data that personalizes the hardware for the task at hand is downloaded to the machine. The paper provides a brief description of the authors' reconfigurable computer, Archimedes. The benefits of the VPP approach are highlighted by an example application, the 2-D FFT. A novel parallel implementation of a polynomial-transform-based 2-D transform is described and compared to results for distributed memory parallel machines that have been reported in the literature. The comparison highlights the computational advantage provided by reconfigurable computing.
DOI: https://doi.org/10.1109/ICAPP.1997.651485 | Published: 1997-12-10
Citations: 1
Shadow Stacks: a hardware-supported DSM for objects of any granularity
S. Groh, M. Pizka, J. Rudolph
This paper presents a new Distributed Shared Memory (DSM) management concept that is integrated into a scalable distributed virtual memory management technique and circumvents false sharing while still preserving simplicity at the application level. Objects defined as usual by variables in the declaration part of functions are made sharable among threads executing in the distributed environment. These objects, of varying granularity and with different consistency requirements, are managed separately to avoid false sharing. Consistency is enforced at runtime by a distributed manager-agent architecture that supports automatic and dynamic selection of an adequate coherence protocol per object. For efficiency, the implementation of the Shadow Stacks concept exploits the page fault mechanism provided by off-the-shelf hardware.
DOI: https://doi.org/10.1109/ICAPP.1997.651493 | Published: 1997-12-10
Citations: 9
Artificial neural architecture for real time modelling applications
E. Petriu, A. Guergachi, G. Patry, L. Zhao, D. Petriu, G. Vukovich
This paper presents the random-pulse machine concept and shows how it can be used for the modular design of artificial neural networks. Random-pulse machines deal with analog variables represented by the mean rate of random-pulse streams and use simple digital technology to perform arithmetic and logic operations. As an application example, a neural network (NN) is proposed for modeling activated sludge wastewater treatment plants.
DOI: https://doi.org/10.1109/ICAPP.1997.651529 | Published: 1997-12-10
Citations: 0
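The random-pulse abstract above describes analog values carried as the mean rate of a random pulse stream and processed with simple digital logic. The sketch below illustrates that general idea with the standard unipolar stochastic-computing encoding, in which a value in [0, 1] is the probability of a pulse and a single AND gate multiplies two independent streams. Whether the paper uses this exact encoding is an assumption; the function names, stream length and seed are hypothetical.

```python
import random
from typing import List


def encode(value: float, length: int, rng: random.Random) -> List[int]:
    """Random pulse stream whose mean pulse rate approximates `value` in [0, 1]."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(stream: List[int]) -> float:
    """Recover the represented value as the mean pulse rate."""
    return sum(stream) / len(stream)


def multiply(a: List[int], b: List[int]) -> List[int]:
    """Bitwise AND of two independent streams multiplies their mean rates."""
    return [x & y for x, y in zip(a, b)]


rng = random.Random(0)        # fixed seed so the sketch is repeatable
length = 100_000              # longer streams give lower estimation error
a = encode(0.6, length, rng)
b = encode(0.5, length, rng)
product_estimate = decode(multiply(a, b))
print(f"estimated 0.6 * 0.5 = {product_estimate:.3f}")
```

The estimate converges on 0.3 as the stream length grows, which shows the trade this representation makes: very cheap logic per operation in exchange for precision that depends on how long the streams are observed.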