Parallelizing loops in database programming languages

Proceedings 14th International Conference on Data Engineering. Pub Date: 1998-02-23. DOI: 10.1109/ICDE.1998.655762

D. Lieuwen

Citations: 4

Abstract

Database programming languages (DBPLs), fourth generation languages (4GLs), and embedded SQL all include the ability to iterate sequentially through a set/relation. Nested iterators can be used to express joins. Without program analysis, such joins must be evaluated using a tuple-at-a-time join algorithm at a central site; otherwise, program semantics may be violated. This paper's analysis often allows parallel join algorithms to be used instead. In addition, this paper's compile-time optimizations can produce better parallel code than a straightforward parallelization of the nested iterators. The transformations allow the compiler to identify parallelization opportunities that it could not detect in the original code. These techniques are important for aiding the migration from hand-optimized code on a sequential machine to system-optimized code on a parallel machine. Without such rewrites, porting a legacy system to a parallel machine may produce only meager performance improvements.
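To make the contrast in the abstract concrete, the following is a minimal sketch in Python, not the paper's own notation or algorithm. It writes a join as nested sequential iterators, then evaluates the same join with a hash-partitioned strategy whose partitions run concurrently. The relations `EMPLOYEES` and `DEPARTMENTS`, the function names, and the use of a thread pool are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample relations (illustrative only, not from the paper).
EMPLOYEES = [("ann", 10), ("bob", 20), ("cy", 10), ("dee", 30)]
DEPARTMENTS = [(10, "sales"), (20, "ops"), (40, "legal")]


def nested_iterator_join(emps, depts):
    """Join expressed as nested sequential iterators (tuple-at-a-time)."""
    out = []
    for name, dept_id in emps:
        for d_id, d_name in depts:
            if dept_id == d_id:
                out.append((name, d_name))
    return out


def _join_partition(part):
    """Hash join within one partition: build on depts, probe with emps."""
    emps, depts = part
    index = defaultdict(list)
    for d_id, d_name in depts:
        index[d_id].append(d_name)
    return [(name, d_name)
            for name, dept_id in emps
            for d_name in index.get(dept_id, ())]


def partitioned_hash_join(emps, depts, workers=2):
    """Partition both relations on the join key, then join partitions concurrently."""
    parts = [([], []) for _ in range(workers)]
    for e in emps:
        parts[hash(e[1]) % workers][0].append(e)
    for d in depts:
        parts[hash(d[0]) % workers][1].append(d)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(_join_partition, parts)
    return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    sequential = nested_iterator_join(EMPLOYEES, DEPARTMENTS)
    parallel = partitioned_hash_join(EMPLOYEES, DEPARTMENTS)
    # Same set of result tuples; the order may differ between strategies.
    print(sorted(sequential) == sorted(parallel))
```

The two functions return the same set of tuples, but not necessarily in the same order. That difference is why such a rewrite is safe only when the loop body does not depend on iteration order or on side effects between iterations, which is the kind of property a program analysis would need to establish before substituting a parallel join.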



