Hybrid Parallelism for XML SAX Parsing

Yinfei Pan, Y. Zhang, K. Chiu
DOI: 10.1109/ICWS.2008.107 (https://doi.org/10.1109/ICWS.2008.107)
Published in: 2008 IEEE International Conference on Web Services, 2008-09-23
Citations: 25

Abstract

XML has been adopted across a wide spectrum of applications. Its parsing efficiency, however, remains a concern and can be a bottleneck. At the same time, with the trend towards multicore CPUs, parallelization to improve performance has become increasingly relevant. In previous work, we investigated parallelizing DOM-style parsing and gained significant speedup. For streaming XML applications, however, SAX-style parsing is often required. In this paper, we present a technique and implementation of a parallel XML SAX parser. To handle the inherent data dependencies in XML while still allowing reasonable scalability, we use a 4-stage software pipeline that combines strictly sequential stages with stages that can be further data-parallelized within the stage. We thus utilize a hybrid of pipelined parallelism and data parallelism. To demonstrate effectiveness, we test this approach on a Linux machine with two Intel Xeon L5320 CPUs, for a total of 8 physical cores, and obtain good speedup up to about 8 cores.
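
The hybrid structure described in the abstract can be illustrated with a small sketch. The abstract does not name the four stages, so the stage split below (chunking, tokenizing, nesting check, event dispatch) is an assumption made only for illustration, as are all function names. It is not the authors' implementation. It also handles only a simplified XML subset (no comments, CDATA, processing instructions, or angle brackets inside attribute values). In CPython the worker threads show the shape of the pipeline rather than a real speedup, because of the global interpreter lock.

```python
import queue
import re
import threading
from concurrent.futures import ThreadPoolExecutor

# Matches a start/end/empty tag, or a run of character data (simplified subset).
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>|([^<]+)")
_DONE = object()  # sentinel that marks the end of a queue


def split_chunks(text, size):
    """Stage 1 (sequential): cut the input just before a '<' so no tag is split."""
    start = 0
    while start < len(text):
        end = text.find("<", start + size)
        if end == -1:
            end = len(text)
        yield text[start:end]
        start = end


def tokenize(chunk):
    """Stage 2 (data-parallel): turn one chunk into SAX-like events."""
    events = []
    for match in TOKEN.finditer(chunk):
        closing, name, empty, chars = match.groups()
        if chars is not None:
            if chars.strip():
                events.append(("characters", chars))
        elif closing:
            events.append(("endElement", name))
        else:
            events.append(("startElement", name))
            if empty:
                events.append(("endElement", name))
    return events


def parse(text, handler, chunk_size=64, workers=4):
    """Run the hypothetical 4-stage pipeline and call handler(kind, value)."""
    futures_q, events_q = queue.Queue(), queue.Queue()
    errors = []

    def check_nesting():
        # Stage 3 (sequential): consume chunk results in document order and
        # verify tag nesting, which depends on everything seen so far.
        stack = []
        try:
            while (future := futures_q.get()) is not _DONE:
                for kind, value in future.result():
                    if kind == "startElement":
                        stack.append(value)
                    elif kind == "endElement":
                        if not stack or stack.pop() != value:
                            raise ValueError(f"unexpected </{value}>")
                    events_q.put((kind, value))
            if stack:
                raise ValueError(f"unclosed <{stack[-1]}>")
        except ValueError as exc:
            errors.append(exc)
        finally:
            events_q.put(_DONE)

    def dispatch():
        # Stage 4 (sequential): deliver events to the application callback.
        while (event := events_q.get()) is not _DONE:
            handler(*event)

    stage3 = threading.Thread(target=check_nesting)
    stage4 = threading.Thread(target=dispatch)
    stage3.start()
    stage4.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in split_chunks(text, chunk_size):
            futures_q.put(pool.submit(tokenize, chunk))  # keeps document order
        futures_q.put(_DONE)
    stage3.join()
    stage4.join()
    if errors:
        raise errors[0]
```

Calling `parse(document, callback)` delivers `startElement`, `endElement`, and `characters` events in document order. Only the tokenizing stage fans out across workers; the nesting check stays sequential because each end tag must be matched against all earlier start tags, which is the kind of data dependency the abstract refers to.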