High performance prediction of stock returns with VG-RAM weightless neural networks
Alberto F. de Souza, Fábio Daros Freitas, Andre Gustavo Coelho de Almeida
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671832
This work presents a new weightless neural network-based time series predictor that uses a Virtual Generalized Random Access Memory (VG-RAM) weightless neural network to predict future stock returns. The new predictor was evaluated on predicting future weekly returns of 46 stocks from the Brazilian stock market. Our results showed that VG-RAM weightless neural network predictors can produce predictions of future stock returns with the same error levels and properties as baseline autoregressive neural network predictors, while running 5,000 times faster.
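The core mechanism of a VG-RAM weightless neuron — storing binary input patterns with their outputs at training time and answering queries by nearest Hamming distance, rather than adjusting weights — can be illustrated with a minimal sketch. The class and the toy binary encoding below are ours for illustration, not the authors' implementation:

```python
import numpy as np

class VGRAMNeuron:
    """Minimal VG-RAM neuron sketch: training stores (binary pattern, output)
    pairs in the neuron's memory; prediction returns the output of the stored
    pattern closest in Hamming distance to the query."""

    def __init__(self):
        self.patterns = []  # stored binary input patterns
        self.outputs = []   # outputs associated with each pattern

    def train(self, pattern, output):
        self.patterns.append(np.asarray(pattern, dtype=np.uint8))
        self.outputs.append(output)

    def predict(self, pattern):
        q = np.asarray(pattern, dtype=np.uint8)
        # Hamming distance to every stored pattern; answer with the nearest
        dists = [int(np.sum(p != q)) for p in self.patterns]
        return self.outputs[int(np.argmin(dists))]

# toy usage: hypothetical binary-encoded lagged returns -> next-step return
n = VGRAMNeuron()
n.train([1, 0, 1, 1], 0.02)
n.train([0, 0, 1, 0], -0.01)
n.predict([1, 1, 1, 1])  # nearest stored pattern is [1, 0, 1, 1]
```

Because prediction is a memory lookup plus distance comparisons, with no iterative weight training, this is what makes the approach so much faster than training an autoregressive network.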
Coherent global market simulations for counterparty credit risk
C. Albanese
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671842
Valuing and hedging counterparty credit risk involves analyzing large portfolios of netting sets over time horizons of decades. Theory dictates that the simulation measure should be coherent, i.e. arbitrage-free, and be used consistently for both simulation and valuation. This talk describes the mathematical formalism and the software architecture of a risk system that accomplishes this task while delivering a very rich set of three-dimensional risk metrics to the end user, including portfolio loss distributions and their sensitivities. The network communication bottleneck is bypassed by using capable accelerator boards. The memory bottleneck is overcome at the algorithmic level by adapting the mathematical framework to revolve around a handful of compute-bound algorithms.
Parallel implementation of Quantization methods for the valuation of swing options on GPGPU
G. Pagès, B. Wilbertz
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671811
The Quantization Tree algorithm has proven to be an efficient tool for the valuation of financial derivatives with non-vanilla exercise rights, such as American, Bermudan, or Swing options. Nevertheless, it relies heavily on fast computation of the transition probabilities in the underlying Quantization Tree. Since this estimation is typically done by Monte Carlo simulation, it is appealing to take advantage of the massively parallel computing capabilities of modern GPGPU devices. We present in this article a parallel CUDA implementation of the transition probability estimation for a Gaussian 2-factor model. Since we have to deal in this case with a huge amount of data and quite long Monte Carlo paths, the naive path-wise parallel implementation turned out not to be optimal. We therefore present a time-layer-wise parallelization that better exploits the parallel computing power of GPGPU devices by using faster memory structures.
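The quantity being estimated — transition probabilities between the Voronoi cells of two consecutive quantization layers — can be sketched on the CPU. The grids, dynamics, and one-dimensional setting below are a simplified illustration (the paper treats a Gaussian 2-factor model and runs the estimation in CUDA):

```python
import numpy as np

rng = np.random.default_rng(0)

# two consecutive quantization layers, each a small grid of centroids
# (hypothetical 1-D grids for illustration)
grid_k  = np.array([-1.0, 0.0, 1.0])        # layer t_k
grid_k1 = np.array([-1.5, -0.5, 0.5, 1.5])  # layer t_{k+1}
n_paths = 100_000

# simulate one transition step X_{k+1} = X_k + Gaussian increment;
# in the real algorithm X_k comes from the simulated model paths
x_k  = rng.choice(grid_k, size=n_paths)
x_k1 = x_k + rng.normal(0.0, 0.7, size=n_paths)

# nearest-centroid (Voronoi cell) assignment at both layers
i = np.argmin(np.abs(x_k[:, None]  - grid_k[None, :]),  axis=1)
j = np.argmin(np.abs(x_k1[:, None] - grid_k1[None, :]), axis=1)

# count observed cell-to-cell transitions and normalize rows
counts = np.zeros((grid_k.size, grid_k1.size))
np.add.at(counts, (i, j), 1.0)
probs = counts / counts.sum(axis=1, keepdims=True)  # each row sums to 1
```

The path-wise version parallelizes over the `n_paths` axis; the time-layer-wise scheme the authors advocate instead processes one pair of layers at a time, which lets the small centroid grids live in fast on-chip memory.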
Pricing structured equity products on GPUs
A. Bernemann, R. Schreyer, K. Spanderen
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671821
Pricing and risk analysis for today's structured equity products is increasingly demanding and time-consuming. GPUs offer the possibility of significantly increasing computing performance at reduced cost. We applied this technology to replace a large part of our CPU-based computing grid with hybrid GPU/CPU pricing engines. One GPU-based pricing engine with two Tesla C1060 cards replaced 140 CPU cores in Monte Carlo simulation of our productive structured equity portfolio under a local and stochastic volatility model.
Adding stream processing system flexibility to exploit low-overhead communication systems
P. Selo, Yoonho Park, S. Parekh, C. Venkatramani, Hari K. Pyla, F. Zheng
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671828
Previously, we demonstrated that we can build a real-world financial application using a stream processing system running on commodity hardware. In this paper, we propose making stream processing systems more flexible and demonstrate how this flexibility can be used to exploit low-overhead communication systems to speed up streaming applications. With our prototype, we now have an options market data processing system that achieves less than 30 µs average latency at 30× the February 2008 OPRA rate on a cluster of blades using InfiniBand. Using shared memory, the system achieves less than 20 µs average latency at 25× the February 2008 OPRA rate on a single machine.
Option pricing with the SABR model on the GPU
Yu Tian, Zili Zhu, F. Klebaner, K. Hamza
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671816
In this paper, we present our research on accelerating option pricing with Monte Carlo techniques on the GPU. We first introduce some basic ideas of GPU programming and then the stochastic volatility SABR model. Under the SABR model, we discuss option pricing with Monte Carlo techniques. In particular, we focus on European option pricing using quasi-Monte Carlo with the Brownian bridge method, and American option pricing using the least squares Monte Carlo method. Next, we describe a GPU-based program for pricing European options and a hybrid CPU-GPU program for pricing American options. Finally, we implement our GPU programs and compare their performance with their CPU counterparts. Our numerical results show that GPU computing achieves around a 100× speedup in European option pricing and a 10× speedup in American option pricing while maintaining satisfactory pricing accuracy.
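The Monte Carlo scheme being accelerated can be sketched on the CPU for a European call under SABR dynamics (dF = αF^β dW₁, dα = ναdW₂, with correlation ρ). This is a plain-Euler illustration with parameter names of our choosing, not the authors' quasi-Monte Carlo GPU code:

```python
import numpy as np

def sabr_call_mc(F0, alpha0, beta, nu, rho, K, T,
                 n_steps=200, n_paths=200_000, seed=0):
    """European call under SABR by Euler Monte Carlo (CPU sketch).
    The forward is absorbed at zero; the volatility factor uses its exact
    lognormal step since it has no drift under these dynamics."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    F = np.full(n_paths, F0)
    a = np.full(n_paths, alpha0)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        # correlate the two Brownian drivers
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        F = np.maximum(F + a * np.maximum(F, 0.0)**beta * np.sqrt(dt) * z1, 0.0)
        a = a * np.exp(nu * np.sqrt(dt) * z2 - 0.5 * nu**2 * dt)
    return float(np.mean(np.maximum(F - K, 0.0)))

price = sabr_call_mc(F0=100.0, alpha0=0.3, beta=1.0, nu=0.4, rho=-0.5,
                     K=100.0, T=1.0)
```

Each path is independent, which is why the method maps so naturally onto one-thread-per-path GPU execution; the Brownian bridge construction in the paper additionally improves the uniformity of quasi-random path generation.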
CUDA implementation of barrier option valuation with jump-diffusion process and Brownian bridge
Dariusz K Murakowski, W. Brouwer, V. Natoli
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671827
High performance computing on graphics processors (GPUs) has produced excellent results in a wide array of disciplines. Compute-bound problems benefit from the massive parallelism, and memory-bound problems benefit from higher bandwidth and the ability to hide latency. In this work we apply GPU computing to a non-trivial option valuation problem to demonstrate its efficacy on problems with real-world significance. We focus on barrier options modeled with an underlying jump-diffusion process, incorporating a Brownian bridge to account for barrier crossings between jumps. Exotic path-dependent options such as these often lack a closed-form solution, and numerical methods must be used in their pricing. The commonly used Monte Carlo methods simulate the price trajectory along many independent paths, an approach that maps well to the GPU thread concept. We present the results of our CPU and GPU implementations, comparing performance and providing details on both.
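The Brownian bridge idea is that, between two simulated points both below an upper barrier b (in log-price), the continuous path still crossed with probability exp(−2(b−x₀)(b−x₁)/(σ²Δt)), so discrete monitoring bias can be corrected without finer time steps. The sketch below is a simplified diffusion-only version under plain GBM (the paper overlays a jump process and applies the bridge between jumps); function and parameter names are ours:

```python
import numpy as np

def up_and_out_call_mc(S0, K, B, r, sigma, T,
                       n_steps=50, n_paths=200_000, seed=1):
    """Up-and-out call by Monte Carlo with a Brownian-bridge correction
    for barrier crossings between monitoring dates (GBM-only sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    b = np.log(B)
    x = np.full(n_paths, np.log(S0))
    survive = np.ones(n_paths)  # running survival probability per path
    for _ in range(n_steps):
        x_new = (x + (r - 0.5 * sigma**2) * dt
                 + sigma * np.sqrt(dt) * rng.standard_normal(n_paths))
        # knocked out outright if either endpoint is at/above the barrier
        alive = (x < b) & (x_new < b)
        # otherwise the bridge may still have crossed in between
        p_cross = np.exp(-2.0 * (b - x) * (b - x_new) / (sigma**2 * dt))
        survive *= np.where(alive, 1.0 - p_cross, 0.0)
        x = x_new
    payoff = survive * np.maximum(np.exp(x) - K, 0.0)
    return float(np.exp(-r * T) * payoff.mean())

price = up_and_out_call_mc(S0=100.0, K=100.0, B=130.0,
                           r=0.05, sigma=0.2, T=0.5)
```

As with the SABR example, paths are fully independent, so the per-path loop is what the CUDA kernel parallelizes.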
Accelerating the computation of portfolios of tranched credit derivatives
Stephen Weston, Jean-Tristan Marin, James Spooner, O. Pell, O. Mencer
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671822
Huge growth in the trading volume and complexity of credit derivative instruments over the past five years has driven the need for ever more computationally demanding mathematical models. This has led to massive growth in data center compute capacity, power, and cooling requirements. We report the results of an ongoing joint project between J.P. Morgan and specialist acceleration solutions provider Maxeler Technologies to improve the price-performance of calculating the value and risk of a large, complex credit derivatives portfolio. Our results show that valuing tranches of Collateralized Debt Obligations (CDOs) on Maxeler accelerated systems is over 30 times faster per cubic foot and per Watt than solutions using standard multi-core Intel Xeon processors. We also report some preliminary results of further work extending the approach to classes of interest rate derivatives.
Pricing multi-asset American options on Graphics Processing Units using a PDE approach
D. Dang, C. Christara, K. Jackson
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671831
We develop highly efficient parallel pricing methods on Graphics Processing Units (GPUs) for multi-asset American options via a Partial Differential Equation (PDE) approach. The linear complementarity problem arising from the free boundary is handled by a penalty method. Finite difference methods on uniform grids are used for the space discretization of the PDE, while classical schemes, such as Crank-Nicolson, are used for the time discretization. The discrete nonlinear penalized equations at each timestep are solved using a penalty iteration. A GPU-based parallel Alternating Direction Implicit Approximate Factorization technique is employed to solve the linear algebraic system arising from each penalty iteration. We demonstrate the efficiency and accuracy of the parallel numerical methods by pricing American options written on three assets.
Opportunities for shared memory parallelism in financial modeling
AJ Lindeman
Pub Date: 2010-12-20 · DOI: 10.1109/WHPCF.2010.5671826
Although much has been written about the "multi-core discontinuity" and its impact on mathematical software (see, for example, [KD, LM]), the full benefits to quantitative finance have yet to be realized. The purpose of this paper is to highlight the numerical structure of some common fixed income modeling problems, with the aim of demonstrating how shared-memory parallelism may be brought to bear on improving performance, ultimately allowing us to calibrate larger and more complete models fast enough to be useful in market making and risk management.