A Big Data Approach to Analyzing Market Volatility

Algorithmic Finance · IF 0.3 · JCR Q4 (Business, Finance) · Pub Date: 2013-06-05 · DOI: 10.2139/ssrn.2274991

Kesheng Wu, E. Bethel, Ming Gu, D. Leinweber, O. Rübel


Citations: 35

Abstract

Understanding the microstructure of the financial market requires processing a vast amount of data on individual trades, and sometimes on multiple levels of quotes as well. This requires computing resources that are not easily available to financial academics and regulators. Fortunately, data-intensive scientific research has developed a series of tools and techniques for working with large amounts of data. In this work, we demonstrate that these techniques are effective for market data analysis by computing an early warning indicator called Volume-Synchronized Probability of Informed Trading (VPIN) on a massive set of futures trading records. The test data contains five and a half years' worth of trading data for about 100 of the most liquid futures contracts, includes about 3 billion trades, and takes 140 GB as text files. By (1) using a more efficient file format for storing the trading records, (2) using more effective data structures and algorithms, and (3) parallelizing the computations, we are able to explore 16,000 different parameter combinations for computing VPIN in less than 20 hours on a 32-core IBM DataPlex machine. On average, computing VPIN of one futures contract over 5.5 years takes around 1.5 seconds on one core, which demonstrates that a modest computer is sufficient to monitor a vast number of trading activities in real time, an ability that could be valuable to regulators. By examining a large number of parameter combinations, we are also able to identify parameter settings that improve the prediction accuracy from 80% to 93%.
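
The abstract names VPIN but does not give its formula. As a rough illustration only, the sketch below follows the commonly published definition: trades are grouped into equal-volume buckets, each bucket's volume is split into buy and sell parts, and VPIN is the average absolute buy/sell imbalance over a trailing window of buckets, divided by the bucket size. The function names, the normal-CDF classification rule, and the parameters `bucket_size` and `window` are assumptions made for this example; this is not the authors' optimized implementation.

```python
import math
from statistics import pstdev


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def vpin(prices, volumes, bucket_size, window):
    """Sketch of VPIN over a series of bars (hypothetical helper).

    prices      -- closing price of each bar
    volumes     -- traded volume of each bar (assumed integer contract counts)
    bucket_size -- volume per bucket (assumed parameter)
    window      -- number of buckets averaged per VPIN value (assumed parameter)
    """
    # Price change per bar; the first bar has no predecessor.
    changes = [0.0] + [b - a for a, b in zip(prices, prices[1:])]
    sigma = pstdev(changes) or 1.0  # avoid dividing by zero on flat prices

    imbalances = []  # |buy volume - sell volume| of each completed bucket
    buy = sell = 0.0
    filled = 0
    for change, volume in zip(changes, volumes):
        # Assumed classification rule: the buy share of a bar's volume grows
        # with the standardized price change.
        buy_fraction = normal_cdf(change / sigma)
        remaining = volume
        while remaining > 0:
            # A bar that overflows the current bucket is split across buckets.
            take = min(remaining, bucket_size - filled)
            buy += take * buy_fraction
            sell += take * (1.0 - buy_fraction)
            filled += take
            remaining -= take
            if filled >= bucket_size:
                imbalances.append(abs(buy - sell))
                buy = sell = 0.0
                filled = 0

    # One VPIN value per complete trailing window of buckets.
    return [
        sum(imbalances[i - window:i]) / (window * bucket_size)
        for i in range(window, len(imbalances) + 1)
    ]


if __name__ == "__main__":
    demo_prices = [100, 101, 100, 102, 101, 103, 102, 104, 103]
    demo_volumes = [10] * len(demo_prices)
    print(vpin(demo_prices, demo_volumes, bucket_size=20, window=2))
```

Each call is independent of every other, so a sweep over many parameter combinations and contracts, as the abstract describes, can be distributed across cores without coordination between tasks.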


Source journal: Algorithmic Finance (Business, Finance)

CiteScore: 0.40

Self-citation rate: 0.00%

About the journal: Algorithmic Finance is both a nascent field of study and a new high-quality academic research journal that seeks to bridge computer science and finance. It covers such applications as: high-frequency and algorithmic trading; statistical arbitrage strategies; momentum and other algorithmic portfolio management; machine learning and computational financial intelligence; agent-based finance; complexity and market efficiency; algorithmic analysis of derivatives valuation; behavioral finance and investor heuristics and algorithms; applications of quantum computation to finance; and news analytics and automated textual analysis.