A Practitioner’s Guide to the Optimal Number of Clusters Algorithm
M. Andrews
The Journal of Financial Data Science, published 2023-07-21. https://doi.org/10.3905/jfds.2023.1.133
Identifying profitable investment strategies has been a long-standing challenge for finance practitioners. The optimal number of clusters (ONC) algorithm is a reliable tool used to evaluate backtest results affected by multiple testing. The algorithm is necessary to calculate the deflated Sharpe ratio, a popular metric that detects potential false positive investment strategies. These methods are based on the familywise error rate approach, which provides stringent control over the overall error rate, reducing the likelihood of false discoveries and increasing the reliability of findings. The ONC algorithm’s time complexity, however, poses a significant challenge for practitioners. This study proposes a practical solution to reduce the number of clusters tested by the ONC algorithm while maintaining accuracy. Results from simulated datasets demonstrate that the proposed solution significantly reduces the algorithm’s runtime. Additionally, this study addresses the impact of outliers on the ONC algorithm, showing that they can lead to nonoptimal solutions, and provides a simple solution to mitigate their effects. These findings contribute to the literature on finance by enhancing the usability of the ONC algorithm.
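The abstract does not say how the cluster count enters the deflated Sharpe ratio. As a rough sketch, assuming the commonly used approximation for the expected maximum Sharpe ratio among N unskilled, independent trials (the formula, the function name `expected_max_sharpe`, and the inputs are assumptions here, not taken from the article), the number of clusters plays the role of N and sets the hurdle a backtest has to clear:

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Approximate expected maximum Sharpe ratio across n_trials
    independent trials that all have zero true Sharpe ratio.

    n_trials is assumed to be the effective number of independent
    trials, e.g. a cluster count; sharpe_variance is the variance of
    the Sharpe ratio estimates across those trials.
    """
    if n_trials < 2:
        raise ValueError("need at least two trials")
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return math.sqrt(sharpe_variance) * (
        (1.0 - EULER_GAMMA) * z(1.0 - 1.0 / n_trials)
        + EULER_GAMMA * z(1.0 - 1.0 / (n_trials * math.e))
    )


# The hurdle grows with the number of effective trials.
for n in (2, 10, 100, 1000):
    print(n, round(expected_max_sharpe(n, sharpe_variance=1.0), 3))
```

Under this approximation the hurdle rises with N, so an over- or under-estimated cluster count changes the benchmark against which a strategy is judged.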
Kernel Market Impact Analysis in China A-Share Markets
Hongsong Chou, Jimin Han, Charles Huang, Danny D. Sun
The Journal of Financial Data Science, published 2023-06-18. https://doi.org/10.3905/jfds.2023.1.131
With transaction-level market data for stocks in China A-share markets, the authors construct individual stocks’ kernel functions of market impact and analyze their statistical properties. Attribution analysis of these kernel functions is also performed to understand how market microstructure variables such as bid–ask spread and liquidity distribution in order books can be used to classify different groups of kernel functions. The authors’ analysis shows that stocks in China A-share markets exhibit clear patterns of market impact curves, likely because of specific market structure regulations, such as a constant tick size across stocks, and stock-specific order book dynamics resulting from market participants’ behavior. The authors also explore the application of kernel functions in forecasting price movement in close-to-reality trading simulators that consider market impact costs at the individual trade level.
Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies
Wee Ling Tan, Stephen Roberts, Stefan Zohren
The Journal of Financial Data Science, published 2023-06-11. https://doi.org/10.3905/jfds.2023.1.130
The authors introduce spatio-temporal momentum strategies, a class of models that unify time-series and cross-sectional momentum strategies by trading assets based on their cross-sectional momentum features over time. Although both time-series and cross-sectional momentum strategies are designed to systematically capture momentum risk premiums, they are regarded as distinct implementations and do not consider the concurrent relationship and predictability between temporal and cross-sectional momentum features of different assets. The authors model spatio-temporal momentum with neural networks of varying complexity and demonstrate that a simple neural network with only a single fully connected layer learns to simultaneously generate trading signals for all assets in a portfolio by incorporating both their time-series and cross-sectional momentum features. Backtesting on portfolios of 46 actively traded US equities and 12 equity index futures contracts, they demonstrate that the model retains its performance over benchmarks in the presence of high transaction costs of up to 5–10 basis points. In particular, they find that the model, when coupled with least absolute shrinkage and turnover regularization, delivers the best performance across various transaction cost scenarios.
The Impact of Technology, Big Data, and Analytics: The Evolving Data-Driven Model of Innovation in the Finance Industry
R. Malhotra, D. Malhotra
The Journal of Financial Data Science, published 2023-06-10. https://doi.org/10.3905/jfds.2023.1.129
The rise of digital technologies in the 21st century has brought about a profound transformation in the way people live their lives, and this shift is having a significant impact on the global economy. As a result, businesses in all sectors are being forced to reevaluate their operations and decision-making processes, including the financial sector. This study investigates the different factors that are compelling financial institutions to rethink their approaches to doing business. The study also presents the emerging data-centric and analytics-driven business model of the finance sector, which is necessary to adapt, survive, and compete in today’s dynamic and highly digitized global market.
Using Machine Learning to Forecast Market Direction with Efficient Frontier Coefficients
Nolan Alexander, William Scherer
The Journal of Financial Data Science, published 2023-06-09. https://doi.org/10.3905/jfds.2023.1.128
The authors propose a novel method to improve estimation of asset returns for portfolio optimization. This approach first performs a monthly directional market forecast using an online decision tree. The decision tree is trained on a novel set of features engineered from portfolio theory: the efficient frontier functional coefficients. Efficient frontiers can be decomposed into their functional form, a square-root second-order polynomial, and the coefficients of this function capture the information of all the constituents that compose the market in the current time period. To make these directional forecasts actionable, they are integrated into a portfolio optimization framework using expected returns conditional on the market forecast as an estimate for the return vector. This conditional expectation is calculated using the inverse Mills ratio, and the capital asset pricing model is used to translate the market forecast to individual asset forecasts. This novel method outperforms baseline portfolios, as well as other feature sets including technical indicators and the Fama–French factors. To empirically validate the proposed model, the authors employ a set of market sector exchange-traded funds.
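The two formulas mentioned in the abstract above can be illustrated with a small sketch. It assumes the textbook minimum-variance frontier (full investment, no short-sale constraint) with uncorrelated assets, and a normally distributed market return conditioned on its sign. The function names and sample numbers are hypothetical, and the article's own estimation procedure may differ.

```python
import math


def frontier_coefficients(means, variances):
    """Return (a, b, c) such that sigma(mu) = sqrt(a*mu**2 + b*mu + c).

    Assumes a diagonal covariance matrix, so the inverse covariance
    reduces to 1/variance for each asset.
    """
    A = sum(1.0 / v for v in variances)
    B = sum(m / v for m, v in zip(means, variances))
    C = sum(m * m / v for m, v in zip(means, variances))
    D = A * C - B * B
    if D <= 0.0:
        raise ValueError("assets need distinct expected returns")
    return A / D, -2.0 * B / D, C / D


def frontier_sigma(mu, coeffs):
    """Frontier volatility at target return mu."""
    a, b, c = coeffs
    return math.sqrt(a * mu * mu + b * mu + c)


def _pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_mean(mu, sigma, up):
    """E[X | X > 0] if up else E[X | X < 0] for X ~ Normal(mu, sigma**2),
    computed with the inverse Mills ratio."""
    z = mu / sigma
    if up:
        return mu + sigma * _pdf(z) / _cdf(z)
    return mu - sigma * _pdf(z) / _cdf(-z)


# Two uncorrelated assets: (mean 10%, vol 20%) and (mean 5%, vol 10%).
coeffs = frontier_coefficients([0.10, 0.05], [0.04, 0.01])
print(coeffs, frontier_sigma(0.05, coeffs), frontier_sigma(0.10, coeffs))
print(conditional_mean(0.0, 1.0, up=True), conditional_mean(0.0, 1.0, up=False))
```

With only two assets every fully invested portfolio lies on the frontier, which gives a convenient check: the fitted curve passes through each asset's own (volatility, return) point.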
Using the Graphics Processing Unit to Evaluate American-Style Derivatives
Leon Xing Li, Ren-Raw Chen
The Journal of Financial Data Science, published 2023-06-08. https://doi.org/10.3905/jfds.2023.1.127
In this article, the authors apply graphics processing unit (GPU) computation to an American option pricing problem via Monte Carlo (MC) simulations and particle swarm optimization (PSO). Given that computations in both MC and PSO can be vectorized and made independent, the valuation can be readily performed on GPUs. As a result, the accuracy of the valuation can be increased by adding MC paths and particles without increasing run time. For example, with a large number of particles allocated to GPUs, convergence can be reached in very few steps. The method introduced in this article can be extended to a wide variety of exotic derivatives or a large portfolio of diverse derivatives (known as an eigen portfolio). This is helpful in both trading and risk management.
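To make the MC-plus-PSO idea concrete, here is a minimal CPU-only sketch. It assumes a deliberately simple setup that is not taken from the article: an American put under geometric Brownian motion, an exercise rule defined by a single constant boundary, and a basic PSO that searches for the boundary with the highest simulated value. All names and parameter values are hypothetical. Each particle's evaluation and each path is independent of the others, which is the property that would let the work be spread across GPU threads.

```python
import math
import random


def simulate_paths(s0, r, sigma, maturity, n_steps, n_paths, rng):
    """Risk-neutral geometric Brownian motion paths (prices after each step)."""
    dt = maturity / n_steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        s, path = s0, []
        for _ in range(n_steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        paths.append(path)
    return paths


def put_value(boundary, paths, strike, r, maturity):
    """MC value of the rule: exercise the first time the price is at or
    below the boundary, otherwise hold the put to expiry."""
    dt = maturity / len(paths[0])
    total = 0.0
    for path in paths:
        payoff, t = max(strike - path[-1], 0.0), maturity
        for k, s in enumerate(path[:-1]):
            if s <= boundary:
                payoff, t = strike - s, (k + 1) * dt
                break
        total += math.exp(-r * t) * payoff
    return total / len(paths)


def pso_maximize(objective, lo, hi, n_particles, n_iters, rng,
                 w=0.7, c1=1.5, c2=1.5):
    """One-dimensional particle swarm search for the maximizer on [lo, hi]."""
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lo), hi)  # keep inside bounds
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
            if val > g_val:
                g_pos, g_val = pos[i], val
    return g_pos, g_val


rng = random.Random(0)
strike = 100.0
paths = simulate_paths(100.0, 0.05, 0.2, 1.0, 25, 400, rng)
boundary, price = pso_maximize(
    lambda b: put_value(b, paths, strike, 0.05, 1.0),
    0.0, strike, n_particles=8, n_iters=10, rng=rng)
print(boundary, price)
```

Because the boundary is tuned on the same paths used for valuation, the resulting price is biased upward; a fresh set of paths would be needed for an out-of-sample estimate.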
The Virtue and Vice of Complexity in Equity Risk Premium Prediction
Brian Jacobsen
The Journal of Financial Data Science, published 2023-06-05. https://doi.org/10.3905/jfds.2023.1.126
When forecasting the equity risk premium, simple techniques generate results that are easier to interpret than results from more complex techniques. If complex techniques have better performance, does the virtue of superior performance trump the vice of lack of interpretability? This question presumes that simpler techniques underperform, but complexity does not equate to superior performance. Old and simple techniques like discriminant analysis combine the virtue of performance with the virtue of intelligibility. This article performs a horse race among stepwise quadratic discriminant analysis, classification trees, regression trees, and ridgeless regression. Sometimes, accuracy can be sacrificed in favor of better out-of-sample Sharpe ratios. This article also shows that preprocessing data using rolling percentage ranks can be better than using either an expanding window or Z-scores.
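The rolling percentage rank preprocessing mentioned in the abstract above could look something like the sketch below. The window length, the mid-rank treatment of ties, and the function name are assumptions, since the abstract does not define the transform precisely.

```python
from typing import List, Optional


def rolling_pct_rank(values: List[float], window: int) -> List[Optional[float]]:
    """Percentage rank of each observation within its trailing window.

    Returns None until a full window is available, so each rank uses
    only data known at that point in time.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    out: List[Optional[float]] = []
    for i, x in enumerate(values):
        if i + 1 < window:
            out.append(None)
            continue
        past = values[i + 1 - window : i + 1]  # trailing window, includes x
        below = sum(1 for v in past if v < x)
        equal = sum(1 for v in past if v == x)
        # Mid-rank convention: ties count half, so ranks fall strictly
        # between 0 and 1.
        out.append((below + 0.5 * equal) / window)
    return out


series = [1.0, 3.0, 2.0, 5.0, 4.0]
ranks = rolling_pct_rank(series, window=3)
print(ranks)
```

Unlike a Z-score, the rank is bounded and insensitive to the size of an outlier, and unlike an expanding window it forgets observations older than the window.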
No-Transaction Band Network: A Neural Network Architecture for Efficient Deep Hedging
Shota Imaki, Kentaro Imajo, Katsuya Ito, Kentaro Minami, Kei Nakagawa
The Journal of Financial Data Science, published 2023-04-25. https://doi.org/10.3905/jfds.2023.1.125
Deep hedging is a versatile framework for computing the optimal hedging strategy of derivatives in incomplete markets. However, it is subject to the action-dependence problem, which impedes efficient training because the appropriate hedging action at the next step depends on the current action. To overcome this issue, the authors leverage a no-transaction band strategy, an existing technique that provides optimal hedging strategies for European options and exponential utility. The authors argue theoretically that this strategy is optimal for a wider class of utilities and derivatives, including exotics. Based on this result, the authors propose a no-transaction band network, namely, a neural network architecture that facilitates fast training and precise evaluation of the optimal hedging strategy. Moreover, the authors experimentally demonstrate that, for European and lookback options, their architecture rapidly attains a better hedging strategy compared with a standard feed-forward network. The findings thus have important implications for the practical applications of deep hedging.
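The core of a no-transaction band strategy can be sketched in a few lines: the hedge is left untouched while it sits inside the band and is moved to the nearest edge otherwise. In the architecture described above the band would be produced by a neural network; the band values below are made-up numbers and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def next_hedge(previous: float, lower: float, upper: float) -> float:
    """Keep the previous hedge if it lies inside [lower, upper];
    otherwise trade just enough to reach the nearest band edge."""
    if lower > upper:
        raise ValueError("lower edge must not exceed upper edge")
    return min(max(previous, lower), upper)


def run_band_strategy(bands: Sequence[Tuple[float, float]],
                      initial: float = 0.0) -> List[float]:
    """Apply the band rule step by step and return the hedge path."""
    positions, current = [], initial
    for lower, upper in bands:
        current = next_hedge(current, lower, upper)
        positions.append(current)
    return positions


# Hypothetical band edges for four rebalancing dates.
bands = [(0.4, 0.6), (0.45, 0.65), (0.7, 0.9), (0.5, 0.8)]
positions = run_band_strategy(bands)
print(positions)
```

In the last step the hedge of 0.7 already lies inside the band (0.5, 0.8), so no trade takes place and no transaction cost is paid.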
A Review on Derivative Hedging Using Reinforcement Learning
Peng Liu
The Journal of Financial Data Science, published 2023-03-14. https://doi.org/10.3905/jfds.2023.1.124
Hedging is a common trading activity to manage the risk of engaging in transactions that involve derivatives such as options. Perfect and timely hedging, however, is an impossible task in the real market, which is characterized by discrete-time transactions with costs. Recent years have witnessed the use of reinforcement learning (RL) in formulating optimal hedging strategies. Specifically, different RL algorithms have been applied to learn the optimal offsetting position based on market conditions, offering an automatic risk management solution that proposes optimal hedging strategies while catering to both market dynamics and restrictions. In this article, the author provides a comprehensive review of the use of RL techniques in hedging derivatives. In addition to highlighting the main streams of research, the author suggests potential research directions in this exciting and emerging field.