Visualizing Structures in Financial Time-Series Datasets through Affinity-Based Diffusion Transition Embedding
Rui Ding
The Journal of Financial Data Science, published 2022-12-01
DOI: 10.3905/jfds.2022.1.111 (https://doi.org/10.3905/jfds.2022.1.111)
Citations: 2
Abstract
In this work, the author proposes a modified version of PHATE, a diffusion map-based embedding algorithm, that is tuned primarily for financial time-series data. The new algorithm, financial affinity-based diffusion transition embedding (FATE), accepts user-specified distance metrics suited to time-series data and uses symmetrized f-divergences between the diffusion probabilities as the final embedding distance, which is then passed to a metric multidimensional scaling step. The proposed visualization method reveals both local and global structures of the input time-series dataset. The algorithm's performance is first demonstrated through numerical experiments on Dow Jones 30 and S&P 100 stock returns. The author compares FATE visualizations based on correlation-type distances with t-distributed stochastic neighbor embedding (t-SNE) and PHATE embeddings, among others, to demonstrate the advantages and new perspectives of FATE both qualitatively and quantitatively. Experiments on synthetic ARMA time series, with fine control over the structure of the underlying model parameters, are also provided. The results show that the transfer function information distance and the time-lagged Hellinger distance can identify structures within the generating time-series models from their realizations alone, structures that correlation-type and Euclidean distances cannot identify. The author concludes that the choice of distance metric plays an important role in the kind of structure one can uncover from time-series datasets.
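The pipeline described in the abstract (distance, diffusion probabilities, symmetrized f-divergence, multidimensional scaling) can be illustrated with a minimal NumPy sketch. This is not the author's implementation, and several choices here are assumptions made only for illustration: the correlation distance sqrt(2(1 - rho)), a Gaussian kernel with a median bandwidth, the diffusion time `t`, the symmetrized Kullback-Leibler divergence as the f-divergence, and classical (Torgerson) MDS swapped in for the metric MDS step to keep the example dependency-free. The function names are hypothetical.

```python
import numpy as np


def correlation_distance(X):
    """Correlation-type distance sqrt(2 * (1 - rho)) between rows of X (assumed form)."""
    rho = np.corrcoef(X)
    return np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))


def diffusion_operator(D, sigma=None):
    """Gaussian-kernel affinities, row-normalized into a Markov transition matrix."""
    if sigma is None:
        sigma = np.median(D[D > 0])  # assumed bandwidth heuristic
    K = np.exp(-((D / sigma) ** 2))
    return K / K.sum(axis=1, keepdims=True)


def symmetrized_kl(P, eps=1e-12):
    """Pairwise KL(p||q) + KL(q||p) between rows of P, one symmetrized f-divergence."""
    P = np.clip(P, eps, None)
    P = P / P.sum(axis=1, keepdims=True)
    log_p = np.log(P)
    diff = P[:, None, :] - P[None, :, :]
    log_diff = log_p[:, None, :] - log_p[None, :, :]
    # KL(p||q) + KL(q||p) = sum_k (p_k - q_k) * (log p_k - log q_k)
    return (diff * log_diff).sum(axis=2)


def classical_mds(D, n_components=2):
    """Classical MDS via double centering; a stand-in for metric MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def fate_sketch(X, t=3, n_components=2):
    """Distance -> t-step diffusion probabilities -> divergence -> embedding."""
    D = correlation_distance(X)
    P_t = np.linalg.matrix_power(diffusion_operator(D), t)
    div = symmetrized_kl(P_t)
    return classical_mds(div, n_components), div


# Toy usage: six synthetic series, three driven by each of two common factors.
rng = np.random.default_rng(0)
factors = rng.standard_normal((2, 250))
X = np.vstack([factors[g] + 0.5 * rng.standard_normal(250) for g in (0, 0, 0, 1, 1, 1)])
Y, div = fate_sketch(X, t=3)
print(Y.shape)
```

Because the distance enters only through `correlation_distance`, swapping in a different time-series distance changes the structure the embedding can reveal while leaving the rest of the sketch untouched.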