A Practitioner’s Guide to the Optimal Number of Clusters Algorithm
M. Andrews
The Journal of Financial Data Science, published 2023-07-21
DOI: 10.3905/jfds.2023.1.133 (https://doi.org/10.3905/jfds.2023.1.133)
Identifying profitable investment strategies has been a long-standing challenge for finance practitioners. The optimal number of clusters (ONC) algorithm is a reliable tool used to evaluate backtest results affected by multiple testing. The algorithm is necessary to calculate the deflated Sharpe ratio, a popular metric that detects potential false positive investment strategies. These methods are based on the familywise error rate approach, which provides stringent control over the overall error rate, reducing the likelihood of false discoveries and increasing the reliability of findings. The ONC algorithm’s time complexity, however, poses a significant challenge for practitioners. This study proposes a practical solution to reduce the number of clusters tested by the ONC algorithm while maintaining accuracy. Results from simulated datasets demonstrate that the proposed solution significantly reduces the algorithm’s runtime. Additionally, this study addresses the impact of outliers on the ONC algorithm, showing that they can lead to nonoptimal solutions, and provides a simple solution to mitigate their effects. These findings contribute to the literature on finance by enhancing the usability of the ONC algorithm.
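To make the link between the cluster count and the deflated Sharpe ratio concrete, the sketch below computes the ratio in its commonly cited form, using only the Python standard library. It is an illustration under stated assumptions, not the paper's implementation: the function names and parameters are hypothetical, the formula is the widely used approximation for the expected maximum Sharpe ratio under the null rather than anything given in this abstract, and it assumes the number of clusters returned by ONC is passed in as `n_trials`, the effective number of independent trials.

```python
import math
from statistics import NormalDist

# Euler-Mascheroni constant, used in the expected-maximum approximation.
EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials: int, sr_variance: float) -> float:
    """Approximate expected maximum Sharpe ratio across n_trials
    independent trials when every true Sharpe ratio is zero.

    sr_variance is the variance of the Sharpe ratios across trials.
    """
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    z = NormalDist()
    a = z.inv_cdf(1.0 - 1.0 / n_trials)
    b = z.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe_ratio(
    sr: float,
    n_obs: int,
    n_trials: int,
    sr_variance: float,
    skew: float = 0.0,
    kurt: float = 3.0,
) -> float:
    """Probability that the observed Sharpe ratio exceeds the benchmark
    implied by multiple testing.

    sr is the non-annualized Sharpe ratio at the sampling frequency of the
    n_obs return observations. n_trials is ASSUMED here to be the number
    of clusters found by ONC. kurt is raw kurtosis (3.0 for a normal).
    """
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * math.sqrt(n_obs - 1) / denom)


if __name__ == "__main__":
    # Same strategy, evaluated against 5 versus 50 effective trials.
    for k in (5, 50):
        print(k, deflated_sharpe_ratio(0.1, 1250, k, 0.0025))
```

Holding everything else fixed, a larger effective trial count raises the benchmark Sharpe ratio and lowers the deflated value, which is why a misestimated cluster count (for example, one distorted by outliers) feeds directly into the false-positive assessment.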