{"title":"Inference of Utilities and Time Preference in Sequential Decision-Making","authors":"Haoyang Cao, Zhengqi Wu, Renyuan Xu","doi":"arxiv-2405.15975","DOIUrl":null,"url":null,"abstract":"This paper introduces a novel stochastic control framework to enhance the\ncapabilities of automated investment managers, or robo-advisors, by accurately\ninferring clients' investment preferences from past activities. Our approach\nleverages a continuous-time model that incorporates utility functions and a\ngeneric discounting scheme of a time-varying rate, tailored to each client's\nrisk tolerance, valuation of daily consumption, and significant life goals. We\naddress the resulting time inconsistency issue through state augmentation and\nthe establishment of the dynamic programming principle and the verification\ntheorem. Additionally, we provide sufficient conditions for the identifiability\nof client investment preferences. To complement our theoretical developments,\nwe propose a learning algorithm based on maximum likelihood estimation within a\ndiscrete-time Markov Decision Process framework, augmented with entropy\nregularization. We prove that the log-likelihood function is locally concave,\nfacilitating the fast convergence of our proposed algorithm. Practical\neffectiveness and efficiency are showcased through two numerical examples,\nincluding Merton's problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving\npersonalized investment advice but also contributes broadly to other fields\nsuch as healthcare, economics, and artificial intelligence, where understanding\nindividual preferences is crucial.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"22 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - Computational Finance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.15975","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients' investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme with a time-varying rate, tailored to each client's risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time-inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, which facilitates the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples: Merton's problem and an investment problem with unhedgeable risks.

Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.
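
To give a concrete picture of the kind of objective the abstract describes, the following is a minimal sketch in standard Merton-style notation, not the paper's exact formulation: X_t denotes wealth, c_t consumption, pi the investment strategy, U_1 and U_2 utilities over consumption and terminal wealth, and rho(.) a time-varying discount rate.

\[
  \max_{c,\pi}\; \mathbb{E}\!\left[\int_0^T e^{-\int_0^t \rho(s)\,ds}\, U_1(c_t)\,dt
  \;+\; e^{-\int_0^T \rho(s)\,ds}\, U_2(X_T)\right]
\]

A non-constant rho(.) is what breaks the usual exponential-discounting structure and induces time inconsistency. When rho is constant and U_1, U_2 are CRRA, the criterion reduces to the classical Merton problem, whose optimal risky-asset fraction is the well-known (mu - r)/(gamma sigma^2); Merton's problem is one of the paper's two numerical examples.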
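
On the learning side, the abstract describes maximum likelihood estimation of preference parameters under an entropy-regularized policy in a discrete-time MDP. Below is a minimal, self-contained sketch of that idea, not the authors' implementation: the names q_fn, theta, and temperature are illustrative assumptions, and the gradient is taken by finite differences for brevity.

# Minimal sketch (not the paper's code): maximum likelihood estimation of a
# preference parameter theta from observed state-action trajectories, using an
# entropy-regularized (soft-max) policy in a discrete-time MDP.
import numpy as np

def softmax_policy(q_values, temperature):
    """Entropy-regularized policy: pi(a|s) proportional to exp(Q(s,a)/temperature)."""
    z = np.asarray(q_values, dtype=float) / temperature
    z -= z.max()                                  # numerical stability
    p = np.exp(z)
    return p / p.sum()

def log_likelihood(theta, trajectories, q_fn, temperature=1.0):
    """Sum of log pi_theta(a_t | s_t) over all observed (s_t, a_t) pairs."""
    ll = 0.0
    for states, actions in trajectories:
        for s, a in zip(states, actions):
            q = q_fn(theta, s)                    # Q-values under candidate preferences
            ll += np.log(softmax_policy(q, temperature)[a])
    return ll

def fit_preferences(trajectories, q_fn, theta0, lr=0.05, eps=1e-4, iters=200):
    """Gradient ascent on the log-likelihood with a finite-difference gradient."""
    theta = np.array(theta0, dtype=float)
    for _ in range(iters):
        base = log_likelihood(theta, trajectories, q_fn)
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            t_eps = theta.copy()
            t_eps[i] += eps
            grad[i] = (log_likelihood(t_eps, trajectories, q_fn) - base) / eps
        theta += lr * grad
    return theta

In this kind of inverse problem, the entropy regularization makes observed actions "noisy-optimal", so every observed decision carries likelihood information about the candidate preference parameters; the local concavity result quoted in the abstract is what makes such a local ascent converge quickly near the true parameters.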