Inference of Utilities and Time Preference in Sequential Decision-Making

Haoyang Cao, Zhengqi Wu, Renyuan Xu
arXiv - QuantFin - Computational Finance, published 2024-05-24
DOI: arxiv-2405.15975 (https://doi.org/arxiv-2405.15975)
Citations: 0

Abstract

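This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients' investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client's risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton's problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.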
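The learning step described in the abstract (maximum likelihood estimation in an entropy-regularized, discrete-time MDP) can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a small tabular MDP with randomly generated rewards and transitions, a known reward table, a single constant discount factor `beta` as the only unknown preference parameter, and a plain grid search in place of a gradient method. A constant discount factor also sidesteps the time inconsistency that the paper's time-varying discounting has to handle. All names (`soft_policy`, `log_likelihood`, `simulate`) are hypothetical.

```python
import numpy as np

def soft_policy(beta, r, P, T, lam=1.0):
    """Backward induction for an entropy-regularized finite-horizon MDP.

    r[s, a] is the reward, P[a, s, s2] the transition kernel, beta the
    discount factor and lam the entropy temperature. Returns pi[t, s, a].
    """
    n_s, n_a = r.shape
    V = np.zeros(n_s)                        # terminal value V_T = 0
    pi = np.zeros((T, n_s, n_a))
    for t in reversed(range(T)):
        Q = r + beta * np.einsum("ast,t->sa", P, V)
        m = Q.max(axis=1, keepdims=True)     # stabilise the log-sum-exp
        V = (m + lam * np.log(np.exp((Q - m) / lam).sum(axis=1, keepdims=True)))[:, 0]
        pi[t] = np.exp((Q - V[:, None]) / lam)   # softmax policy at time t
    return pi

def log_likelihood(beta, data, r, P, T, lam=1.0):
    """Sum of log pi_t(a | s) over observed (t, s, a) rows."""
    pi = soft_policy(beta, r, P, T, lam)
    t, s, a = data.T
    return np.log(pi[t, s, a]).sum()

def simulate(beta, r, P, T, n_paths, rng, lam=1.0):
    """Sample client trajectories from the soft-optimal policy."""
    pi = soft_policy(beta, r, P, T, lam)
    n_s, n_a = r.shape
    rows = []
    for _ in range(n_paths):
        s = rng.integers(n_s)
        for t in range(T):
            a = rng.choice(n_a, p=pi[t, s])
            rows.append((t, s, a))
            s = rng.choice(n_s, p=P[a, s])
    return np.array(rows)

# Toy problem (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n_s, n_a, T = 4, 3, 6
r = rng.normal(size=(n_s, n_a))
P = rng.dirichlet(np.ones(n_s), size=(n_a, n_s))   # each P[a, s] sums to 1
true_beta = 0.9
data = simulate(true_beta, r, P, T, n_paths=500, rng=rng)

# Grid-search maximum likelihood estimate of the discount factor.
grid = np.round(np.linspace(0.5, 1.0, 51), 2)
ll = np.array([log_likelihood(b, data, r, P, T) for b in grid])
beta_hat = grid[ll.argmax()]
print("estimated discount factor:", beta_hat)
```

With enough simulated trajectories the estimate should land near `true_beta`, although how sharply the likelihood peaks depends on how strongly the discount factor affects the policy, which is the identifiability question the paper treats formally.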