Notes:
According to SSRN, the original version of the document was created on March 14, 2023.
Description:
We present a class of least-squares reinforcement learning algorithms for optimal consumption under elasticity of intertemporal substitution and risk aversion preferences. The classical setting of Epstein-Zin utility preferences is cast into a dynamic utility functional framework and shown to exhibit time consistency. Within this dynamic utility framework, we formulate a robust approximation of the optimal consumption problem as a discrete-time Markov decision process. We present a least-squares Q-learning algorithm suitable for non-linear monotone certainty equivalents and benchmark its policy estimation convergence properties on an optimal wealth consumption problem against least-squares Monte Carlo and binomial tree methods. Finally, we demonstrate our least-squares Q-learning algorithm on an optimal consumption problem applied to SPDR S&P 500 ETF Trust (SPY) data.
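As a rough illustration of the preference structure named in the description, the sketch below evaluates one step of the textbook Epstein-Zin recursion with a power certainty equivalent over a discrete next-period distribution. It is not taken from the paper: the function names `certainty_equivalent` and `epstein_zin_value` and the parameters `beta` (discount factor), `gamma` (relative risk aversion), and `psi` (elasticity of intertemporal substitution, assumed different from 1) are hypothetical choices for this example, and the paper's own formulation may differ.

```python
import numpy as np


def certainty_equivalent(values, probs, gamma):
    """Power certainty equivalent (E[V^(1-gamma)])^(1/(1-gamma)).

    Assumes strictly positive `values`; gamma == 1 is handled as the
    logarithmic limit exp(E[log V]).
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if np.isclose(gamma, 1.0):
        return float(np.exp(np.sum(probs * np.log(values))))
    moment = np.sum(probs * values ** (1.0 - gamma))
    return float(moment ** (1.0 / (1.0 - gamma)))


def epstein_zin_value(consumption, next_values, probs, beta, gamma, psi):
    """One step of the textbook Epstein-Zin recursion (psi != 1 assumed).

    V_t = ((1 - beta) * c_t^rho + beta * CE_t(V_{t+1})^rho)^(1/rho),
    where rho = 1 - 1/psi.
    """
    rho = 1.0 - 1.0 / psi
    ce = certainty_equivalent(next_values, probs, gamma)
    return float(((1.0 - beta) * consumption ** rho + beta * ce ** rho) ** (1.0 / rho))
```

The certainty equivalent is monotone but non-linear in the next-period values, which is the property the description singles out as requiring a tailored Q-learning update. With a risk-free continuation value equal to current consumption, the recursion returns that same value, which gives a quick sanity check of the aggregator.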