• Media type: E-article
  • Title: Variable risk control via stochastic optimization
  • Contributors: Kuindersma, Scott R; Grupen, Roderic A; Barto, Andrew G
  • Published: SAGE Publications, 2013
  • Published in: The International Journal of Robotics Research, 32 (2013) 7, pages 806-825
  • Language: English
  • DOI: 10.1177/0278364913476124
  • ISSN: 0278-3649; 1741-3176
  • Keywords: Applied Mathematics ; Artificial Intelligence ; Electrical and Electronic Engineering ; Mechanical Engineering ; Modeling and Simulation ; Software
  • Description: We present new global and local policy search algorithms suitable for problems with policy-dependent cost variance (or risk), a property present in many robot control tasks. These algorithms exploit new techniques in non-parametric heteroscedastic regression to directly model the policy-dependent distribution of cost. For local search, the learned cost model can be used as a critic for performing risk-sensitive gradient descent. Alternatively, decision-theoretic criteria can be applied to globally select policies to balance exploration and exploitation in a principled way, or to perform greedy minimization with respect to various risk-sensitive criteria. This separation of learning and policy selection permits variable risk control, where risk-sensitivity can be flexibly adjusted and appropriate policies can be selected at runtime without relearning. We describe experiments in dynamic stabilization and manipulation with a mobile manipulator that demonstrate learning of flexible, risk-sensitive policies in very few trials.
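
The idea of adjusting risk sensitivity at runtime without relearning can be illustrated with a minimal Python sketch. It assumes a cost model has already been learned and is queried for a predicted mean and variance of cost at each candidate policy parameter. The criterion used here (mean plus `kappa` times standard deviation) is one common risk-sensitive objective chosen for illustration, not necessarily the one used in the article. The function name `select_policy` and all numbers are hypothetical.

```python
import numpy as np

def select_policy(thetas, cost_mean, cost_var, kappa):
    """Greedy selection under an assumed criterion: mean + kappa * std.

    kappa > 0 is risk-averse, kappa = 0 is risk-neutral,
    kappa < 0 is risk-seeking. The cost model itself is not refit.
    """
    thetas = np.asarray(thetas, dtype=float)
    mean = np.asarray(cost_mean, dtype=float)
    std = np.sqrt(np.asarray(cost_var, dtype=float))
    score = mean + kappa * std
    return thetas[int(np.argmin(score))]

# Hypothetical model predictions for three candidate policy parameters.
# In practice these would come from a learned heteroscedastic regressor.
thetas = np.array([0.0, 0.5, 1.0])
cost_mean = np.array([1.0, 0.8, 0.5])
cost_var = np.array([0.01, 0.25, 4.0])

# The same predictions serve every risk setting; only kappa changes.
risk_neutral = select_policy(thetas, cost_mean, cost_var, kappa=0.0)
risk_averse = select_policy(thetas, cost_mean, cost_var, kappa=2.0)
risk_seeking = select_policy(thetas, cost_mean, cost_var, kappa=-1.0)
```

With these made-up numbers, the risk-neutral and risk-seeking settings pick the low-mean, high-variance candidate, while the risk-averse setting picks the low-variance one. No new data or retraining is involved in switching between them.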