
Acting and Bayesian reinforcement structure learning of partially observable environment

Publication at Faculty of Mathematics and Physics |
2014

Abstract

This article shows how to learn both the structure and the parameters of a partially observable environment simultaneously, while also performing, online, a near-optimal sequence of actions that accounts for the exploration-exploitation tradeoff. It combines two recent research results: the former extends model-based Bayesian reinforcement learning of fully observable environments to larger domains by learning the structure.

The latter shows how a known structure can be exploited for model-based Bayesian reinforcement learning of partially observable domains. This article shows that the two approaches can be merged without an excessive increase in computational complexity.
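To make the ideas concrete, the following is a minimal illustrative sketch (not the paper's algorithm) of model-based Bayesian reinforcement learning in a tiny fully observable toy MDP: transition dynamics are given Dirichlet priors whose counts are updated from experience, and the exploration-exploitation tradeoff is handled by Thompson sampling, i.e. acting greedily with respect to a model sampled from the posterior. All names, sizes, and the reward layout below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy MDP: 3 states, 2 actions, unknown (here: uniform) dynamics.
N_S, N_A = 3, 2
TRUE_P = np.full((N_S, N_A, N_S), 1.0 / N_S)   # true transition probabilities
REWARD = np.zeros((N_S, N_A))
REWARD[2, :] = 1.0                              # reward only in state 2

# Dirichlet(1, ..., 1) prior over each transition distribution,
# represented by its pseudo-counts (conjugate to the categorical likelihood).
counts = np.ones((N_S, N_A, N_S))

def thompson_action(state):
    """Sample one model from the posterior and act greedily in it."""
    P = np.array([[rng.dirichlet(counts[s, a]) for a in range(N_A)]
                  for s in range(N_S)])
    # One-step-lookahead Q-values under the sampled model.
    q = REWARD[state] + P[state] @ REWARD.max(axis=1)
    return int(np.argmax(q))

state = 0
for _ in range(200):
    a = thompson_action(state)
    nxt = rng.choice(N_S, p=TRUE_P[state, a])
    counts[state, a, nxt] += 1                  # conjugate posterior update
    state = nxt

# Posterior mean of the transition model after learning.
posterior_mean = counts / counts.sum(axis=-1, keepdims=True)
```

The paper's setting is harder in two ways this sketch omits: the state is only partially observable (so the agent must maintain a belief over states), and the factored structure of the dynamics is itself learned rather than fixed in advance.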