This article presents a novel approach to modelling and reinforcement learning in dynamic, stochastic, partially observable environments. We introduce an MCMC algorithm that learns the structure of a graphical model representing the environment.
We use an approximation to a Bayesian method to learn the posterior distribution over the parameters of the learned structure. The learning algorithm is online, which allows us to use it in a reinforcement learning setup.
We demonstrate the viability of this algorithm in several simple experiments.
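To make the idea of MCMC structure learning concrete, the following is a minimal, hypothetical sketch, not the algorithm described here: Metropolis-Hastings over Bayesian-network DAGs with symmetric single-edge toggle proposals and a K2-style Dirichlet marginal-likelihood score. It assumes binary observations and batch scoring; the function and variable names are illustrative, and an online variant would instead update the sufficient statistics (the counts) incrementally as new observations arrive.

```python
# Illustrative sketch only: Metropolis-Hastings over DAG structures.
import numpy as np
from scipy.special import gammaln

def local_score(data, child, parents):
    """Log marginal likelihood of one node given its parents (K2 Dirichlet prior)."""
    n = data.shape[0]
    r = 2  # binary variables assumed
    configs = np.zeros(n, dtype=int)
    for p in parents:
        configs = configs * r + data[:, p]  # encode each parent configuration as an integer
    score = 0.0
    for q in np.unique(configs):
        counts = np.bincount(data[configs == q, child], minlength=r)
        score += gammaln(r) - gammaln(r + counts.sum())
        score += np.sum(gammaln(1 + counts))  # gammaln(1) == 0
    return score

def creates_cycle(adj, i, j):
    """Would adding edge i -> j create a directed cycle? (search for a path j -> ... -> i)"""
    stack, seen = [j], set()
    while stack:
        node = stack.pop()
        if node == i:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(np.flatnonzero(adj[node]))
    return False

def mcmc_structure(data, n_steps=5000, rng=None):
    """Metropolis-Hastings over DAGs with symmetric edge-toggle proposals."""
    rng = rng or np.random.default_rng(0)
    d = data.shape[1]
    adj = np.zeros((d, d), dtype=bool)  # adj[i, j] is True iff edge i -> j exists
    scores = np.array([local_score(data, j, []) for j in range(d)])
    samples = []
    for _ in range(n_steps):
        i, j = rng.choice(d, size=2, replace=False)
        # A toggle is valid if it removes an edge, or adds one without creating a cycle.
        if adj[i, j] or not creates_cycle(adj, i, j):
            adj[i, j] = not adj[i, j]  # propose toggling the edge i -> j
            new_sj = local_score(data, j, list(np.flatnonzero(adj[:, j])))
            if np.log(rng.random()) < new_sj - scores[j]:
                scores[j] = new_sj  # accept: only node j's local score changed
            else:
                adj[i, j] = not adj[i, j]  # reject: undo the toggle
        samples.append(adj.copy())
    return samples  # approximate posterior samples over structures
```

Because the score decomposes over nodes, each proposal only requires rescoring the child of the toggled edge, which is what keeps per-step cost low enough for an online, interaction-time setting.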