May 2, 2017
Hi Ashis,
That is a really good question. There is indeed a line of research that combines GANs with policy learning. Rather than learning from a hand-designed reward, as in standard RL, it is a form of imitation learning: it takes a set of expert demonstrations and learns a policy whose behavior is indistinguishable from the expert's.
http://papers.nips.cc/paper/6391-generative-adversarial-imitation-learning.pdf
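To make the GAN connection concrete, here is the saddle-point objective as I recall it from the linked paper; please check it against the paper itself. The symbols are: a policy \pi, the expert policy \pi_E, a discriminator D(s, a) over state-action pairs, the causal entropy H(\pi), and a weight \lambda \ge 0.

```latex
\min_{\pi} \max_{D} \;
  \mathbb{E}_{\pi}\big[\log D(s, a)\big]
  + \mathbb{E}_{\pi_E}\big[\log\big(1 - D(s, a)\big)\big]
  - \lambda H(\pi)
```

The discriminator plays the usual GAN role of telling the policy's state-action pairs apart from the expert's, and the policy is updated to make that distinction as hard as possible.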