Hi Akshay,

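Thanks for reading the post! For MountainCar, the simple algorithms I provide here probably won’t help. The issue is that the state space is much more complex than a grid-world: the car’s position and velocity are continuous values, so the states can’t simply be enumerated as rows of a table.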

You will likely need to use a policy-gradient method, or a more complex Q-learning algorithm such as DQN. I would suggest starting with a policy-gradient method, however, as it is simpler and works well on similar control problems like CartPole.

I would recommend checking out the next few articles in my series, and then trying to apply https://github.com/awjuliani/DeepRL-Agents/blob/master/Vanilla-Policy.ipynb to the problem. You may need to adjust a few parameters first, though, to tune it to MountainCar. I haven’t actually worked with the environment myself, so I am unsure of what its particularities may be.
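
To give a rough idea of what a policy-gradient update involves, here is a minimal sketch in plain Python. It is not the code from the notebook linked above: the linear softmax policy, the function names, the `[position, velocity, 1.0]` feature layout, and the learning rate are all assumptions made for illustration, and a real agent would more likely use a small neural network.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(weights, state):
    """Linear softmax policy: one weight row per action (hypothetical layout)."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def reinforce_update(weights, episode, lr=0.01, gamma=0.99):
    """Apply one vanilla policy-gradient step from a finished episode.

    episode is a list of (state, action, reward) tuples.
    Returns a new weight matrix; the input is left unchanged.
    """
    returns = discounted_returns([r for _, _, r in episode], gamma)
    grads = [[0.0] * len(row) for row in weights]
    for (state, action, _), g in zip(episode, returns):
        probs = action_probs(weights, state)
        for a in range(len(weights)):
            # d log pi(action | state) / d logit_a = one_hot(action)[a] - probs[a]
            dlogit = (1.0 if a == action else 0.0) - probs[a]
            for i, s in enumerate(state):
                grads[a][i] += g * dlogit * s
    return [
        [w + lr * dw for w, dw in zip(row, grow)]
        for row, grow in zip(weights, grads)
    ]


# Illustrative use: 3 actions, features = [position, velocity, bias].
weights = [[0.0, 0.0, 0.0] for _ in range(3)]
state = [0.5, 0.0, 1.0]
new_weights = reinforce_update(weights, [(state, 0, 1.0)], lr=0.1)
```

In this toy call the single step had a positive return, so the update raises the probability of action 0 in that state. In MountainCar every step gives a reward of -1 until the goal is reached, so the returns in early episodes are all similarly negative, which is one reason some parameter tuning is likely to be needed.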

