Hi,

You are correct. A3C typically is used in simulation environments only because it becomes infeasible to simultaneously run multiple robots at once. It is also not sample efficient enough to be used in real-world environments, since it would take literally thousands of episodes per worker to converge to a meaningful policy.

PhD. Interests include Deep (Reinforcement) Learning, Computational Neuroscience, and Phenomenology.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store