Hi Ryan,

It would certainly be possible to use some of the exploration techniques discussed in Part 7 with A3C. The catch is that each technique introduces its own hyperparameters. Layering them on top of entropy regularization may help, but only if those extra hyperparameters are tuned properly.
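To make the tuning point concrete, here is a minimal NumPy sketch. It is not the code from the series: the function names, the default values, and the choice of Boltzmann temperature and epsilon-greedy as the add-on techniques are all assumptions for illustration. The entropy bonus already has one coefficient (`entropy_beta`), and each extra exploration technique adds another knob next to it.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Boltzmann-style softmax; `temperature` is an extra exploration knob.
    z = (logits - logits.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def entropy(probs):
    # Shannon entropy of the action distribution.
    return -np.sum(probs * np.log(probs + 1e-8))

def sample_action(logits, rng, temperature=1.0, epsilon=0.0):
    # Hypothetical helper: epsilon-greedy layered over the softmax policy.
    probs = softmax(logits, temperature)
    if rng.random() < epsilon:
        return int(rng.integers(len(probs)))  # uniform random action
    return int(rng.choice(len(probs), p=probs))

def policy_loss(logits, action, advantage, entropy_beta=0.01):
    # Policy-gradient term minus the entropy bonus for a single step.
    probs = softmax(logits)
    log_prob = np.log(probs[action] + 1e-8)
    return -log_prob * advantage - entropy_beta * entropy(probs)

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])
action = sample_action(logits, rng, temperature=1.5, epsilon=0.1)
loss = policy_loss(logits, action, advantage=1.0, entropy_beta=0.01)
```

In this sketch there are already three values to search over (`entropy_beta`, `temperature`, `epsilon`), and they interact: a higher temperature or epsilon changes how much exploration the entropy term still needs to provide.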

PhD. Interests include Deep (Reinforcement) Learning, Computational Neuroscience, and Phenomenology.
