Hi Kelvin,

Good question. Each batch is drawn as a contiguous stretch of experiences rather than a set of randomly sampled ones. We therefore want the RNN to process it as a single example of length batch_size, not as batch_size separate examples of length 1. That way the RNN is unrolled over the whole stretch, carries its hidden state from one step to the next, and can learn from the temporal dependencies in the data.
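
To make the difference concrete, here is a rough sketch using a toy scalar RNN in plain Python. It is not the network from the post: the function names, the weights w_x and w_h, and the input values are all made up for illustration. It feeds the same batch through the cell twice, once as one sequence of length batch_size and once as batch_size separate sequences of length 1.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.9):
    # One step of a toy scalar RNN cell (hypothetical weights).
    return math.tanh(w_x * x + w_h * h)

def run_sequence(xs, h0=0.0):
    # Unroll the cell over xs, carrying the hidden state forward.
    h = h0
    outputs = []
    for x in xs:
        h = rnn_step(x, h)
        outputs.append(h)
    return outputs

# A made-up batch of batch_size = 4 consecutive experiences.
batch = [1.0, 1.0, 1.0, 1.0]

# One example of length batch_size: each step sees the previous hidden state.
as_one_sequence = run_sequence(batch)

# batch_size examples of length 1: every step starts from the initial state.
as_separate_examples = [run_sequence([x])[0] for x in batch]

print(as_one_sequence)
print(as_separate_examples)
```

In the second case every output is identical, because each experience is processed from a fresh hidden state and nothing earlier in the batch can influence it. In the first case the outputs change from step to step, which is what gives the RNN something temporal to learn from.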

Hope that clears things up.

