Jan 30, 2017
Hi Kelvin,
Good question. This is because each batch comes from a continuous stream of experiences rather than a random sample. As such, we would like the RNN to process it as a single example of length batch_size rather than as batch_size separate examples of length 1. This way the RNN unrolls over the whole sequence and can learn from the temporal dependencies in the data.
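To make the difference concrete, here is a minimal NumPy sketch. It is not the code from the post: the `run_rnn` helper, the random weights, and the sizes are all made up for illustration, and a plain tanh RNN stands in for whatever cell you actually use. It feeds the same batch through the RNN twice, once as one sequence of length batch_size and once as batch_size sequences of length 1.

```python
import numpy as np

def run_rnn(inputs, w_x, w_h):
    """Run a plain tanh RNN over inputs shaped [num_sequences, seq_len, features]."""
    num_sequences, seq_len, _ = inputs.shape
    # Every sequence starts from a fresh zero hidden state.
    hidden = np.zeros((num_sequences, w_h.shape[0]))
    outputs = []
    for t in range(seq_len):
        # The hidden state from step t-1 feeds into step t.
        hidden = np.tanh(inputs[:, t, :] @ w_x + hidden @ w_h)
        outputs.append(hidden)
    return np.stack(outputs, axis=1)  # [num_sequences, seq_len, hidden_size]

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
batch_size, features, hidden_size = 4, 3, 5
batch = rng.normal(size=(batch_size, features))  # consecutive experiences
w_x = rng.normal(size=(features, hidden_size))
w_h = rng.normal(size=(hidden_size, hidden_size))

# One example of length batch_size: state is carried across the experiences.
as_one_sequence = run_rnn(batch.reshape(1, batch_size, features), w_x, w_h)

# batch_size examples of length 1: every experience starts from a zero state.
as_separate = run_rnn(batch.reshape(batch_size, 1, features), w_x, w_h)

one_seq_flat = as_one_sequence.reshape(batch_size, hidden_size)
separate_flat = as_separate.reshape(batch_size, hidden_size)
```

The first experience produces the same output either way, since both versions start from a zero state. From the second experience on, the outputs differ, because only the single-sequence version lets earlier experiences influence later ones through the hidden state.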
Hope that clears things up.