I think your confusion comes from my use of a dynamic RNN, which is less common in text-based systems. A dynamic RNN does not have a fixed number of time steps, so the same network can be unrolled over sequences of different lengths. This is not a computational problem because only a single time series, rather than a batch of several, is passed through the network at any given time. When taking actions, an input of 1x1 (one sequence, one time step) is sent through the network; when training, an input of 1x30 (one sequence, 30 time steps) is sent through. I hope that clears things up.
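
To make the shapes concrete, here is a minimal sketch in plain Python rather than the actual framework code. The tanh cell, the sizes (4 inputs, 3 hidden units), and the names `make_cell`, `step`, and `run_rnn` are all made up for illustration. The only point is that the loop length comes from the input, so the same weights handle 1 step or 30.

```python
import math
import random


def make_cell(input_size, hidden_size, seed=0):
    """Randomly initialise weights for a plain tanh RNN cell (illustrative sizes)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
            for _ in range(input_size)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec


def step(cell, x, h):
    """One time step: h_new = tanh(x @ w_in + h @ w_rec)."""
    w_in, w_rec = cell
    return [
        math.tanh(
            sum(x[i] * w_in[i][j] for i in range(len(x)))
            + sum(h[k] * w_rec[k][j] for k in range(len(h)))
        )
        for j in range(len(h))
    ]


def run_rnn(cell, sequence, h0):
    """Unroll the cell over however many time steps `sequence` contains."""
    h = h0
    outputs = []
    for x in sequence:  # loop length is set by the data, not by the model
        h = step(cell, x, h)
        outputs.append(h)
    return outputs, h


INPUT_SIZE, HIDDEN_SIZE = 4, 3
cell = make_cell(INPUT_SIZE, HIDDEN_SIZE)
h0 = [0.0] * HIDDEN_SIZE

# Acting: one sequence, one time step (the "1x1" case).
obs = [[0.1, 0.2, 0.3, 0.4]]
act_outputs, h_after_act = run_rnn(cell, obs, h0)

# Training: one sequence, 30 time steps (the "1x30" case), same weights.
episode = [[0.01 * t, 0.0, 1.0, -0.5] for t in range(30)]
train_outputs, h_after_train = run_rnn(cell, episode, h0)

print(len(act_outputs), len(train_outputs))
```

Feeding the 30 steps one at a time while carrying the hidden state forward gives the same final state as a single 30-step pass, which is why acting step by step and training on the whole sequence can share one network.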