Hi Tim,

The end_multiplier corresponds to the check for whether the specific step was the final one in an episode. The Q update for final steps is different from those of other steps in an episode, so need to be treated accordingly.

PhD. Interests include Deep (Reinforcement) Learning, Computational Neuroscience, and Phenomenology.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store