Hi Gabriel,

The traditional role of biases is to encourage generalization (and prevent overfitting) between the training set and the overall dataset. In RL, that kind of generalization isn’t actually what we are after. Instead we want the policy and value outputs to be as accurate as possible, which means having them be fully conditioned on the state input. Bias terms would introduce, well, a bias, which isn’t something we want when selecting actions.
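To make the idea concrete, here is a minimal sketch (illustrative, not the author's actual code) of policy and value heads built from linear layers with no bias terms, so the outputs are driven entirely by the state input. The layer sizes and names are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions for the sketch.
state_dim, hidden_dim, n_actions = 4, 16, 3
W_h = rng.normal(scale=0.1, size=(state_dim, hidden_dim))   # shared hidden layer, no bias
W_pi = rng.normal(scale=0.1, size=(hidden_dim, n_actions))  # policy head, no bias
W_v = rng.normal(scale=0.1, size=(hidden_dim, 1))           # value head, no bias

def forward(state):
    h = np.tanh(state @ W_h)              # no "+ b" anywhere
    logits = h @ W_pi                     # policy logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over actions
    value = float(h @ W_v)                # state-value estimate
    return probs, value

# With no bias terms, a zero state yields zero pre-activations everywhere:
# the policy falls back to uniform action probabilities and the value to 0.
probs, value = forward(np.zeros(state_dim))
```

One way to see the point of the example: any bias term would shift the logits and value independently of the state, which is exactly the state-independent preference over actions the comment is arguing against.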
