Interaction-Grounded Learning: Learning from feedback, not rewards

The IGL Setting

In the paper, the authors motivate IGL with examples from human-computer interface research. If we want machines that can interact with humans in a natural way, we need them to learn from human feedback in a natural way as well. Asking the human to provide an explicit reward signal to train the agent after every action it takes is an unreasonably cumbersome burden, and demonstration data may likewise be unavailable or simply not make sense in many contexts. Instead, if the computer could learn to interpret the human's hand gestures, facial expressions, or even brain signals to infer the latent reward, learning could happen in a much smoother way.
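To make the setting concrete, here is a minimal toy sketch of the interaction protocol as I understand it. This is my own illustration, not code from the paper: the `interact` function, the feedback dimensions, and the linear policy are all hypothetical stand-ins. The key point is that the learner only ever observes contexts, its own actions, and a feedback vector whose distribution depends on a latent reward it never sees.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, CTX_DIM, FB_DIM = 5, 10, 8

def interact(context_label, action):
    # The true reward exists but is never revealed to the learner. Only a
    # feedback vector is returned, and (per the IGL assumption) its
    # distribution depends on the latent reward rather than directly on the
    # context or action -- a stand-in for a gesture or facial expression.
    latent_reward = float(action == context_label)         # hidden from the learner
    mean = np.ones(FB_DIM) if latent_reward else -np.ones(FB_DIM)
    return mean + rng.normal(scale=2.0, size=FB_DIM)       # feedback only, no reward

def policy(x, weights):
    # Softmax-linear policy over actions; any model could sit here.
    logits = x @ weights
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

weights = rng.normal(scale=0.01, size=(CTX_DIM, N_ACTIONS))
for step in range(5):
    label = rng.integers(N_ACTIONS)                  # latent "correct" action
    x = rng.normal(size=CTX_DIM)                     # context shown to the learner
    a = rng.choice(N_ACTIONS, p=policy(x, weights))  # learner acts
    y = interact(label, a)                           # all the learner ever observes
    # An IGL learner must jointly train a reward decoder psi(y) and the policy
    # from (x, a, y) triples alone; that joint training is the subject of the
    # paper and is omitted from this sketch.
```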

Figure taken from Xie et al., 2021.

Figure: learned and true reward achieved over time (x-axis: epochs; y-axis: achieved reward). Code available here.

Applying IGL to “Real Problems”

As I mentioned above, IGL has the potential to apply to many real-world domains where a clean reward signal is not available but a messy feedback signal still might be. A number of extensions to the current approach will likely be required before that becomes feasible, though. Indeed, this novel formulation is ripe for follow-up work.
