Learning and memory
The brain is able to learn because connection strengths between neurons are modifiable. At present, however, it is unknown what strategy the brain employs to adjust connection strengths. Current methods that train artificial neural networks to map inputs onto outputs are either not biologically realistic or not efficient. Supervised learning, for example with error backpropagation, is very powerful and is widely used in machine learning. Unfortunately, it is implausible from a biological point of view, because it requires neuron-specific error signals and a "teacher" that specifies the correct output during learning.
Reinforcement learning, on the other hand, is biologically plausible. In reinforcement learning, the output is chosen stochastically, so the network can try various outputs for each of the inputs. This is reminiscent of animal learning, with the animal trying out various responses until it finds the correct one. Moreover, the teacher is replaced by a global reinforcer, such as the presence or absence of reward. Connection strengths are modified on the basis of whether the amount of reward on a particular learning trial is higher or lower than expected. The popularity of reinforcement learning models has greatly increased in recent years, as signals predicted by these models have been found in the brain.
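The scheme described above can be illustrated with a small sketch. It is a generic reward-driven delta rule on a single layer of connections, not AGREL and not any particular published model. Several details are assumptions made for illustration only: the softmax choice rule with its `beta` parameter, the one-stimulus-per-trial task, the learning rate, and the use of the chosen connection's own strength as the expected reward. The names `train`, `softmax` and `greedy_policy` are hypothetical.

```python
import math
import random


def softmax(values, beta):
    """Turn activations into choice probabilities; beta sets how greedy the choice is."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - peak)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def train(targets, n_outputs, n_trials=2000, lr=0.2, beta=5.0, seed=0):
    """Learn a stimulus -> output mapping from a scalar reward only.

    targets[i] is the rewarded output for stimulus i. It is used only to
    compute the reward and is never shown to the network as a teaching signal.
    """
    rng = random.Random(seed)
    n_inputs = len(targets)
    # weights[i][j]: connection strength from stimulus i to output j
    weights = [[0.0] * n_outputs for _ in range(n_inputs)]
    for _ in range(n_trials):
        stimulus = rng.randrange(n_inputs)
        # Stochastic output selection: the network tries out different responses.
        probs = softmax(weights[stimulus], beta)
        action = rng.choices(range(n_outputs), weights=probs)[0]
        # Global reinforcer: one scalar, identical for all connections.
        reward = 1.0 if action == targets[stimulus] else 0.0
        # Assumption: the connection strength doubles as the expected reward.
        expected = weights[stimulus][action]
        # Strengthen if reward was higher than expected, weaken if lower.
        weights[stimulus][action] += lr * (reward - expected)
    return weights


def greedy_policy(weights):
    """Return the most strongly connected output for each stimulus."""
    return [row.index(max(row)) for row in weights]


if __name__ == "__main__":
    targets = [2, 0, 1, 1]  # hypothetical task: four stimuli, three responses
    learned = train(targets, n_outputs=3)
    print(greedy_policy(learned))
```

Because there is only one layer of connections here, the sketch avoids the question of how credit reaches units at earlier processing levels; it shows only the trial-and-error choice and the update driven by the difference between received and expected reward.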
Surprisingly, only a few neural network implementations of reinforcement learning exist. Moreover, these implementations are not as efficient as supervised learning and cannot solve all problems that supervised learning can solve. This is because they lack an efficient mechanism for assigning credit to those units at early processing levels that contribute most to the network's output.
Together with Pieter Roelfsema, we investigate whether this so-called credit-assignment problem can be solved by a new role for feedback connections and attention in learning. We are developing a new learning theory, called attention-gated reinforcement learning (AGREL), that we hope will become as powerful as supervised learning and yet be biologically realistic.