Reinforcement Learning and Attention

Learning and memory

The brain is able to learn because connection strengths between neurons are modifiable. At present, however, it is unknown what strategy the brain employs to adjust connection strengths. Current methods that train artificial neural networks to map inputs onto outputs are either not biologically realistic or not efficient. Supervised learning with error-backpropagation is very powerful and is widely used in machine learning. From a biological point of view, however, it is implausible, because it requires neuron-specific error signals and a "teacher" that specifies the correct output during learning.

Reinforcement learning, on the other hand, is biologically plausible. In reinforcement learning, the output is chosen stochastically, so the network can try various outputs for each of the inputs. This is reminiscent of animal learning, with an animal trying out various responses until it finds the correct one. Moreover, the teacher is replaced by a global reinforcer, such as the presence or absence of reward. Connection strengths are modified on the basis of whether the amount of reward on a particular learning trial is higher or lower than expected. The popularity of reinforcement learning models has greatly increased in recent years, as signals predicted by these models have been found in the brain.
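The learning rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a model taken from the publications listed on this page. It assumes one-hot inputs (so each input has its own row of weights), a softmax rule for the stochastic choice, a reward of 1 for the correct output and 0 otherwise, and the probability of the chosen output as the expected reward. The names `train` and `greedy`, the learning rate, and the number of trials are hypothetical choices made for the example.

```python
import math
import random


def softmax(values):
    """Turn a list of activations into choice probabilities."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def train(targets, n_outputs, trials=3000, lr=0.5, seed=0):
    """Learn an input -> output mapping from a global reward signal only.

    targets[i] is the rewarded output for input i (assumed one-hot inputs).
    """
    rng = random.Random(seed)
    # weights[i][k]: connection strength from input i to output k
    weights = [[0.0] * n_outputs for _ in targets]
    for _ in range(trials):
        i = rng.randrange(len(targets))
        probs = softmax(weights[i])
        # The output is chosen stochastically, so every response gets tried.
        k = rng.choices(range(n_outputs), weights=probs)[0]
        reward = 1.0 if k == targets[i] else 0.0
        # Assumption: the expected reward is the probability of the choice.
        delta = reward - probs[k]
        # Strengthen the connection if reward was higher than expected,
        # weaken it if reward was lower than expected.
        weights[i][k] += lr * delta
    return weights


def greedy(weights):
    """Most strongly connected output for each input."""
    return [max(range(len(row)), key=row.__getitem__) for row in weights]


if __name__ == "__main__":
    learned = train(targets=[2, 0, 3, 1], n_outputs=4)
    print(greedy(learned))
```

Note that no output-specific error is ever supplied: the only feedback is the scalar `delta`, which plays the role of the global reinforcer.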

Surprisingly, only a few neural network implementations of reinforcement learning exist. Moreover, these implementations are not as efficient as supervised learning and cannot solve all problems that can be solved by supervised learning. This is because they lack an efficient mechanism that assigns credit to those units at early processing levels that contribute most to the network's output.

Together with Pieter Roelfsema, we investigate whether this so-called credit-assignment problem can be solved by a new role for feedback connections and attention in learning. We are developing a new learning theory, called attention-gated reinforcement learning (AGREL), that we hope will become as powerful as supervised learning and yet be biologically realistic.
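To make the idea of feedback-gated plasticity more concrete, here is a loose Python sketch of a single learning trial in a network with one hidden layer. It is not the published AGREL rule. It only illustrates one way in which feedback from the selected output unit could restrict weight changes to the hidden units that contributed to that choice. The function name `gated_step`, the sigmoid hidden units, the softmax choice rule, the use of the feedforward weight as the feedback weight, and the learning rate are all assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gated_step(x, v, w, target, rng, lr=0.2):
    """Run one trial and update the weights in place.

    v[i][j]: weight from input i to hidden unit j
    w[j][k]: weight from hidden unit j to output unit k
    Returns the chosen output and the global reward-prediction error.
    """
    n_in, n_hidden, n_out = len(x), len(w), len(w[0])
    hidden = [sigmoid(sum(x[i] * v[i][j] for i in range(n_in)))
              for j in range(n_hidden)]
    drive = [sum(hidden[j] * w[j][k] for j in range(n_hidden))
             for k in range(n_out)]
    probs = softmax(drive)
    chosen = rng.choices(range(n_out), weights=probs)[0]

    reward = 1.0 if chosen == target else 0.0
    delta = reward - probs[chosen]  # global signal, same for all synapses

    for j in range(n_hidden):
        # Assumption: feedback from the chosen output reaches hidden unit j
        # with the same strength as the feedforward connection.
        feedback = w[j][chosen]
        gate = feedback * hidden[j] * (1.0 - hidden[j])
        for i in range(n_in):
            # Hidden units that receive no feedback are not modified.
            v[i][j] += lr * delta * gate * x[i]
        w[j][chosen] += lr * delta * hidden[j]
    return chosen, delta
```

In this sketch, credit reaches the early layer through two factors only: the global signal `delta` and the feedback from the one output unit that was selected, so no neuron-specific error signal is needed.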

Dynamic Hebbian cross-correlation learning resolves the spike timing-dependent plasticity conundrum
Olde Scheper, T. V., Meredith, R. M., Mansvelder, H. D., Van Pelt, J., and Van Ooyen, A. (2018). Frontiers in Computational Neuroscience, doi: 10.3389/fncom.2017.00119.
Biologically plausible multi-dimensional reinforcement learning in neural networks
Rombouts, J. O., Van Ooyen, A., Roelfsema, P. R., and Bohte, S. M. (2012). In: Villa, A. E., et al., eds. Artificial Neural Networks - ICANN 2012, Lausanne, Switzerland, September 2012, Vol. 7552 of Lecture Notes in Computer Science, pp. 443-450.
Perceptual learning rules based on reinforcers and attention
Roelfsema, P. R., Van Ooyen, A., and Watanabe, T. (2010). Trends in Cognitive Sciences 14: 64-71.
Generation of time delays: simplified models of intracellular signalling in cerebellar Purkinje cells
Steuber, V., Willshaw, D., and Van Ooyen, A. (2006). Network: Computation in Neural Systems 17: 173-191.
An erratum to this article is available.
Envisioning the reward (Preview of M. G. Shuler and M. F. Bear, 2006, Reward timing in the primary visual cortex, Science 311: 1606-1609.)
Van Ooyen, A., and Roelfsema, P. R. (2006). Neuron 50: 188-190.
Attention-gated reinforcement learning of internal representations for classification
Roelfsema, P. R., and Van Ooyen, A. (2005). Neural Computation 17: 2176-2214.
A biologically plausible implementation of error-backpropagation for classification tasks
Van Ooyen, A., and Roelfsema, P. R. (2003). In: Kaynak, O., Alpaydin, E., Oja, E., and Xu, L., eds. Supplementary Proceedings of the International Conference on Artificial Neural Networks - ICANN/ICONIP 2003, Istanbul, Turkey, June 2003, pp. 442-444.
Pattern recognition in the Neocognitron is improved by neuronal adaptation
Van Ooyen, A., and Nienhuis, B. (1993). Biol. Cybern. 70: 47-53.
Improving the convergence of the back-propagation algorithm
Van Ooyen, A., and Nienhuis, B. (1992). Neural Networks 5: 465-471.
