Young colleagues in particular are fascinated by the potential of deep learning for neuroscience. This was obvious at the recent Society for Neuroscience meeting in Washington DC, where the few posters with the magical words in their title attracted large crowds of attendees who seemed almost exclusively in their twenties. The success of deep learning of data representations has led to impressive applications in image, video and speech processing [1]. Compared to these, recent advances in applying reinforcement learning to playing games are outright mind-blowing, with AlphaGo Zero achieving superhuman performance in just three days of training on a single machine with specialized hardware [2]. It is therefore easy to predict that interest in deep learning among young computational neuroscientists will only increase, but the reality may be more complex than they surmise. In this Editorial, I will focus on the question of correspondence between deep learning and how the brain works [3]. I will not consider the many opportunities of applying deep learning as a supporting technology.

The original breakthrough leading to the success of deep learning tested the method on an image recognition task, classifying handwritten digits [4]. Correspondingly, most applications of deep learning to computational neuroscience are about understanding the visual system (including the posters at the recent Society for Neuroscience meeting). As pointed out in a recent review [5], one category of deep learning models, goal-driven hierarchical convolutional neural networks, has been very successful at predicting neural responses in several areas of primate visual cortex, including V1, V2, V4 and inferior temporal cortex (IT). But the authors also point out that this success is probably due to convolutional neural networks closely mimicking the overall architecture of cortex [5], in particular by implementing features similar to receptive fields of increasing size across the hierarchy. This warrants a warning: any correspondence between a particular deep learning method and the brain may not generalize to deep learning as a whole. In fact, though the field of machine learning has clearly been inspired by neuroscience [3], it has never seen this as a limitation on the methods it can use. For example, the breakthrough referred to earlier [4] was a method to train the layers of a multilayer network one at a time, something that is hard to imagine occurring in a real brain. Deep learning networks also typically have many more layers than the corresponding brain systems, and one of the current hypes is “very deep” models with tens of layers [6]. A recent breakthrough, also used in AlphaGo Zero, is the residual network, in which shortcut connections link units in lower layers directly with units in higher layers [6]. Residual networks are an example of a deep learning method that does not reflect real neural systems: this would be like V1 densely projecting directly to V4 or IT. Conversely, there are well-known brain circuits whose architecture clearly differs from that of visual cortex, for example the olfactory system.
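
The shortcut idea can be made concrete with a minimal sketch in plain Python. It is an illustration only, not the architecture of any published network; the helper names (`dense`, `plain_block`, `residual_block`) and the all-zero weights are assumptions chosen for the example. A residual block adds its input back onto the output of its layers, so the input still reaches the next stage even when the intervening weights contribute nothing.

```python
def dense(weights, x):
    """Multiply a weight matrix (a list of rows) by the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v):
    """Rectified linear unit applied to each element."""
    return [max(0.0, a) for a in v]


def plain_block(w1, w2, x):
    """Two stacked layers without a shortcut connection."""
    return relu(dense(w2, relu(dense(w1, x))))


def residual_block(w1, w2, x):
    """The same two layers, with the input added back via a shortcut."""
    fx = dense(w2, relu(dense(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


# Hypothetical input and weights, chosen only to make the contrast visible.
x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

print(plain_block(zeros, zeros, x))     # the signal is lost
print(residual_block(zeros, zeros, x))  # the input passes through the shortcut
```

With silent intermediate layers the plain block outputs only zeros, whereas the residual block returns its input unchanged. In anatomical terms the shortcut corresponds to the direct projection from a lower to a higher area discussed above.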

Another difference between deep learning and human brains is the number of training examples required, with millions of labeled images needed to learn simple categorization tasks [5]. In fact, deep learning would not exist if the digital revolution hadn’t made big data available. Fortunately, the human brain is better at generalizing from smaller sets of experiences, although recent machine learning approaches try to mimic this [3]. Conversely, we may fail to learn to recognize some features because we do not routinely train ourselves on labeled data. An example is the recent controversial study claiming that a deep network learned to recognize the sexual preferences of people by analyzing pictures on a dating website [7]. In newspapers, this was reported as a demonstration of how artificial intelligence can now beat the human mind. In real life, however, people often don’t know who is gay or not (we lack the label), and I doubt any rational person would consider training themselves by going through all the profiles on a dating site.

Returning to the AlphaGo Zero example [2], the success of deep learning may soon be surpassed by reinforcement learning, which is again directly based on neuroscience concepts [3, 8]. In this case the big data challenge was overcome by having AlphaGo Zero play games against itself, so no prior data was required. This may be conceptually similar to many forms of learning in infancy, where trial and error clearly play a very big role. But while being successful at playing Go and chess seems a big achievement because of the nearly infinite number of possible game states, the important limiting metric for reinforcement learning is the number of possible actions, and that number is clearly limited for any board game. Nevertheless, it is probably worthwhile to carefully investigate the lessons that can be drawn from AlphaGo Zero.
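
To make the distinction between states and actions concrete, the following back-of-the-envelope count for a 19 × 19 Go board may help. It is only a rough sketch: the state count is a naive upper bound that treats every point as empty, black or white and therefore includes illegal positions, and the variable names are chosen here for illustration.

```python
BOARD_POINTS = 19 * 19  # intersections on a standard Go board

# Actions available in a single move: place a stone on one point, or pass.
max_actions = BOARD_POINTS + 1

# Naive upper bound on board configurations: each point is empty, black or white.
# This overcounts, because many such configurations are not legal positions.
config_upper_bound = 3 ** BOARD_POINTS

print(max_actions)                   # 362
print(len(str(config_upper_bound)))  # 173 decimal digits
```

The number of configurations has 173 digits, while at most 362 actions are available at any move.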

Finally, I want to discuss an interesting recent report [9] that shows a fairly realistic way to solve one of the most vexing problems in mapping machine learning to the brain: the credit assignment problem. The first machine learning revolution in the 1980s was based on the discovery of the back-propagation algorithm [10], and it is well known that real brains have no back-propagation [3], i.e. no transmission of the error information needed for weight updates from higher layers to lower layers. In more modern terms, this is called the credit assignment problem: which neurons in lower layers directly contributed to the final behavioral outcome? Interestingly, leading researchers in deep learning are quite concerned about finding solutions to this conundrum [11, 12], but until now only partial answers have been proposed. Guerguiev et al. [9] propose to use the dendritic structure of neurons, and theirs is a remarkably complete scheme, although there are still a few unresolved issues. Specifically, sparse feedback connections carrying higher-order feedback to the apical dendrites are used to drive changes in synaptic weights in basal dendrites that receive sensory input. The dendrites are essential because they provide for physical separation of the two inputs onto the same neuron. I will not describe the results in detail, but encourage reading of the paper.
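
The core difficulty can be illustrated with a deliberately tiny example. The sketch below is not the model of Guerguiev et al. and contains no dendrites; it is a two-weight linear chain, with hypothetical names and parameter values, that contrasts exact back-propagation with a fixed, separate feedback weight in the spirit of the technique known as feedback alignment. In back-propagation the lower weight `w1` can only be updated if the error is scaled by the current value of the higher weight `w2`, which is the information a real neuron would not have. In the alternative, the error reaches the lower unit through its own fixed pathway.

```python
def train(feedback_weight=None, steps=200, lr=0.05):
    """Train the chain y = w2 * (w1 * x) to map x onto target.

    feedback_weight=None: exact back-propagation (the error is scaled by w2).
    feedback_weight=b:    the error is scaled by a fixed feedback weight b
                          instead (an assumption made for illustration).
    Returns the absolute output error after training.
    """
    w1, w2 = 0.5, 0.5          # hypothetical initial weights
    x, target = 1.0, 2.0       # a single hypothetical training example
    for _ in range(steps):
        h = w1 * x             # activity of the lower unit
        y = w2 * h             # activity of the output unit
        err = y - target       # output error
        # Error signal as it arrives at the lower unit.
        fb = w2 if feedback_weight is None else feedback_weight
        delta_h = err * fb
        w2 -= lr * err * h     # update of the higher weight (local information)
        w1 -= lr * delta_h * x # update of the lower weight (needs feedback)
    return abs(w2 * w1 * x - target)


print(train())                     # exact back-propagation
print(train(feedback_weight=1.0))  # fixed, separate feedback pathway
```

In this toy setting both variants drive the output error close to zero, which suggests why a dedicated feedback pathway is an attractive substitute for back-propagation. Whether and how this scales to realistic networks is exactly what the cited work addresses.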

From my perspective, this is quite an ironic result. As an avid modeler of dendrites [13, 14], I have been frustrated by the swing of the field of computational neuroscience towards point neuron network modeling. I have seen a gradual loss of interest in simulating single neuron models among students at summer schools in computational neuroscience, and the shift is also noticeable in recent textbooks [15]. If what Guerguiev and colleagues (including a scientist working at DeepMind) propose [9] is true, then understanding how neural networks learn complex tasks will require models that include both apical and basal dendrites, though the models themselves do not have to be very complex. This nicely matches the recent in vivo demonstration of the behavioral importance of dendritic spikes [16] and suggests a bright future for modeling dendrites.