Deep Reinforcement Learning for Navigation

A recent Nature article and accompanying blog post by the paper’s authors describe how Andrea Banino et al. from DeepMind developed an artificial neural network to investigate how mammals use grid cells to perform vector-based navigation.  The network they created spontaneously developed grid-cell-like representations similar to the hexagonal arrangement of grid cells in mammals (see image above).

Grid cells were discovered by May-Britt Moser and Edvard I. Moser in 2005, building on John O’Keefe’s earlier discovery of place cells, and the three shared the 2014 Nobel Prize in Physiology or Medicine for this work on the brain's navigation system.  The DeepMind team developed and trained a recurrent neural network (RNN) to maintain a sense of place within a virtual environment, using velocity vectors to keep track of its movement through that environment.  The resulting RNN spontaneously developed hexagonal activity patterns resembling those of grid cells in the mammalian brain.  The following diagram illustrates an unfolded view of an RNN, showing how the same recurrent unit is applied at every time step.

Image Source: Wikimedia Commons, Creative Commons License
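
To make the unfolded view concrete, here is a minimal sketch in Python with NumPy. It is not the network from the paper: it uses a plain tanh RNN with random, untrained weights, and the names `integrate_path` and `unroll_rnn` are my own. The first function computes the position an agent would reach by summing its velocities, which is the kind of signal a network like this could be trained to track. The second shows that unrolling an RNN simply means applying the same weights at each time step.

```python
import numpy as np

def integrate_path(velocities, dt=1.0, start=(0.0, 0.0)):
    """Dead reckoning: accumulate velocity * dt to get the position at each step."""
    velocities = np.asarray(velocities, dtype=float)
    return np.asarray(start, dtype=float) + np.cumsum(velocities * dt, axis=0)

def unroll_rnn(inputs, W_x, W_h, b, h0):
    """Unrolled vanilla RNN: the same W_x, W_h and b are reused at every step."""
    h = h0
    states = []
    for x in inputs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
hidden_size = 8
W_x = rng.normal(scale=0.5, size=(hidden_size, 2))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

# A square loop: one step east, north, west, then south.
velocities = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
positions = integrate_path(velocities)
states = unroll_rnn(velocities, W_x, W_h, b, np.zeros(hidden_size))

print(positions[-1])  # the loop ends back at the starting point
print(states.shape)   # one hidden state per time step
```

In a trained version, a readout layer on top of `states` would be fitted so that the hidden state predicts `positions`. That training step is omitted here.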

The DeepMind team then went a step further by creating a deep reinforcement learning agent to investigate whether the resulting artificial neural network was indeed capable of supporting navigation.  As the DeepMind team explained, “This agent performed at a super-human level…and exhibited the type of flexible navigation normally associated with animals, taking novel routes and shortcuts when they became available”.  An example demonstrating these abilities is illustrated below.  In this example, the agent was trained in a maze with five doors in which all but door #5 were closed (a).  During testing, all doors were opened (b), and the agent successfully found shortcuts to the desired destination.

The deep reinforcement learning agent, trained in a maze with all but one door closed (left), found shortcuts during testing (right) when all doors were open.
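
To get a feel for how much a shortcut can save in a setup like this, here is a small sketch that uses breadth-first search on a toy grid. It is a classical planner, not a learning agent, and the layout is purely hypothetical (two corridors separated by a wall with five door positions). It is not the maze from the paper. It only shows the gap between the route available during training and the best route once every door is open.

```python
from collections import deque

def shortest_path_length(width, open_doors, start, goal):
    """BFS on a 3-row grid: rows 0 and 2 are corridors, and row 1 is a wall
    that can only be crossed at the columns listed in open_doors."""
    def passable(r, c):
        if not (0 <= r < 3 and 0 <= c < width):
            return False
        return r != 1 or c in open_doors

    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if passable(*nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal unreachable

# Hypothetical layout: door number -> column in the wall row.
DOORS = {1: 1, 2: 3, 3: 5, 4: 7, 5: 9}
start, goal = (0, 0), (2, 0)

train_len = shortest_path_length(11, {DOORS[5]}, start, goal)          # only door 5 open
test_len = shortest_path_length(11, set(DOORS.values()), start, goal)  # all doors open
print(train_len, test_len)  # 20 4
```

An agent that only replayed its training route would still walk 20 steps at test time. Taking the newly opened door 1 cuts the trip to 4, which is the kind of flexible behavior the quote above describes.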

In my opinion, the most interesting feature of this work is its use of two different neural network architectures in a single study:

  • An RNN to develop a model of a portion of a mammalian brain that spontaneously mimicked the grid-cell structure of actual mammalian brains, and
  • A deep reinforcement learning, agent-based system to explore how the resulting network enables velocity-vector-based navigation.

About David Calloway

Hi! I'm David Calloway, the author of this blog on deep learning and artificial intelligence. I first started working with neural networks in the mid-1980s, before the "dark winter" of neural networking technologies. I graduated from the U.S. Air Force Academy in 1979 with B.S. degrees in Physics and Electrical Engineering. In 1982, I received an M.S. degree in Electrical Engineering from Purdue University, where I worked on early attempts at speech recognition. In 2005, I obtained another M.S. degree, this time in Biology from the University of Central Florida. My interest in neural networks and deep learning was rekindled recently when I got involved in a project at Nova Technologies, where I am using deep learning and TensorFlow to recognize and classify objects from satellite imagery.