Deep Reinforcement Learning for AI Chip Design

A team from Google Brain recently published a paper (on arXiv) describing the use of deep reinforcement learning (Deep RL) to design chips customized for AI applications. In other words, they used AI to build AI chips. The problem they faced was “placement of TensorFlow graphs onto hardware devices to minimize training or inference time, or placement of an ASIC or FPGA netlist onto a grid to optimize for power, performance, and area”. They note in the paper that Deep RL is well-suited to problems like this, “where exhaustive or heuristic-based methods cannot scale”. The specific family of Deep RL algorithms they used was policy-gradient methods, such as REINFORCE, Proximal Policy Optimization (PPO), and Soft Actor Critic (SAC). The authors state their belief that “it is AI itself that will provide the means to shorten the chip design cycle, creating a symbiotic relationship between hardware and AI, with each fueling advances in the other”.

Posted in Uncategorized | Leave a comment

TensorFlow or PyTorch: Which Framework to Choose?

Image Source: Google, Inc.

I was recently asked by Udacity to be a beta tester (and, subsequently, a mentor and project reviewer) for one of their newest course offerings: the Deep Reinforcement Learning NanoDegree (DRLND) program. This is a very interesting course with some fascinating projects. Students get to work with the same types of Reinforcement Learning (RL) algorithms that have recently made headlines. Examples include the AlphaGo project, which beat the world’s best professional Go players while demonstrating very deep and strategic concepts, and the Dota 2 RL agent, which recently showed professional-level skills in an extremely complicated multi-player video game. Example projects in the DRLND program include developing an RL agent that plays a video game in which it collects yellow bananas while navigating a 2D field of play, and finding an actor/critic model that can control 20 independent 2-axis robot arms, each positioning an end-effector within a target sphere that continuously moves in 3D space about the robot.

This newest Udacity course requires all projects to be written in Python using the PyTorch framework. This was the first time I’ve had an opportunity to work with PyTorch, so I thought I would relay my experience and compare the advantages and disadvantages of PyTorch and TensorFlow as I see them.

  1. Sponsoring Companies: These two frameworks were developed by, and continue to be supported by, two of the biggest names in the AI industry today. TensorFlow was originally developed by Google Brain (the AI company within the Google/Alphabet corporate structure). Google not only continues to support this framework but has even started to develop and field custom hardware specifically designed to execute TensorFlow graphs at high speed (competitive with the fastest Nvidia GPUs, which also have libraries customized to execute TensorFlow computation graphs). PyTorch was developed by Facebook, which has used it for many of the projects that help to improve the user experience on its platform, including the Facebook face-recognition app. Both companies have released their respective frameworks as open source projects in order to achieve the widest distribution and acceptance of their products. In summary, I would rate this category a tie: both frameworks are supported by industry titans, each committed to the success and longevity of its framework.
  2. User Communities: TensorFlow wins in this category, hands down. As one of the original deep learning frameworks, TensorFlow has a very broad and experienced user community. The documentation is excellent, and help sites such as Stack Overflow are available to answer just about any question a user might have. PyTorch, however, might be thought of as the new kid on the block. The PyTorch user base is not as deep, the documentation is not as comprehensive, and getting answers to questions is not as easy as it is with TensorFlow. Score one for TensorFlow.
  3. Static vs. Dynamic Computation Graphs: This category is by far the one that most differentiates the approach taken by the two frameworks. The TensorFlow framework is based on the concept of a static computation graph. The API is designed to enable the programmer to define a computation graph that, once it is complete, can be passed to a GPU (or a CPU library) for execution. Just about every call made to the API is designed to describe the nodes of this graph, how they are connected, how and in what format the data will be input to the graph, how intermediate data will be updated during execution, and how the outputs will be delivered back to the function that triggered the graph’s execution. The problem with this approach, however, is that the entire architecture of your network must basically be defined and described using the TensorFlow API before you will see any results. In addition, there is a steep learning curve with this framework, with many new and potentially confusing concepts that must be mastered to achieve success (such as placeholders, variables, and TensorFlow sessions). In contrast, PyTorch is based on a dynamic execution graph. The graph is constructed and can be executed statement by statement. This is a much more intuitive approach and has a very natural feel for Python programmers who are used to this interactive flow. The learning curve for PyTorch is not nearly as steep as TensorFlow’s, and even though the documentation is not as polished, most users will find PyTorch’s dynamic computation graph easier to use and master. Score one for PyTorch!
  4. Ease of Development and Debugging: Besides the natural flow that PyTorch’s dynamic computation graph provides, there is another big advantage to this approach: ease of debugging. At any point in developing the computation graph, the programmer is free to insert print statements between nodes, or really at any point in the graph, to see what’s going on. This is a great benefit when debugging one’s code. Things are not nearly as simple with TensorFlow, and I can tell you from experience that when there is a problem with your TensorFlow graph, it can be very difficult to find the root cause. This is another point that goes to PyTorch.
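
The contrast drawn in items 3 and 4 can be sketched without either framework. The following is a toy illustration in plain Python, not TensorFlow or PyTorch code: the names `Node`, `placeholder`, and `run` are hypothetical and exist only for this example. The first half describes a small graph up front and produces no numbers until `run` is called with input data; the second half computes each value immediately, so an ordinary `print` between steps shows an intermediate result.

```python
# Toy illustration only: Node, placeholder, and run are hypothetical names,
# not part of the TensorFlow or PyTorch APIs.


class Node:
    """One operation in a deferred ("static") computation graph."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "add", or "mul"
        self.inputs = inputs  # upstream Node objects
        self.name = name      # only used by input nodes

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """Declare a graph input whose value is supplied later."""
    return Node("input", name=name)


def run(node, feed):
    """Evaluate the graph rooted at `node` using the values in `feed`."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Define-then-run: describing the graph computes nothing yet.
x = placeholder("x")
w = placeholder("w")
b = placeholder("b")
y_graph = x * w + b  # still just a Node, not a number
static_result = run(y_graph, {"x": 3.0, "w": 2.0, "b": 1.0})

# Define-by-run: every statement executes immediately.
xv, wv, bv = 3.0, 2.0, 1.0
hidden = xv * wv
print("intermediate value:", hidden)  # inspect between steps
eager_result = hidden + bv
```

Both halves compute the same quantity. The difference is when values become available for inspection, which is the debugging point made in item 4.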

Bottom Line: There are good reasons to use each of these deep learning frameworks. TensorFlow most certainly represents basic knowledge that every deep learning practitioner is expected to have. However, the more I use PyTorch, the more I like it, and I think it may be the future. My recommendation is to be familiar with both frameworks: know TensorFlow for basic DL knowledge, and learn PyTorch for ease of use and less troublesome debugging.


Nvidia Announces Apex: A New PyTorch Extension


Nvidia recently announced Apex, a new, open-source PyTorch extension that helps users improve the performance of deep learning training on Nvidia’s Volta GPUs. The key improvement that Apex brings is that it enables engineers to use mixed-precision arithmetic to improve training speed while still maintaining the accuracy and stability of training algorithms. The extension requires PyTorch 0.4, Python 3, and Nvidia’s CUDA 9 library. Additional information is available on the Nvidia site.
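
To see why mixed precision needs care to stay accurate, the following plain-Python sketch rounds values to IEEE 754 half precision using the standard `struct` module. It illustrates loss scaling, a technique commonly paired with mixed-precision training. The announcement summarized above does not describe how Apex works internally, so this should not be read as Apex's implementation. The helper name `to_fp16`, the example gradient value, and the scale factor of 1024 are all assumptions chosen for illustration.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


tiny_gradient = 1e-8  # illustrative value, too small for half precision
scale = 1024.0        # assumed loss-scale factor

unscaled = to_fp16(tiny_gradient)        # underflows to zero in half precision
scaled = to_fp16(tiny_gradient * scale)  # survives the half-precision round trip
recovered = scaled / scale               # unscale in higher precision

print(unscaled, recovered)
```

Without scaling, the small value is lost entirely. With scaling, it is recovered to within a small rounding error after the scale factor is divided back out.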


IBM Debater

Noa Ovadia Interacting with IBM Debater

IBM just demonstrated a deep-learning system that is a follow-on of sorts to the Watson Jeopardy demonstration from several years ago. For this IBM Debater project, Watson was trained to intelligently debate on approximately 100 different topics. In this particular demonstration, the IBM system was challenged by Noa Ovadia, a college senior who was the Israeli debate champion of 2016. The two held a traditional debate on the topic of Subsidized Space Exploration. Each side made an opening statement, followed by a rebuttal of the opponent’s position and then a closing statement. Although this demonstration was not a traditional Turing test, it shows that progress is being made in the quest for machines that can intelligently interact with humans in conversational speech.

In fact, this is the second demonstration of recent progress toward this goal. A few weeks ago, Google unveiled the Duplex system, which demonstrated very human-like speech and interaction in phone calls to schedule restaurant reservations and hair salon appointments. In those demos, the humans on the other end of the call did not appear to realize they were talking to a computer, and they held a very natural conversation. Although the Google Duplex conversations took place in a very constrained topic area (as did the IBM Debater demonstration), these advancements show that rapid progress is being made to extend the limited conversational ability of systems like Amazon Echo, Apple Siri, and Google Assistant.

Experience shows that once the basic infrastructure has been implemented and demonstrated for one domain, these types of systems rapidly expand to support many more domains of knowledge and interaction. I can foresee similar systems supporting conversations one might have with a doctor, for example, or a financial planner, or in fact any relatively constrained domain of knowledge. I expect it won’t be long before many of the jobs in which humans advise other humans on various topics transition to systems like those now being demonstrated by IBM and Google.

This rapid progress is both fascinating and worrisome. Past labor transitions in which technology eliminated certain jobs have always resulted in the creation of new jobs that never existed before. Is this time different, or are there whole new classes of jobs on the horizon that none of us currently envision? Only time will tell.


The h Index & the Top 15 Deep Learning Conferences and Journals

The Google Scholar resource ranks the top journals and conferences using a fully automated h-index score. The h-index is named after Jorge Hirsch, a physicist at the University of California, San Diego (UCSD), who proposed the index to determine theoretical physicists’ relative quality. It is sometimes called the Hirsch index. According to Wikipedia, the h-index measures “both the productivity and citation impact of the publications of a scientist or scholar. The index is based on the set of the scientist’s most cited papers and the number of citations that they have received in other publications. The index can also be applied to the productivity and impact of a scholarly journal” (as is the case here).
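
Concretely, the h-index is the largest number h such that h publications each have at least h citations. The short sketch below computes it from a list of citation counts. The function name `h_index` and the sample counts are made up for illustration and are not Google Scholar data. The h5-index that Google Scholar reports applies the same calculation to articles published in the last five complete years.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Made-up citation counts for five papers.
example = [10, 8, 5, 4, 3]
print(h_index(example))  # 4: four papers have at least 4 citations each
```

Note that a single heavily cited paper does not raise the score by itself: `h_index([25, 8, 5, 3, 3])` is 3, because only three papers have at least three citations each and the fourth-ranked paper has fewer than four.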

Searching the Google Scholar site for Deep Learning resources returned the following list of the top 15 journals and conferences (the number to the right of each entry is the resource’s h5-index).  For comparison, Nature, the top-rated journal in the sciences, has an h5-index rating of 366.

The #1 Deep Learning resource in this list is the International Conference on Machine Learning (ICML), which takes place next month (Jul 10-15, 2018) in Stockholm, Sweden.

The #2 resource is the arXiv Machine Learning (stat.ML) archive of pre-press journal papers, hosted by the Cornell University Library.  This is an excellent collection of scholarly papers on topics related to machine learning.

The full list of Google Scholar’s top-15 resources follows:

1. International Conference on Machine Learning (ICML) – 91
2. arXiv Machine Learning (stat.ML) – 76
3. The Journal of Machine Learning Research – 73
4. Machine Learning – 37
5. European Conference on Machine Learning and Knowledge Discovery in Databases – 31
6. International Journal of Machine Learning and Cybernetics – 23
7. IEEE International Workshop on Machine Learning for Signal Processing – 19
8. International Conference on Machine Learning and Applications – 18
9. International Journal of Machine Learning and Computing – 16
10. International Workshop on Machine Learning in Medical Imaging – 12
11. Machine Learning and Data Mining in Pattern Recognition (MLDM) – 11
12. International Conference on Machine Learning and Cybernetics – 10
13. Asian Conference on Machine Learning – 10
14. Artificial Intelligent Systems and Machine Learning – 5
15. Transactions on Machine Learning and Artificial Intelligence – 5



MIT’s RoadTracer Uses Deep Learning to Generate Road Networks from Satellite Imagery

Image Credit: MIT CSAIL Group

The CSAIL group at the Massachusetts Institute of Technology (MIT) has improved the state of the art in inferring road networks from satellite imagery. Mapping roads is a time-consuming, tedious, and error-prone process that has traditionally relied on human input. OpenStreetMap (OSM) is the gold standard for cataloging road networks throughout the world, but it relies almost exclusively on human input. As a result, many areas have yet to be mapped, and because the data is provided by volunteers with a mixed bag of skill and attention to detail, it is not 100% accurate. For example, the city of Toronto produces a gold-standard road map, and recent studies indicate that the OSM version differs from it with an error rate of approximately 14%.

Previous attempts at using deep learning to infer road networks from satellite imagery have relied on a traditional Convolutional Neural Network (CNN) trained on a large number of labeled images to produce pixel-by-pixel classification of road (vs. non-road) pixels in an image.  This technique has achieved limited success with real-world imagery due primarily to varying lighting conditions and the many occlusions caused by trees, buildings, and shadows in satellite imagery that greatly complicate this process (even for human analysts).

The advancement made by the MIT engineers was to change from making pixel-by-pixel classifications to a new technique in which the CNN’s goal is reoriented to iteratively construct a graph of the road network directly from the imagery. As described in the paper “RoadTracer: Automatic Extraction of Road Networks from Aerial Images”, the MIT process “consists of a search algorithm, guided by a decision function implemented via a CNN, to compute the graph iteratively.  The search walks along roads starting from a single location known to be on the road network. Vertices and edges are added in the path that the search follows. The decision function is invoked at each step to determine the best action to take: either add an edge to the road network, or step back to the previous vertex in the search tree.”
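
The search loop described in that quotation can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the CNN decision function is replaced by a hand-written stand-in (`toy_decision`) that reads a tiny hard-coded grid, and every name in the block (`GRID`, `MOVES`, `toy_decision`, `trace_roads`) is hypothetical.

```python
from typing import Callable, List, Optional, Set, Tuple

Point = Tuple[int, int]

# Toy "image": 1 marks a road cell, 0 marks background.
GRID = [
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def toy_decision(current: Point, visited: Set[Point]) -> Optional[Point]:
    """Stand-in for the CNN: pick an unvisited road neighbour, or None to step back."""
    row, col = current
    for d_row, d_col in MOVES:
        nxt = (row + d_row, col + d_col)
        in_bounds = 0 <= nxt[0] < len(GRID) and 0 <= nxt[1] < len(GRID[0])
        if in_bounds and GRID[nxt[0]][nxt[1]] == 1 and nxt not in visited:
            return nxt
    return None


def trace_roads(
    start: Point,
    decide: Callable[[Point, Set[Point]], Optional[Point]],
) -> Tuple[Set[Point], List[Tuple[Point, Point]]]:
    """Grow a road graph from `start`, invoking `decide` at every step."""
    vertices = {start}
    edges = []
    stack = [start]  # path from the start to the current vertex
    while stack:
        current = stack[-1]
        nxt = decide(current, vertices)
        if nxt is None:
            stack.pop()  # step back to the previous vertex in the search tree
        else:
            vertices.add(nxt)  # add a new vertex and edge, then walk to it
            edges.append((current, nxt))
            stack.append(nxt)
    return vertices, edges


vertices, edges = trace_roads((0, 0), toy_decision)
print(len(vertices), "vertices,", len(edges), "edges")
```

The point of the structure is that the output is a graph built step by step from a known starting location, rather than a per-pixel mask that must be converted into a graph afterward.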

RoadTracer identifies 45% more road segments than the authors’ previous segmentation approach (see figure, above) and outperforms the previous state-of-the-art system by a wide margin. It would be interesting to see if the search algorithm could be improved by a Reinforcement Learning network, another technique that is gaining widespread prominence in the deep learning community.


Deep Reinforcement Learning for Navigation

A recent Nature article and an accompanying blog post by the paper’s authors describe how Andrea Banino et al. from DeepMind developed an artificial neural network to investigate how mammals use neural grid cells to perform vector-based navigation. The network they created spontaneously developed a grid-cell-like architecture similar to the hexagonal arrangement of grid cells in mammals (see image, above).

Grid cells were discovered by May-Britt Moser and Edvard I. Moser in 2005, building on John O’Keefe’s earlier discovery of place cells, and the three shared the 2014 Nobel Prize in Physiology or Medicine for this work. The DeepMind team developed and trained a Recurrent Neural Network (RNN) to maintain a sense of place within a virtual environment, using velocity vectors to guide movement through the artificial environment. The resulting RNN spontaneously recapitulated the hexagonal arrangement of grid cells, reflecting the structure of grid cells in the mammalian brain. The following diagram illustrates an unfolded view of an RNN, showing the recurrence inherent in these structures.

Image Source: Wikimedia Commons, Creative Commons License
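
The unfolded view corresponds to applying the same update at every time step, with each hidden state fed into the next. The sketch below unrolls a generic tanh RNN cell over a short sequence of 2-D velocity vectors in plain Python. It is a textbook-style illustration and not the network from the paper: the weights are untrained, and the names `rnn_step`, `unroll`, `W_h`, `W_x`, and `b` are assumptions made for this example.

```python
import math


def rnn_step(h_prev, x, W_h, W_x, b):
    """One recurrent update: h_t = tanh(W_h h_{t-1} + W_x x_t + b)."""
    h_next = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        total += sum(W_x[i][k] * x[k] for k in range(len(x)))
        h_next.append(math.tanh(total))
    return h_next


def unroll(velocities, h0, W_h, W_x, b):
    """Apply the same cell at every time step, carrying the hidden state forward."""
    states = [h0]
    for v in velocities:
        states.append(rnn_step(states[-1], v, W_h, W_x, b))
    return states


# Hypothetical, untrained weights: 2 hidden units, 2-D velocity input.
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_x = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
velocities = [(0.1, 0.0), (0.1, 0.2), (0.0, 0.2)]

states = unroll(velocities, [0.0, 0.0], W_h, W_x, b)
for t, h in enumerate(states):
    print(t, h)
```

Each entry in `states` depends on all earlier velocity inputs through the carried hidden state, which is what lets a trained network of this kind keep track of position over time.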

The DeepMind team then went a step further by creating a deep reinforcement learning agent to investigate whether the resulting artificial neural network was indeed capable of supporting navigation. As the DeepMind team explained, “This agent performed at a super-human level…and exhibited the type of flexible navigation normally associated with animals, taking novel routes and shortcuts when they became available”. An example demonstrating these abilities is illustrated below. In this example, the agent was trained in a maze with 5 doors when all but door #5 were closed (a). During testing, all doors were opened (b) and the agent successfully found shortcuts to the desired destination.

The deep reinforcement learning agent, trained in a maze with all but one door closed (left), found shortcuts during testing (right) when all doors were open.

In my opinion, the most interesting feature of this work is its use of two different neural network architectures in a single study:

  • An RNN to develop a model of a portion of a mammalian brain that spontaneously mimicked the grid-cell structure of actual mammalian brains, and
  • A deep reinforcement learning, agent-based system to explore how the resulting network enables velocity-vector-based navigation.