I was recently asked by Udacity to be a beta tester (and, subsequently, a mentor and project reviewer) for one of their newest course offerings: the Deep Reinforcement Learning NanoDegree (DRLND) program. This is a very interesting course with some fascinating projects. Students get to work with the same kinds of reinforcement learning algorithms that have recently made headlines (for example, AlphaGo, which beat the world's best professional Go players while demonstrating deep strategic play, and the Dota 2 RL agent that recently showed professional-level skill in an extremely complicated multi-player video game). Example projects in the DRLND program include training an RL agent that navigates a 2D field of play to collect yellow bananas, and building an actor/critic model that controls 20 independent 2-axis robot arms, positioning each end-effector within a target sphere that moves continuously in 3D space around the robot.
This newest Udacity course requires all projects to be written in Python using the PyTorch framework. This was my first opportunity to work with PyTorch, so I thought I would relay my experience and compare the advantages and disadvantages of PyTorch relative to TensorFlow as I see them.
- Sponsoring Companies: These two frameworks were developed by, and continue to be supported by, two of the biggest names in the AI industry today. TensorFlow was originally developed by Google Brain (the AI research group within the Google/Alphabet corporate structure), and Google not only continues to support the framework but has even begun to develop and field custom hardware (the Tensor Processing Unit, or TPU) specifically designed to execute TensorFlow graphs at high speed (competitive with the fastest Nvidia GPUs, which also have libraries customized for executing TensorFlow computation graphs). PyTorch was developed by Facebook, which has used it for many of the projects that help improve the user experience on its platform, including Facebook's face-recognition features. Both companies have released their respective frameworks as open-source projects in order to achieve the widest possible distribution and acceptance. In summary, I would rate this category a tie: both frameworks are backed by industry titans, each committed to the success and longevity of its framework.
- User Communities: TensorFlow wins in this category, hands down. As one of the original deep learning frameworks, TensorFlow has a very broad and experienced user community. The documentation is excellent and help sites such as StackOverflow are available to answer just about any question a user might have. PyTorch, however, might be thought of as the new kid on the block. The PyTorch user base is not as deep, the documentation is not as comprehensive, and getting answers to questions is not as easy as is the case with TensorFlow. Score one for TensorFlow.
- Static vs. Dynamic Computation Graphs: This category is by far the one that most differentiates the two frameworks' approaches. TensorFlow is built around a static computation graph: the API is designed to let the programmer define a complete graph that, once finished, can be handed off to a GPU (or a CPU library) for execution. Nearly every API call describes the nodes of this graph, how they are connected, how and in what format data will be fed in, how intermediate values will be updated during execution, and how the outputs will be delivered back to the function that triggered the graph's execution. The drawback of this approach is that essentially the entire architecture of your network must be defined and described through the TensorFlow API before you see any results. The framework also has a steep learning curve, with many new and potentially confusing abstractions (such as placeholders, variables, and the TensorFlow session concept) that must be mastered to achieve success. In contrast, PyTorch is based on a dynamic computation graph: the graph is constructed, and can be executed, statement by statement. This is a far more intuitive approach and feels very natural to Python programmers who are used to an interactive workflow. PyTorch's learning curve is not nearly as steep as TensorFlow's, and even though its documentation is less polished, most users will find the dynamic computation graph easier to use and master. Score one for PyTorch!
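A minimal sketch of the define-by-run style described above (this is an illustrative snippet I wrote for this comparison, not code from the course): because PyTorch builds the graph as each statement executes, ordinary Python control flow such as a `while` loop becomes part of the graph, something a static graph cannot express so directly.

```python
import torch

# Define-by-run: the graph is recorded as each statement executes,
# so ordinary Python control flow (loops, conditionals) just works.
x = torch.ones(3, requires_grad=True)   # leaf node of the graph
y = x * 2
while y.norm() < 10:                    # the graph grows with each iteration
    y = y * 2
loss = y.sum()
loss.backward()                         # traverse the recorded graph backward
print(x.grad)                           # gradients are available immediately
```

Here the number of multiplications depends on a runtime value (`y.norm()`), yet autograd still produces correct gradients, because the graph reflects exactly the statements that actually ran.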
- Ease of Development and Debugging: Beyond the natural flow that PyTorch's dynamic computation graph provides, there is another big advantage to this approach: ease of debugging. At any point while building the computation graph, the programmer is free to insert print statements between nodes (or anywhere else in the graph, really) to see what's going on. This is a great benefit when debugging one's code. Things are not nearly as simple with TensorFlow, and I can tell you from experience that when something goes wrong in a TensorFlow graph, it can be very difficult to track down the root cause. Another point for PyTorch.
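To make the debugging point concrete, here is a small hypothetical network (the `TinyNet` name and layer sizes are my own invention for illustration). Because PyTorch executes the graph eagerly, a plain `print` inside `forward()` fires on every pass, letting you inspect shapes and activations mid-graph with no special tooling.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A hypothetical two-layer network used only to illustrate debugging."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        # An ordinary print statement works mid-graph: it runs eagerly
        # on every forward pass, exposing intermediate values.
        print("hidden activations:", h.shape, h.mean().item())
        return self.fc2(h)

net = TinyNet()
out = net(torch.randn(1, 4))
print("output shape:", out.shape)
```

You could just as easily drop in a full `pdb` breakpoint at the same spot; the point is that mid-graph state is ordinary Python data you can inspect directly.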
Bottom Line: There are good reasons to use each of these deep learning frameworks. TensorFlow certainly represents baseline knowledge that every deep learning practitioner is expected to have. However, the more I use PyTorch, the more I like it, and I think it may be the future. My recommendation is to become familiar with both frameworks: know TensorFlow for foundational DL knowledge, and learn PyTorch for its ease of use and less troublesome debugging.