Generative Adversarial Networks (GANs)

Two interesting papers were published this week demonstrating some remarkable progress in the use of GANs to produce and enhance imagery that is surprisingly realistic.  The first paper, from Nvidia (  – a company that produces GPUs that are often re-purposed to train deep learning networks, was trained to generate images of celebrities that look very realistic, as the following image demonstrates.  These are not pictures of real people, but rather were generated by Nvidia’s implementation of a GAN network.


The second paper, by a group at the Max Planck Institute (, demonstrates a machine learning technique to take a blurry image and produce a sharpened image that ends up being very similar to the original from which the blurry image was made.  The first image, below, is a low-resolution image of a bird on a branch.  The middle image is the one produced by the author’s GAN network.  Compare the generated image to the original high-resolution image on the right.  They are, indeed, very similar.


GANs are composed of two adversarial networks, the first network (the Generator) produces images that attempt to “fool” the second network into believing they are “real”.  The second network tries its best to discriminate between real images and those produced by the Generator.  Through multiple rounds of back-and-forth (like an arms race in AI land), the generator gets more and more capable of generating images that fool the discriminator and, as a result, generate images that appear to be very real.  Traditional GANs start with an image consisting of random noise, and iteration by iteration, the generator begins to morph these noisy images into something that shares characteristics from the dataset that the discriminator uses to distinguish between real and artificially-generated images.

The Nvidia team used this traditional method, training its discriminator GANs on a database of celebrity images.  The Max Planck team, by contrast, started with the blurry image of a sample from the discriminator training set and the generator, using this blurry image as the seed, began to iteratively refine the image until it was virtually indistinguishable from the original.  It is conceivable that a large training set of domain-specific imagery might be capable of similarly improving the resolution of new images that were not in the original training set, but this is a topic for future research.

Applications of this technology might include, for instance, the processing of old imagery or movies shot in low resolution and re-issuing them to today’s audience as a high-def facsimile.  Or, adding capabilities to tools such as Adobe Photoshop that photographers could use to improve the quality of their own images.

About David Calloway

Hi! I'm David Calloway, the author of this blog on deep learning and artificial intelligence. I first started working with neural networks in the mid-80's, before the "dark winter" of neural networking technologies. I graduated from the U.S. Air Force Academy in 1979 with B.S. degrees in Physics and Electrical Engineering. In 1982, I received an MS degree in Electrical Engineering from Purdue University where I worked on early attempts at speech recognition. In 2005, I obtained another M.S. degree, this time in Biology from the University of Central Florida. My interest in neural networks and deep learning was rekindled recently, when I got involved in a project at Nova Technologies where I am using deep learning and TensorFlow to recognize and classify objects from satellite imagery.
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s