Google Brain, the company’s deep learning research center, just unveiled a major breakthrough in sharpening pixelated images. What’s that, you ask? Well, think of it as the “zoom and enhance” technique you’ve seen in so many movies: a security camera picks up a blurry image and some computer genius clears it up. Google has now developed something like it, and calls it “pixel recursive super resolution.”

Google basically tried to enlarge low resolution photographs and recover a plausible high resolution version of them. But because the input image doesn’t really contain that much information, there are usually many plausible high resolution images that could match it.

A super resolution model must account for the complex variations of objects, viewpoints, illumination, and occlusions, especially as the zoom factor increases.

Realistic, high resolution images are only possible when hard decisions are made about the type of textures, shapes, and patterns present in different parts of an image. Google just had to figure out how to make those decisions, and neural networks were the way to go.


Google’s new upscaling method (NN) versus the previously existing one (Bicubic).

If you feel like diving into every little detail of this research: by all means. But thanks to some talented science translators at Ars Technica, we’re able to give you a comprehensible summary.

First, Google had a conditioning network map the 8×8 source image against other high resolution images: it downsized those to 8×8 and tried to make a match. Then, Google had a prior network use an implementation of PixelCNN to add realistic high resolution details to the 8×8 source image. The prior network had absorbed information from a large number of real high-res images, and was therefore able to add new pixels to the image it wanted to “blow up,” based on what it “knows” about that type of image.
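The core loop can be sketched as follows. This is a heavily simplified toy, not Google’s implementation: the two “networks” below are hypothetical stand-in functions, and the only idea taken from the paper is that each pixel is generated in raster order by combining the two networks’ per-pixel logits and sampling from the resulting distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def conditioning_logits(lowres, y, x):
    # Stand-in for the conditioning network: in the paper this is a CNN
    # that sees the whole 8x8 input. Here we just center probability
    # mass on the nearest low-res pixel's value (4x upscale: 8x8 -> 32x32).
    v = lowres[y // 4, x // 4]
    return -0.5 * ((np.arange(256) - v) / 8.0) ** 2

def prior_logits(canvas, y, x):
    # Stand-in for the PixelCNN prior: it may only look at pixels that
    # were already generated (above / to the left). This toy version
    # simply favors continuity with the left neighbor.
    left = canvas[y, x - 1] if x > 0 else 128
    return -0.5 * ((np.arange(256) - left) / 16.0) ** 2

def super_resolve(lowres, size=32):
    canvas = np.zeros((size, size), dtype=np.int64)
    for y in range(size):
        for x in range(size):
            # Add the two logit vectors, softmax, then sample the pixel.
            p = softmax(conditioning_logits(lowres, y, x)
                        + prior_logits(canvas, y, x))
            canvas[y, x] = rng.choice(256, p=p)
    return canvas

lowres = rng.integers(0, 256, size=(8, 8))
hires = super_resolve(lowres)
print(hires.shape)  # (32, 32)
```

Because each pixel is sampled rather than averaged, running this twice gives two different but individually plausible enlargements, which is exactly why the model can produce sharp detail where bicubic scaling only produces blur.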

Still with us? Here’s an example by Ars Technica:

If there’s a brown pixel towards the top of the image, the prior network might identify that as an eyebrow: so, when the image is scaled up, it might fill in the gaps with an eyebrow-shaped collection of brown pixels.

Not every image ended up resembling the actual ground truth image, but they did look like realistic images, and that is the actual breakthrough. Researchers showed a test panel enlarged images alongside real images and asked which one they thought had not been artificially altered. Ten percent picked the generated image when comparing pictures of human faces, and 28 percent did when comparing pictures of bedrooms. To put that in perspective: 50 percent would mean the generated images were indistinguishable from real ones, but… up until now, existing upscaling techniques (bicubic scaling) fooled zero percent.


The best and worst images in the panel study. Fractions below the images denote how often participants chose that image over the ground truth.

[via Ars Technica]