Every day, it seems, researchers are solving problems that most of us didn’t realize we had. Now, courtesy of Yijun Li, a Ph.D. candidate at UC Merced, and his collaborators in the Vision and Learning Lab, we have a solution to “photorealistic image stylization.”
The team sought to create software that could, when fed a “content” image and a “style” image, project the aesthetic of the second onto the content of the first. The content of the first image would remain the same, simply restyled in whatever colors and lighting are found in the “style” reference photo. Further, this was all to be done quickly, and without leaving signs that the image had been manipulated. As the team sums it up:
“For a faithful stylization, the content in the output photo should remain the same, while the style of the output photo should resemble the one of the reference photo. Furthermore, the output photo should look like a real photo captured by a camera.”
They recently announced their success, formally submitting their paper, “A Closed-Form Solution to Photorealistic Image Stylization,” on Feb. 19, 2018, concluding that not only was their method 60 times faster than preexisting ones, but that their resulting images were also preferred 62% of the time by human viewers, “significantly outperform[ing]” competing systems.
What made their approach unique was its use of a two-step system.
First, it employs the whitening and coloring transform (WCT), a technique used, for example, to make photographs appear as if they were painted. As Vignesh Ungrapalli, a software engineer at NVIDIA, explains, this is itself a two-step process: a “whitening” step first strips the content image of its own style, leaving its features drained of color correlations, and a reverse “coloring” step then imposes the statistics of the “style” reference, transforming the drained image back into one carrying the new colors.
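To make those two steps concrete, here is a minimal NumPy sketch of a whitening and coloring transform applied to feature vectors. This is an illustration of the general technique, not the team’s actual implementation; the function name, the `(C, N)` feature layout, and the `eps` regularizer are my own assumptions.

```python
import numpy as np

def wct(content_feat, style_feat, eps=1e-5):
    """Whitening and coloring transform (sketch).

    content_feat: (C, N) array — C feature channels at N spatial positions.
    style_feat:   (C, M) array from the style reference.
    """
    # --- Whitening: remove the content features' own style statistics ---
    c_mean = content_feat.mean(axis=1, keepdims=True)
    fc = content_feat - c_mean
    c_cov = fc @ fc.T / (fc.shape[1] - 1) + eps * np.eye(fc.shape[0])
    c_vals, c_vecs = np.linalg.eigh(c_cov)
    whitened = c_vecs @ np.diag(c_vals ** -0.5) @ c_vecs.T @ fc

    # --- Coloring: impose the style features' statistics instead ---
    s_mean = style_feat.mean(axis=1, keepdims=True)
    fs = style_feat - s_mean
    s_cov = fs @ fs.T / (fs.shape[1] - 1) + eps * np.eye(fs.shape[0])
    s_vals, s_vecs = np.linalg.eigh(s_cov)
    colored = s_vecs @ np.diag(s_vals ** 0.5) @ s_vecs.T @ whitened
    return colored + s_mean
```

After the transform, the output features carry the style image’s mean and covariance while preserving the spatial arrangement of the content features.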
However, this by itself is often not enough to achieve photorealism, so the team employs a second, “smoothing” algorithm “for suppressing [the] structural artifacts” created during WCT. This step is less intuitive than the first, but the images they provide help explain what is happening.
Figure (c) shows the image immediately after the WCT step has completed. As you can see, the sky is a blotchy mix of white and blue, a pattern real skies rarely, if ever, take on. To make the image photorealistic, then, the algorithm needs to even out these blotches, which is where the “smoothing” comes in. Guided by the principle that “pixels with similar content in a local neighborhood should be stylized similarly,” it evens out sharp differences between, for example, the navy blue and white patches that make up the sky. The result is an image that does in fact appear real, even after having taken on the aesthetic aspects of another.
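As a toy illustration of that principle (and not the paper’s actual closed-form solver), the sketch below averages each stylized pixel with its neighbors, weighting every neighbor by how similar the corresponding *content* pixels are, so regions that look alike in the content image end up stylized alike. The function name and parameters are my own assumptions.

```python
import numpy as np

def content_guided_smooth(stylized, content, radius=2, sigma=0.1):
    """Smooth a stylized image using the content image as a guide.

    stylized, content: (H, W) grayscale arrays with values in [0, 1].
    Each output pixel is a weighted average of its neighborhood in the
    stylized image, where neighbors whose *content* values are similar
    get higher weight — a crude version of "similar content, similar style".
    """
    H, W = content.shape
    out = np.zeros_like(stylized)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            # Gaussian weight on content similarity: alike pixels pull
            # their stylized values toward each other, evening out blotches.
            diff = content[y0:y1, x0:x1] - content[y, x]
            w = np.exp(-(diff ** 2) / (2 * sigma ** 2))
            out[y, x] = (w * stylized[y0:y1, x0:x1]).sum() / w.sum()
    return out
```

On a flat-content region this reduces to simple local averaging, which is exactly what a blotchy sky needs; across a genuine content edge the weights collapse, so real boundaries survive.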
What this will all be used for in practice remains to be seen, but the code is open source and available on GitHub, along with numerous examples of it at work. One real estate agent on Reddit hinted at a possible use when they asked whether it could make a house appear as if it were photographed at dusk.
Here’s a video explaining the process as it was performed prior to the current paper:
Meanwhile, Redditor whodoyoucall testifies that it does in fact work “in a few seconds,” and while it’s “not as great as I think you all think it is,” he declares, “it’s pretty cool!” He also warns that it currently doesn’t work on images bigger than 600kb.
Some more examples from GitHub:
Feature Image Courtesy Chris Ried