
Byte Sized Tech Review — AI and Upsampling of Images

 

Since the resurgence of machine learning in 2012, there has been a wave of research into using deep learning for image processing. There are some very interesting use cases, like ML denoising or GANs for face generation, but another use case we’re particularly excited about is super-resolution of images. In principle, this technique could let you take tiny pixelated images from renders or references and up-sample them to crisp 4K resolution.

But enough words, let’s see how it all works!

I used this framework to generate the following images: https://letsenhance.io/boost

Target Image: (Ground truth)

Output 1: Bilinear Interpolation: 2048 x 1024 (Photoshop Default)

Pixelated Image: 512 x 256 (What I input into the upsampling program)

Output 2: 2048 x 1024 ML Upsampling of Image

As shown above, ML super-resolution can retain some detail that is otherwise lost in conventional methods like bilinear up-sampling. There is a trade-off, though: depending on the network, some high-resolution details can be distorted in odd ways (for example, it doesn’t seem to want to up-sample this poor woman’s face).
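If you want to reproduce the conventional side of this comparison yourself, the pixelated input and the bilinear baseline only take a few lines of Python with Pillow. This is a minimal sketch with placeholder file names; the ML result above came from the hosted letsenhance.io service, so there is no local call for that step.

from PIL import Image

# Placeholder file name: substitute your own ground-truth render (2048 x 1024).
ground_truth = Image.open("ground_truth_2048x1024.png")

# 1. Make the pixelated low-res input that gets fed to the upsampler.
low_res = ground_truth.resize((512, 256), resample=Image.BILINEAR)
low_res.save("input_512x256.png")

# 2. Conventional baseline: bilinear up-sampling back to full size,
#    roughly what Photoshop's default resample does.
bilinear = low_res.resize((2048, 1024), resample=Image.BILINEAR)
bilinear.save("output_bilinear_2048x1024.png")

Compare the bilinear output with the ML output side by side to see where each one keeps, loses, or invents detail.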

Original Image (Ground Truth): 1024 x 512

Default Photoshop Bilinear Up-Sampling: 1024 x 512

Scaled Down, Pixelated Version of Image: 512 x 256

ML Up-sampled Image: 1024 x 512

In this case, the machine learning up-sampling seems to only slightly outperform the default Photoshop up-sampling. The failure in this case segues nicely into the main reason we’re so excited about machine learning up-sampling.

The neural network used in both of the examples above was trained on a general set of images, so it performs best on inputs that resemble its training data. The fact that it handles the close-up of a house and car better than the zoomed-out image of a house and river suggests that its training set contains more close-ups of houses and cars.

The good news is that improving such an ML up-sampling system for a specific use case can be surprisingly straightforward: the researchers building the network can simply change the dataset it is trained on and, in theory, get better results for that use case. For example, a super-resolution network trained on millions of architectural images should perform best on architecture and related imagery. The main bottleneck is collecting such a dataset for training. The interesting thing is that, as far as I can tell, the datasets needed for such an undertaking already exist. Architecture firms thrive on documentation and organization, and I wonder what machine learning researchers would do if they knew that organized, dated renders from thousands of building projects are simply sitting on the servers of architecture firms, waiting for their turn in a machine learning project.
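To make that concrete, here is a rough sketch of what domain-specific training can look like. The folder name and the tiny network are purely illustrative assumptions (this is not the architecture behind letsenhance.io or any particular paper); the point is that training pairs are synthesized by downscaling whatever high-resolution images you already have, so an archive of architectural renders is, by itself, a training set.

import glob
import random

import torch
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

# Hypothetical folder of high-resolution architectural renders;
# any directory of images at least PATCH pixels on each side will do.
HR_IMAGES = glob.glob("architecture_renders/*.png")
SCALE = 4      # e.g. 512 x 256 -> 2048 x 1024, as in the first example
PATCH = 128    # high-res crop size used for training

to_tensor = transforms.ToTensor()

def random_training_pair():
    """Crop a random HR patch and synthesize its LR counterpart by downscaling."""
    img = Image.open(random.choice(HR_IMAGES)).convert("RGB")
    x = random.randint(0, img.width - PATCH)
    y = random.randint(0, img.height - PATCH)
    hr = img.crop((x, y, x + PATCH, y + PATCH))
    lr = hr.resize((PATCH // SCALE, PATCH // SCALE), resample=Image.BICUBIC)
    return to_tensor(lr), to_tensor(hr)

class TinySR(nn.Module):
    """Toy SRCNN-style model: upscale bilinearly, then learn a residual correction."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 3, 5, padding=2),
        )

    def forward(self, lr):
        upscaled = F.interpolate(lr, scale_factor=SCALE, mode="bilinear",
                                 align_corners=False)
        return upscaled + self.body(upscaled)

model = TinySR()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(1000):
    lr_batch, hr_batch = zip(*[random_training_pair() for _ in range(8)])
    lr_batch, hr_batch = torch.stack(lr_batch), torch.stack(hr_batch)
    loss = F.l1_loss(model(lr_batch), hr_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Swap the folder for a general photo collection and you get a general-purpose upsampler; swap in renders and you bias it toward architecture. The data is the lever.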

By Brian Aronowitz, LAB Technologist


More links for experimentation: 

https://topazlabs.com/gigapixel-ai/  
https://letsenhance.io/boost  

 

The technique, if you want to know more:

The research keywords generally revolve around “upsampling”, “super-resolution”, and “deep learning”; just plugging those into Google should turn up good results.

More reading and GitHub links for the super curious:

https://towardsdatascience.com/deep-learning-based-super-resolution-without-using-a-gan-11c9bb5b6cd5 

https://github.com/idealo/image-super-resolution 
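The second repo above (idealo/image-super-resolution) ships pretrained weights you can run locally. Roughly the following is what its README describes; check the README for the current install steps and weight names, since both may have changed.

import numpy as np
from PIL import Image
from ISR.models import RDN

# Load the low-resolution input as a numpy array.
lr_img = np.array(Image.open("input_512x256.png"))

# 'psnr-small' is one of the pretrained weight sets the repo provides;
# see its README for the full list.
rdn = RDN(weights="psnr-small")
sr_img = rdn.predict(lr_img)

Image.fromarray(sr_img).save("output_isr.png")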

 