Nvidia GauGAN takes rough sketches and creates 'photo-realistic' landscape images

The AchieVer


Using segmentation maps and a new deep-learning model, GauGAN can create fairly realistic images.



Researchers at Nvidia have created a new generative adversarial network model for producing realistic landscape images from a rough sketch or segmentation map, and while it's not perfect, it is certainly a step towards allowing people to create their own synthetic scenery. 


The GauGAN model is initially being touted as a tool to help urban planners, game designers, and architects quickly create synthetic images. The model was trained on over a million images, including 41,000 from Flickr, with researchers stating it acts as a "smart paintbrush" as it fills in the details on the sketch.


"It's like a colouring book picture that describes where a tree is, where the sun is, where the sky is," Nvidia vice president of applied deep learning research Bryan Catanzaro said. "And then the neural network is able to fill in all of the detail and texture, and the reflections, shadows and colours, based on what it has learned about real images." 

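To make the "colouring book" idea concrete, here is a minimal sketch of what a segmentation map could look like as data. It is an illustration only: the article does not describe GauGAN's actual input format, so the label ids, the `LABELS` table, and the helper names `one_hot` and `label_fractions` are all assumptions. One-hot masks are a common way to hand such a map to an image-synthesis network, but nothing here is taken from Nvidia's code.

```python
# Hypothetical label ids; the real GauGAN label set is not given in the article.
LABELS = {0: "sky", 1: "tree", 2: "water"}

# A tiny 4x6 "sketch": each cell holds the label id painted at that pixel.
segmentation_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [2, 2, 2, 2, 2, 2],
]


def one_hot(seg_map, num_labels):
    """Turn an H x W grid of label ids into num_labels binary H x W masks."""
    return [
        [[1 if cell == label else 0 for cell in row] for row in seg_map]
        for label in range(num_labels)
    ]


def label_fractions(seg_map, labels):
    """Report what share of the sketch each label covers."""
    total = sum(len(row) for row in seg_map)
    return {
        name: sum(row.count(label) for row in seg_map) / total
        for label, name in labels.items()
    }


masks = one_hot(segmentation_map, len(LABELS))
fractions = label_fractions(segmentation_map, LABELS)
```

The map says only where each kind of object is; the detail, texture, reflections and shadows that Catanzaro mentions would have to be supplied by the trained model.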

In a demonstration to journalists at Nvidia's GTC conference on Monday, the researchers showed GauGAN in action: rendering images in real time, switching the styling between different seasons, and showing how water reflected and interacted with the landscape.


The machine used for the task contained a recently released Titan RTX, but Catanzaro said it could be possible to run the same application on a CPU if the image were rendered only once every few seconds, or on demand.



"This technology is not just stitching together pieces of other images, or cutting and pasting textures," Catanzaro said. "It's actually synthesising new images, very similar to how an artist would draw something." 


In a research paper to be given as an oral presentation at the CVPR conference in June, the researchers said human testing via Mechanical Turk showed GauGAN's images were preferred to those generated by the CRN, pix2pixHD, and SIMS algorithms, although in the category of cityscapes it only narrowly beat the latter two techniques. Compared to other algorithms, Catanzaro said, GauGAN had a better vocabulary and required fewer parameters.


At the end of 2018, a team of researchers including Catanzaro presented a paper on predicting future frames of video for synthesised city scenes. 


Nvidia also used generative adversarial networks to create artificial brain MRI imagery, to help overcome a lack of brain imagery to train networks on. 


"Diversity is critical to success when training neural networks, but medical imaging data is usually imbalanced," Hoo Chang Shin, a senior research scientist at Nvidia, explained to ZDNet in September. "There are so many more normal cases than abnormal cases, when abnormal cases are what we care about, to try to detect and diagnose."






