- Algorithms go to Hollywood in new Kristen Stewart film
- Computational technique redraws a scene in a predetermined artistic style
- Amazon GPUs enabled high-resolution impressionistic image transfers
An artist holds a brush in her hand. Sweeping it across the canvas, she uses stroke, color, and texture to share a small part of what she feels and sees inside her own head.
A powerful image can convey a wealth of emotion and introduce the viewer to a new perspective, a new way of viewing the world. But sometimes it’s just the beginning.
In Come Swim, a short film from Starlight Studios, a single image hand-drawn by director Kristen Stewart became the starting point for a novel combination of technology and artistry.
To realize her vision, Stewart and her collaborator, Bhautik Joshi, used Neural Style Transfer (NST), a recently developed technique that uses neural networks to redraw an image in the style of a source image.
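For readers curious about what "style" means to such an algorithm, the sketch below illustrates the core idea of the published Gatys approach: style is summarized by the correlations between feature channels (a Gram matrix), which ignore where in the image each feature occurs. This is a toy illustration, not the film's pipeline. The function names and hand-written feature lists are assumptions made for the example, the normalization is simplified, and a real system would take its features from a trained convolutional network.

```python
# Toy sketch of a Gram-matrix style loss (pure Python, no real network).
# Assumption: "features" is a list of channels, each a flat list of
# activations over the same spatial positions.

def gram_matrix(features):
    """Channel-by-channel correlations, averaged over spatial positions."""
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]

def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(generated)
    s = gram_matrix(style)
    c = len(g)
    return sum(
        (g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c)
    ) / (c * c)

# Two channels over two positions; the second set has the same values
# with the spatial positions swapped.
style_features = [[1.0, 2.0], [3.0, 4.0]]
shuffled_features = [[2.0, 1.0], [4.0, 3.0]]
different_features = [[1.0, 0.0], [0.0, 1.0]]
```

Because the Gram matrix averages over positions, `style_loss(shuffled_features, style_features)` is zero even though the layout changed, while `different_features` gives a positive loss. That is why a redrawn frame can take on the texture and color of a painting without copying its composition.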
The ultimate mash-up
Joshi, a research engineer at Adobe, has long been fascinated by the intersection of art and science. His popular online video “2001: A Picasso Odyssey” redraws scenes from the famous film in the style of Picasso.
The video was Joshi’s first attempt at taking two thematically and artistically divergent images and combining them using an algorithm developed by Leon A. Gatys. The video caught Stewart’s eye, and she recruited him to help create a poetic, impressionistic style for Come Swim.
“Kristen was a great collaborator,” says Joshi. “Working with her exacting eye and intuition for effects and how to achieve the look she wanted was an amazing experience.”
From canvas to film
At Stewart’s request, Joshi used NST to redraw key scenes in the movie in the style of her original painting. But his first attempts at transforming the images produced a false sense of success.
“Seeing an image redrawn as a painting is compelling enough that nearly any result seems passable,” says Joshi. “But repeated viewings factored out novelty and focused on our real metric of success — is the look in service of the story?”
Because of the huge number of creative possibilities, the team initially concentrated on narrowing that number. They began by focusing on a single, representative frame from the film (of a man on a bed underwater), experimenting with color, texture, and other parameters for style.
“We explored mapping what we wanted to achieve creatively to what was possible to do algorithmically,” says Joshi. “This mapping wasn’t always straightforward, and we went through a series of hybrid creative/technical iterations before converging on the final look.”
Swim in the cloud
Joshi processed the first iterations of NST on a laptop using a low-resolution cellphone photograph of the original painting. Later iterations, however, required a well-lit, high-resolution photograph that produced results more faithful to the textural complexities of the original painting.
But with higher-resolution images came a higher demand for computing power. The final images for both scenes in the film required 500 hours of compute time on an Amazon Web Services (AWS) EC2 G2 instance. (And that’s not counting the many earlier iterations it took to reach the look they wanted.)
Optimized for graphics-intensive applications, the AWS G2 instances (virtual servers) boast NVIDIA GPUs, each with 1,536 CUDA cores and 4 GB of video memory. But even with the power and convenience the G2 instance offers at low cost, the team could only produce images 1,024 pixels wide, whereas Stewart needed 2,048-pixel images.
Joshi’s tactic was to upscale images to 2,048 pixels and de-noise aggressively, arriving at a solution consistent with the impressionistic, sketchbook look Stewart desired.
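As a rough illustration of the upscale-then-denoise idea, here is a minimal sketch on a tiny grayscale grid. The article does not say which resampling or denoising filters were used, so the nearest-neighbour upscale and 3x3 mean filter below, along with the function names, are assumptions chosen for simplicity rather than a description of the production pipeline.

```python
# Toy sketch: double an image's width and height, then smooth it.
# Assumption: an image is a list of rows of grey values.

def upscale_2x(img):
    """Nearest-neighbour upscale: every pixel becomes a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def box_denoise(img):
    """3x3 mean filter; edge pixels average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

small = [[0, 10], [20, 30]]
big = upscale_2x(small)      # 2x2 grid becomes 4x4
smooth = box_denoise(big)    # blocky edges are averaged away
```

Doubling the width from 1,024 to 2,048 pixels quadruples the pixel count, so the upscaled frame carries no new detail. Smoothing hides the blockiness at the cost of sharpness, a trade-off that suits a soft, painterly look.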
Joshi points out that every year, before the televised Academy Awards ceremony, there is a smaller ceremony for the ‘Sci-Tech Oscars’ celebrating technological innovations. “I think in a few years we’re going to see a flurry of Academy Awards centered around cloud and on-demand computing for film,” he says.
Leap of faith
The use of NST in a production context is still new, but it will eventually become commonplace, and will then surely be surpassed as technologically enhanced artistry evolves.
“Next time, I’d like to decouple the techniques we were using from the creative iterations,” says Joshi. “The way we set it up, it became difficult to unwind the toolset from the pipeline, so if we wanted to try out a different implementation of NST it became difficult to do so.”
Having tinkered with computerized art since he was in high school, Joshi has made steady progress into the mainstream cinematic workspace. Thanks to Stewart’s bold vision, Joshi’s pace has accelerated, as has the acceptance of neural networks as a way to augment filmmaking.
“I’d like to call out Kristen especially for taking the leap of faith needed to use a totally new computational artistic technique and see it through all the stages of production,” says Joshi.