We do Deep-Learning.
May 24, 2018 • Noon van der Silk
Some of you may have dropped by our stand at Melbourne Knowledge Week (on Mother’s Day!) earlier this month (my mum did!):
If you came by, then thanks heaps! It was a really great feeling to be chatting to everyone about something we’ve worked hard on, mostly in our spare time, over the last few months. It was also fantastic to be amongst all the cool people from RMIT (thanks Kiri and Jessica!), who are working on lots of very interesting fashion projects. Hoping to collaborate on something with them soon!
We posted a few photos on our instagram, @silverpond_ml, from the day.
I wanted to take some time now to break down how the pieces fit together, and also provide a little bit of a summary of what people found on the day.
The Fashion Space Explorer
Firstly, the Fashion Space Explorer tool is online, permanently, here:
It’s entirely browser-based (including all of the deep learning), so it works best when you view it on a computer with a GPU. I think it’s also only supported on Google Chrome/Chromium at the moment, and it will take about a minute to load the first time.
In any case, those details aside, once it opens, you’ll be taken to a random point, and should see something like this:
There are three main areas:
- The “Design Canvas”,
- The “Landmarks”,
- and, the “Parameters”.
Design Canvas: This does two main things. First, it shows you what you’re currently looking at! Second, it’s a canvas upon which you can draw (using the mouse). The buttons on the side are:
- “R”: New random image,
- “C”: Clear the canvas,
- “W”: Change the brush to white,
- “B”: Change the brush to black.
Hover over the icons with the mouse for more information.
On the right side of this little section is the main “designed” image; in the middle, you can control which “print” is placed on the item.
Landmarks: This is a set of locations that we identified beforehand as containing interesting items. Note in particular the little blue diagram next to them, which I refer to as the “Signature Diagram”. The idea here is that this diagram is a visual shorthand representation of the 8 parameters that control the image. You can get a feeling for how this image relates to the settings of the parameters by considering the following picture:
Here I’ve drawn an indicative blue line that, when wrapped around the circle (anti-clockwise), gives the signature diagram!
Parameters: In this region you can have direct control over the parameters that influence the image that is shown. In particular, there are 8 parameters here, and each of them can be controlled via the slider. The images next to each slider show what the image would be if that particular parameter was set to the value at that region of the slider.
You can control these by dragging, or, if you’re on a fast enough computer, you can just hold down the numbers 1-8 to move them up, and Shift+1-8 to move them down.
The Programming Environment
Perhaps of mild interest to people is the suite of technologies that were involved in putting this together.
Specifically, the process was:
- Train some models in TensorFlow,
- Export the trained models to “deeplearnjs” (what is now TensorFlow.js),
- Deploy it to GitHub Pages.
Training the fashion-space model took only about 30 minutes on a single GTX 1080 Ti.
The Deep Learning Models Involved
The essential parts are:
- A GAN to learn to generate images like in the training data,
- A VAE so that we can sample, in a finite domain, easily from the trained model,
- A CPPN so that we can generate arbitrarily-large images that are smooth and not pixelated,
- A style-transfer network (in fact, fast style transfer) to apply the prints to the images.
The GAN, VAE, and CPPN are munged together as one model, in which we build an autoencoding network with a latent vector of size 8 (the number of parameters we allow the user to control!).
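To make the CPPN idea concrete, here’s a minimal numpy sketch (all weights here are random and purely illustrative; the real model learns them, and has the GAN/VAE machinery around it). Because each pixel’s (x, y) coordinate plus the 8-d z-vector is fed through a small network, the same design can be rendered at any resolution without pixelation:

```python
import numpy as np

def render_cppn(z, size, weights):
    """Render a size x size image from an 8-d z-vector by feeding each
    pixel's (x, y) coordinate plus z through a tiny fixed network."""
    coords = np.linspace(-1.0, 1.0, size)
    xs, ys = np.meshgrid(coords, coords)
    # One row per pixel: [x, y, z_1..z_8]
    inputs = np.column_stack([
        xs.ravel(),
        ys.ravel(),
        np.tile(z, (size * size, 1)),
    ])
    h = np.tanh(inputs @ weights["w1"])
    out = 1.0 / (1.0 + np.exp(-(h @ weights["w2"])))  # sigmoid -> [0, 1]
    return out.reshape(size, size)

rng = np.random.default_rng(42)
weights = {"w1": rng.normal(size=(10, 16)), "w2": rng.normal(size=(16, 1))}
z = rng.uniform(-1, 1, size=8)

# The same z-vector rendered at two sizes stays smooth: no pixelation.
small = render_cppn(z, 28, weights)
large = render_cppn(z, 112, weights)
print(small.shape, large.shape)
```

The key design point is that resolution is just a sampling density over the coordinate grid, which is exactly what lets the browser version render designs at whatever size it likes.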
A big part of the success of this project was how the people interacted with it. Andy built an amazing joystick setup that allowed people to move in the 8 available dimensions:
We found that this approach made it much easier for people to interact with the interface. Without it, the interaction was reasonably complicated: it either required knowledge of gaming controls (which a lot of adults don’t have) or an overly-complicated custom control scheme that no-one but me was familiar with.
The MKW Designs
So, after Melbourne Knowledge Week, we ended up with a few people posing for a photo for Instagram for a small competition (congrats to the winner!), but we had plenty of people play around with the joysticks, so I wanted to take a look at what kind of clothes they found. I grabbed the photos and here’s a sample (there were actually 1,574 images):
But what I really wanted was to see how well the people explored the entire space of available designs.
What I wanted to do was get a 2d view of my fashion space.
I picked 3,000 images at random from the training data, ran them through my model, obtained the z-vectors, and pushed them into the UMAP code like so:
Now, this is super-cool! One reason it’s very cool is that the network I trained had no direct incentive to think about classes. It wasn’t trained on the type of item it was looking at at all; it was just trained to be good at reconstruction (as well as the VAE part, which encourages it to be well-distributed). Even still, this model has become reasonably good at separating the classes.
For example, it would very rarely confuse a bag with anything else, and it would be quite good at distinguishing boots, sneakers, and sandals.
There’s a fair amount of overlap between shirt, coat, and pull-over, and this is clear in the kind of images that come out of the tool itself.
Now, let’s take a look where the MKW points fall on this map:
Cool! We can see that the MKW points (the purple ones) hit every class. We can also see, quite interestingly, that many points fall wildly outside the training set. In part this is probably due to the “hyperspace jump” button, which took people to a random point (here the z-vector was drawn uniformly at random from the range -1 to 1 in each dimension).
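For reference, the hyperspace jump is just uniform sampling over the latent box (a numpy sketch; the site itself does this in JavaScript, and the “training points cluster near the origin” comparison below is an illustrative assumption, not measured from the real encoder):

```python
import numpy as np

rng = np.random.default_rng(7)

# "Hyperspace jump": each of the 8 latent dimensions is drawn uniformly
# at random from [-1, 1], with no regard for where the training data
# actually lives in the latent space.
z_jump = rng.uniform(-1.0, 1.0, size=8)

# Hypothetical stand-in for VAE-encoded training points, which tend to
# cluster near the origin; a uniform draw will often land further out.
z_train = rng.normal(0.0, 0.3, size=(1000, 8))

print(np.linalg.norm(z_jump))
print(np.linalg.norm(z_train, axis=1).mean())
```

That mismatch is one plausible reason the purple MKW points spill so far outside the training-set region of the map.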
This is all well and good, but I wanted this plot to be interactive, so I could mouse-over the points and see the images, and more. So, I played around with Vega-Lite (I’m not sure I’d use Vega-Lite again, but at least I got it working, with some mild hacking), and I managed to get this graph embedded in the Fashion Space Explorer itself:
It’s cool because now you can mouse-over points and see the reconstructed image for that dataset, and you can click on the points and load them up in the fashion space explorer!
To get to this mode of the interface, the show2d param needs to be set.
Overall I’m quite happy with how this whole thing turned out. Clearly, the images have a long way to go before they are anywhere near the kind of quality that we’d like; but I do think this is a generally-interesting way to collaborate creatively with a machine. Heaps of inspiration has come out of this for us, such as:
- “Can we get the clothes on people using ML?”
- “Can we design richer objects?”
- “What new GAN-capabilities would we need to design more detailed items?”
- “How do we decide which model produces the ‘best-looking’ things?”
- “How can we run several AI models in the browser efficiently?”
- and very importantly: “How to design collaborative AI interfaces?”
I want to thank everyone at Silverpond who put in heaps of hard work and moral support for this project, but specifically highlight Susie and AndyG for doing a great deal to make this possible and engaging for people.
Finally, if you’re inspired by this or in any way interested, please do get in contact with me: firstname.lastname@example.org.