I was at ExCeL earlier this week for Microsoft’s annual Future Decoded event. Future Decoded’s a combination of big-picture keynote speeches – Internet of Things, quantum computing, artificial intelligence – and focused talks on current and future Microsoft technology like ASP.NET 5, Windows 10 and the new Roslyn compiler infrastructure. It’s always an excellent event, but something that really jumped out at me this year was a talk by Chris Bishop from Microsoft Research about Project Oxford, a set of AI services for dealing with speech, natural language – and human faces. As you can appreciate, human faces are a hugely important part of casting. From 10×8″ headshots to online portfolios, a performer’s photographs have always been an essential part of any sort of casting service, and Spotlight is no different.
We humans are sociable animals, and one of the things we are astonishingly good at is recognising each other’s faces – our parents, our friends, celebrities, even the grainy photocopies in the picture round of your local pub quiz. This capacity to detect and recognise faces is vital to our social groups and communities, and accurate face recognition has long been one of the holy grails of artificial intelligence research. Over the last decade, there have been some remarkable developments in the areas of computer vision associated with human faces.
First, there’s face detection – analysing a photograph and working out whether there are any people in it. Like this example from Apple’s iOS libraries:
When I visited Japan in 2007, Sony were proudly showing off a cutting-edge digital camera that would detect human faces and adjust the autofocus so that your subjects’ faces would be in focus – very cool, very innovative, very expensive. Eight years later, most of us have a phone in our pocket that can do face detection via a built-in camera, and if it doesn’t, Facebook will detect the faces when you upload your photographs.
So… what’s next? The really exciting thing – certainly from a casting perspective – is face recognition, and being able to measure similarity between faces. How many casting briefs have you seen looking for someone to play a historical figure, or brothers and sisters of a character who’s already been cast? Or those breakdowns looking for a “Kate Winslet type” or a “Michael Fassbender type”?
Among the technologies Microsoft demonstrated at ExCeL on Wednesday was Project Oxford’s “similar face search” capability. It’s available via an HTTP API from Microsoft Research, but they’ve also put together this rather neat demo called TwinsOrNot.net. So I decided to kick it around a bit and see what it can do – and, since this is Spotlight, I’ve tried it out on a couple of castings to see how well Project Oxford thinks these performers match the people they’re portraying.
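Calling the HTTP API is just a matter of POSTing JSON with a subscription key. As a sketch only – the exact endpoint URL, field names and header are assumptions here, so check Microsoft’s own Face API documentation before relying on them – a similar-face request might be constructed like this:

```python
import json
import urllib.request

# Assumed endpoint and header names, modelled on Microsoft's
# published Face API conventions -- verify against the real docs.
ENDPOINT = "https://api.projectoxford.ai/face/v1.0/findsimilars"
SUBSCRIPTION_KEY = "your-subscription-key"

def build_similar_face_request(face_id, candidate_face_ids, max_results=5):
    """Construct (but do not send) an HTTP request asking which of the
    candidate faces most resemble the query face. Face IDs would come
    from an earlier face-detection call."""
    payload = {
        "faceId": face_id,
        "faceIds": candidate_face_ids,
        "maxNumOfCandidatesReturned": max_results,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
        method="POST",
    )

req = build_similar_face_request("query-face-id", ["id-1", "id-2", "id-3"])
print(req.get_method())  # POST
```

Sending the request with `urllib.request.urlopen(req)` would return a JSON list of candidate faces ranked by similarity – which is essentially what TwinsOrNot.net is doing behind its web front end.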
That’s Morgan Freeman, who portrayed Nelson Mandela in “Invictus”; Tom Hanks playing Walt Disney in “Saving Mr Banks”; Helen Mirren playing Elizabeth II in “The Queen” – and Sacha Baron Cohen, who was in talks to play Freddie Mercury in a Queen biopic, although it was confirmed in 2013 that Baron Cohen was no longer involved, and the project is now on hold.
Of course, making a fun online technology demo is one thing; to actually turn this kind of technology into a usable casting tool is still some way off. For starters, the processing power involved in this kind of analysis is considerable – there are nearly a quarter of a million performer photographs in Spotlight’s database, so to analyse our whole data set for similarity would mean analysing over thirty billion pairs of photographs, and Microsoft’s beta programme is currently limited to 5,000 requests per month. But not long ago, this kind of stuff wasn’t just expensive, it was actually impossible, and with the cost of computation halving every eighteen months, it won’t be long before this kind of research opens up a whole new range of possibilities for digital casting tools.
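A quick back-of-the-envelope calculation, using the figures from the paragraph above, shows just how far out of reach an exhaustive all-pairs comparison is under the beta quota:

```python
# Scale check: comparing every photograph against every other one,
# assuming roughly 250,000 photographs and a quota of 5,000
# API requests per month (both figures from the text above).
photos = 250_000
pairs = photos * (photos - 1) // 2      # unordered pairs of photos
requests_per_month = 5_000

months_needed = pairs / requests_per_month
print(f"{pairs:,} pairs to compare")
print(f"{months_needed / 12:,.0f} years at one comparison per request")
```

Even if each request could batch many comparisons, the gap is so large that a practical casting tool would need bulk pricing, local processing, or – more likely – embedding-style indexing so that each new photo is compared against precomputed features rather than raw pairs.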