What is the difference between a Cat and Logan’s Run?
Nov 24, 2008 | Mobile & Web
WHAT IS IT?
After decades of advances in machine learning and computer vision algorithms, I am happy to report that mankind has finally come up with a useful application for this internet thing that everyone’s been raving about of late.
You will see above Exhibit A: a seemingly ordinary picture of an antiquated DVD release of a camp ’70s sci-fi classic alongside a seemingly ordinary Felis silvestris catus, or domestic cat (don’t tell her I called her ordinary or it’s likely that I’ll meet a mysterious demise in my sleep).
Next to it you will see the results of processing the aforementioned Logan’s Run / Cat imagery with a truly brilliant mobile application named ‘SnapTell’.
So what’s going on here? Well, basically SnapTell is a very smart and efficient visual database lookup application. From what I can glean from their website and my own first-hand experience with similar techniques in Photosynth, here’s what I believe to be going on.
The first stage takes the image, uploads it to their servers and processes it, looking for distinctive features: maybe a logo, someone’s face, essentially elements of the artwork. A well-known technique of this kind is David Lowe’s Scale-Invariant Feature Transform (SIFT) algorithm, first published in 1999.
SIFT gives you something that you can think of as the unique DNA of the image. The nice thing about this technique is that it is robust enough to cope with differences in scale, orientation, lighting conditions, etc. and still match against a known image.
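To make the "image DNA" idea a little more concrete, here is a minimal Python sketch of how two sets of feature descriptors might be compared. It uses a nearest-neighbour ratio test of the kind commonly paired with SIFT. Everything in it is an assumption for illustration: the function names, the 0.8 threshold, and the tiny 4-dimensional toy vectors (real SIFT descriptors have 128 dimensions). It says nothing about what SnapTell actually runs.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(query, reference, ratio=0.8):
    """Return (query_index, reference_index) pairs that pass the ratio test.

    A query descriptor counts as a match only when its nearest reference
    descriptor is clearly closer than the second nearest. The 0.8 default
    is an illustrative, tunable assumption.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((distance(q, r), ri) for ri, r in enumerate(reference))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches


# Hypothetical toy descriptors: "cover" is the known artwork, "photo" is
# a slightly distorted snapshot of it plus one ambiguous feature.
cover = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
photo = [[0.9, 0.1, 0.0, 0.0], [0.0, 0.1, 0.9, 0.0], [0.5, 0.5, 0.5, 0.5]]

print(ratio_test_matches(photo, cover))  # [(0, 0), (1, 2)]
```

The third photo descriptor sits at the same distance from every cover descriptor, so it is thrown away as ambiguous rather than forced into a match. Discarding ambiguous features like this is part of why a blurred or skewed photo can still match.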
SnapTell have run these same algorithms against some known corpora of popular media, namely DVD, CD, videogame and book covers, probably by writing some crawlers against Amazon, IMDb or Google. The net result is that they have a massive database of image DNA for popular media that they can compare against the images sent in by users from their cellphones. Image processing has seen huge leaps forward in speed in recent years, and with the high-bandwidth connections of many newer mobile phones the whole process, from snapping the image with your cameraphone to getting the results, takes about 5 seconds.
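The lookup step described above could be sketched roughly as follows. This is my guess at the general shape, not SnapTell's implementation: the `identify` function, the thresholds, the database entries and the 3-dimensional toy descriptors are all hypothetical. Each feature in the photo votes for the title that owns its closest known descriptor, and a title is returned only if it collects enough votes.

```python
import math
from collections import Counter


def identify(photo, database, max_distance=0.3, min_votes=2):
    """Return the best-matching title, or None if nothing matches well.

    photo:    list of descriptor vectors extracted from the user's picture.
    database: dict mapping a title to the descriptors of its known cover.
    Both thresholds are illustrative assumptions.
    """
    votes = Counter()
    for descriptor in photo:
        # Linear scan over every known descriptor; a real service would
        # need some kind of index to do this quickly at scale.
        best_distance, best_title = min(
            (math.dist(descriptor, known), title)
            for title, descriptors in database.items()
            for known in descriptors
        )
        if best_distance <= max_distance:
            votes[best_title] += 1
    if not votes:
        return None
    title, count = votes.most_common(1)[0]
    return title if count >= min_votes else None


# Hypothetical database of cover "DNA".
database = {
    "Logan's Run (DVD)": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "Some Other Title (CD)": [[0.0, 0.0, 1.0], [0.7, 0.7, 0.0]],
}

dvd_photo = [[0.95, 0.05, 0.0], [0.1, 0.9, 0.0], [0.4, 0.4, 0.4]]
cat_photo = [[0.4, 0.4, 0.4], [0.5, 0.2, 0.6]]

print(identify(dvd_photo, database))  # Logan's Run (DVD)
print(identify(cat_photo, database))  # None
```

The cat comes back as `None` because none of her features land close enough to anything in the catalog, which is the behaviour you want when the subject simply isn't in the database.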
I’ve included a couple of obscure and blurred tests I ran that returned perfect results so you can see just how powerful this application really is.
If you have an iPhone you can install their custom application from the App Store; with any other cameraphone, simply email the image of the DVD, CD, book or videogame to firstname.lastname@example.org to receive reviews, pricing and search results in seconds.
WHY IS IT RELEVANT?
Most humans would have no trouble telling a miniature Michael York from my cat (there are some possible exceptions who frequent 2nd Ave., but let’s not go there right now). The ability of computers to take an image, process it, discern meaningful features and then do something interesting with it in real time has been mainly limited to the academic world and a few Big Brother face recognition systems for finding soccer hooligans and terrorists on CCTV cameras.
Of course the ability to quickly return results is currently helped along by the fact that the categories they’re comparing against are cataloged and known. We’re not *quite yet* at a point where you can take a picture of any object and know what it is.
Our scientists used a lot of these ideas and techniques in the creation of the Photosynth project when I was at Microsoft Live Labs. Click here to see a great demo of the technology and some demonstrations up on my personal blog.
In my opinion, SnapTell is a very attractive acquisition candidate for one of the big online networks or telcos to pick up. They’re being very smart about positioning themselves as an advertising platform, they’ve got an excellent showcase application and a long list of forward-thinking big brands that they’re working with. There are very real solutions to broadening the categories of content one can recognize beyond just books and DVDs, but it needs a big player to enable it. It’s great news for consumers as well, as we start to get closer to a world of Augmented Reality. In the not too distant future, cameras attached to your eyeglasses will be constantly scanning the environment, looking for matches against known objects and providing contextual information about the object you’re seeing in a fraction of a second. Think Wikipedia entries and personalized offers. This isn’t science fiction; this is going to happen and it’s going to happen soon.
Anyone see this recently…