<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digitologist &#187; Projects</title>
	<atom:link href="http://digitologist.com/blog/projects/feed/" rel="self" type="application/rss+xml" />
	<link>http://digitologist.com</link>
	<description>Just another WordPress site</description>
	<lastBuildDate>Fri, 20 May 2011 03:39:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>fastICA in AS3</title>
		<link>http://digitologist.com/2011/04/fastica-in-as3-2/</link>
		<comments>http://digitologist.com/2011/04/fastica-in-as3-2/#comments</comments>
		<pubDate>Tue, 12 Apr 2011 02:03:01 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://digitologist.com/?p=213</guid>
		<description><![CDATA[[click the image above for a demonstration] I&#8217;m not going to spend a lot of time explaining what fastICA is, or what Independent Component Analysis is in general, but if you need it in Actionscript and you&#8217;ve been looking for it, here it is. I only vaguely understand how it works (for now!), but math [...]]]></description>
			<content:encoded><![CDATA[<p>[click the image above for a demonstration] </p>
<p>I&#8217;m not going to spend a lot of time explaining what <a href="http://en.wikipedia.org/wiki/FastICA" target="_blank">fastICA</a> is, or what <a href="http://en.wikipedia.org/wiki/Independent_component_analysis" target="_blank">Independent Component Analysis</a> is in general, but if you need it in Actionscript and you&#8217;ve been looking for it, here it is.  <span id="more-213"></span>I only vaguely understand how it works (for now!), but math is math and code is code, and once I had my Matrix Math package set up, it was just a matter of effort and optimization.</p>
<p>It&#8217;s ported over from <a href="http://mdp-toolkit.sourceforge.net/" target="_blank">the MDP package in Python</a>.  It&#8217;s not the full implementation, just as much as I needed to make <a href="http://digitologist.com/2011/04/i-can-read-your-pulse-by-webcam-2/" target="_self">the Pulse project</a> work, so at some point I&#8217;m going to finish porting over the rest of the internal methods.  It&#8217;s mad slow, yo!  My next goal is to offload the heavier calculations to Pixel Bender, like I did with the Lomb-Scargle code <a href="http://digitologist.com/2011/04/i-can-read-your-pulse-by-webcam-2/" target="_self">the Pulse project</a> used, but until then, you&#8217;ll want to run it only against reasonably small data sets if you want anything approaching &#8220;speed&#8221;.</p>
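<p>To give a sense of the math involved, here&#8217;s a minimal sketch of the symmetric FastICA fixed-point iteration (with a tanh nonlinearity) in Python with numpy; the AS3 port follows the same steps, just with my own matrix classes in place of numpy, and this sketch skips the extra options the full MDP version supports:</p>

```python
import numpy as np

def fastica(X, n_iter=200, tol=1e-6):
    """Minimal symmetric FastICA with a tanh nonlinearity.

    X: (n_signals, n_samples) array of mixed signals.
    Returns the estimated unmixed source signals.
    """
    # Center each signal, then whiten via the covariance eigendecomposition.
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(X))
    K = E @ np.diag(1.0 / np.sqrt(d)) @ E.T   # whitening matrix
    Z = K @ X

    n = Z.shape[0]
    W = np.linalg.qr(np.random.randn(n, n))[0]  # random orthogonal start
    for _ in range(n_iter):
        G = np.tanh(W @ Z)               # nonlinearity g(u)
        Gp = 1.0 - G ** 2                # its derivative g'(u)
        W_new = (G @ Z.T) / Z.shape[1] - np.diag(Gp.mean(axis=1)) @ W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W, via the SVD.
        u, _, vt = np.linalg.svd(W_new)
        W_new = u @ vt
        # Converged when every row has stopped rotating.
        if np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol:
            W = W_new
            break
        W = W_new
    return W @ Z
```

The fixed-point update plus the symmetric decorrelation is where all the matrix multiplication cost lives, which is exactly the part that chokes in Flash.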
<p>While I work on getting a code repository set up, use the contact form or leave a message in the comments if you want me to send it to you, I want to share!  Click through, or on the image above, to launch a demo.</p>
<p>
<object width="610" height="678">
<param name="movie" value="http://digitologist.com/wp-content/uploads/2011/04/ICAFlex.swf"></param>
<param name="quality" value="high"></param>
<param name="wmode" value="window"></param>
<param name="menu" value="false"></param>
<param name="bgcolor" value="#FFFFFF"></param>
<embed type="application/x-shockwave-flash" width="610" height="678" src="http://digitologist.com/wp-content/uploads/2011/04/ICAFlex.swf" quality="high" bgcolor="#FFFFFF" wmode="window" menu="false" ></embed>
</object>
</p>
]]></content:encoded>
			<wfw:commentRss>http://digitologist.com/2011/04/fastica-in-as3-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I Can Read Your Pulse by Webcam</title>
		<link>http://digitologist.com/2011/04/i-can-read-your-pulse-by-webcam/</link>
		<comments>http://digitologist.com/2011/04/i-can-read-your-pulse-by-webcam/#comments</comments>
		<pubDate>Tue, 05 Apr 2011 04:33:18 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://digitologist.com/?p=130</guid>
		<description><![CDATA[[click the image above to try it yourself] The above is the latest version of a labor of love I&#8217;ve been working on, based on research coming out of the MIT Media Lab. Turns out the veeeery minute changes in color of your skin tone can, frame-by-frame, be read and decoded to arrive at a [...]]]></description>
			<content:encoded><![CDATA[<p>[click the image above to try it yourself]</p>
<p>The above is the latest version of a labor of love I&#8217;ve been working on, based on <a href="http://web.mit.edu/newsoffice/2010/pulse-camera-1004.html">research coming out of the MIT Media Lab.</a> Turns out the veeeery minute changes in color of your skin tone can, frame-by-frame, be read and decoded to arrive at a just-fine measure of your pulse.</p>
<p><span id="more-130"></span></p>
<p>The general process by which it works is called photoplethysmography, and it&#8217;s the same way normal fingertip pulse readers work.  Those use an LED to light up your skin and then measure how the light absorption changes whenever blood flows through.</p>
<p>I read through the <a href="http://www.opticsinfobase.org/abstract.cfm?uri=oe-18-10-10762">original research paper</a>, and then later got to peek at the code the researchers used (through the sponsor relationship of my parent company with MIT), and challenged myself to rebuild it in Flash so that it could be web-deployable (and so could be thrown into a banner or a microsite as a gimmick for one of our clients).  Forgive the interface I gave it, I honestly don&#8217;t have a design bone in my body.</p>
<p>Want to make your own? Awesome, here&#8217;s what you need to work out:</p>
<ol>
<li>First, you need to identify a region of open skin from the webcam feed, so fire up a face identification method and have it mark out a subregion of the user&#8217;s face.  I used a tweaked version of <a title="Marilena on Libspark" href="http://www.libspark.org/wiki/mash/Marilena" target="_blank">the Marilena library for AS3</a>, and had it strip out only the nose and cheeks area, so that eye and mouth movements wouldn&#8217;t screw things up.</li>
<li>Then you need to <a href="http://digitologist.com/2011/02/tip-use-as3s-histogram-method-for-color-averaging/">average out the red, green, and blue color values for the pixels</a> in that subregion, and store a bunch of consecutive frames of them in a matrix.  Not having found a satisfactory AS3 matrix math class, I went ahead and made my own AND BOY WAS IT A DOOZY.  Turns out things like matrix multiplication, while straightforward, are not very computationally efficient, so I had to research, optimize and reoptimize before I could move very far.</li>
<li>Once you have a sufficient number of frames worth of data (this version gets by with around 70 frames, a little over two seconds&#8217; worth, but the more you have, the more stable your readings will be), the real fun begins.  You should have three signals at this point&#8211;the traces of each of the color channels of the user&#8217;s skin as they change over time&#8211;and the MIT researchers figured out that you could pass them through a <a href="http://en.wikipedia.org/wiki/Blind_signal_separation" target="_blank">blind source separation</a> algorithm (more on that in a later post) and at least one of the outputs would contain a fairly dependable pulse reading.  Again, not having found, really, ANY blind source separation libraries for AS3, I went ahead and made my own AND BOY WAS IT A DOOZY.  My version of FastICA for AS3 is basically just a port of the version in <a href="http://mdp-toolkit.sourceforge.net/" target="_blank">the MDP package for Python</a>, but with a crapload of optimizations to speed it up for Flash.</li>
<li>So by now you should have a clean pulse signal, and it&#8217;s time to extract out the period of the beats.  Fire up your favorite periodogram function and pass your cleaned-up signal through it and you should be able to get a beats-per-minute measure easily.  I went ahead and rolled my own implementation of <a href="http://en.wikipedia.org/wiki/Least-squares_spectral_analysis" target="_blank">the Lomb-Scargle method</a> that was based, again, on the work from the MIT researchers and BOY WAS IT A DOOZY.  Again, not a very computationally efficient process (though I recently found a better method I want to try), so this version depends on a customized Pixel Bender kernel to do all the heavy lifting outside of the Flash thread (I&#8217;ll do a whole post on it eventually).</li>
</ol>
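<p>The four steps above compress into a pretty short pipeline.  Here&#8217;s a sketch in Python, using numpy and scipy in place of the AS3 matrix code and the Pixel Bender kernel; for simplicity it reads the green channel directly instead of running the full ICA separation (green carries the strongest plethysmographic signal), and the heart-rate band and grid resolution are illustrative:</p>

```python
import numpy as np
from scipy.signal import lombscargle

def bpm_from_frames(frames, timestamps):
    """Estimate pulse (beats per minute) from a stack of face-region frames.

    frames: (n_frames, h, w, 3) RGB crops of the nose/cheeks subregion.
    timestamps: (n_frames,) capture times in seconds. Webcam frames are
    rarely evenly spaced, which is why Lomb-Scargle beats a plain FFT here.
    """
    # Step 2: per-frame channel averages (here just green), normalized.
    green = frames[..., 1].reshape(len(frames), -1).mean(axis=1)
    green = (green - green.mean()) / green.std()

    # Step 4: Lomb-Scargle periodogram over the plausible heart-rate band.
    bpm_grid = np.linspace(40, 180, 400)
    freqs = 2 * np.pi * bpm_grid / 60.0          # angular frequencies
    power = lombscargle(timestamps, green, freqs)
    return bpm_grid[np.argmax(power)]
```

The argmax over the periodogram is the beats-per-minute reading; more frames in the buffer sharpens that peak, which is why the reading steadies as the frame count grows.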
<p>So I plan to keep playing with it.  The low frame count is primarily due to the fact that fastICA starts choking when it has to chew on too much data, so I want to get that going faster (maybe by dumping it off to a Pixel Bender kernel, too) so that I can load in more frames for it to process over.  I don&#8217;t like how jumpy and off the reading can be sometimes.  Work in progress!</p>
]]></content:encoded>
			<wfw:commentRss>http://digitologist.com/2011/04/i-can-read-your-pulse-by-webcam/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Face Tracking Research &#8211; DroidDoes.com</title>
		<link>http://digitologist.com/2011/03/face-tracking-research-droiddoes-com/</link>
		<comments>http://digitologist.com/2011/03/face-tracking-research-droiddoes-com/#comments</comments>
		<pubDate>Thu, 17 Mar 2011 06:10:12 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://digitologist.com/?p=172</guid>
		<description><![CDATA[[click above to view a short video (I show up at 00:11)] Last spring one of the guys on the team whipped up a fun little Flash game that used motion tracking (via frame-differencing) to have a boxing glove follow the user&#8217;s hand around the screen, batting away projectiles. It was cute and fluffy, but [...]]]></description>
			<content:encoded><![CDATA[<p>[click above to view a short video (I show up at 00:11)]</p>
<p>Last spring one of the guys on the team whipped up a fun little Flash game that used motion tracking (via frame-differencing) to have a boxing glove follow the user&#8217;s hand around the screen, batting away projectiles.  It was cute and fluffy, but kept messing up whenever the user moved their head or talked, because that motion was getting picked up, too, skewing the centroid of the motion and ruining the experience.  I knew that if we could locate the face, and mask out its effects on the center of motion, it would clean up the whole thing and make it a lot better.</p>
<p><span id="more-172"></span></p>
<p>I set off to make it happen, and quickly found <a href="http://www.quasimondo.com/archives/000687.php">the Marilena libraries</a>, based on <a href="http://en.wikipedia.org/wiki/Haar-like_features" target="_blank">Haar cascade face detection</a>, which could pick out a face in a frame and mark its location.  It worked really well, and probably could have worked for a low-level execution, maybe a really light banner ad. Armed with this fix ready in my back pocket, I set up a conversation with our Chief Creative Officer to show it off.</p>
<p>He was interested, but not at the edge of his seat, until I pointed out that by tweaking the method somewhat, we could use it to do some calculations of where the user&#8217;s face was in relation to their screen and mimic <a href="http://www.youtube.com/watch?v=Jd3-eiid-Uw">the now famous head-tracking effect demonstrated by Johnny Chung Lee</a> with a Wiimote and some infrared emitters.</p>
<p>After showing the video and promising we could pull off a similar effect, the room suddenly filled up with people, all Creative Directors from various accounts that the CCO leaned out and called in.   The conversation became all about this, and burger-swatting fell by the wayside while each CD asked how this head-tracking method could be used in their client work.  We settled on building a prototype for the upcoming Droid summer campaign that the user could navigate around just by moving their head.</p>
<p>Once we got started, though, we quickly hit the limit of what an average machine could process.  The processing power it used in finding the face  meant that we couldn&#8217;t add a lot of additional functionality  (Papervision 3D was also ruled out due to performance issues), so I resumed my research and finally found the answer in a recently published Danish research paper, calling out a method called CAMSHIFT as a computationally efficient method of tracking a face, once identified by the Haar cascade method.</p>
<p>With a little further research, I stumbled onto <a href="http://www.mukimuki.fr/flashblog/2009/06/18/camshift-going-to-the-source/">a Flash port of CAMSHIFT</a>, where you sampled a region of color by drawing a box around it, and it could track that color region as a blob  that moved around frame-by-frame without losing tracking.  Best of all,  it was lightning fast, compared to Marilena.  The only issue with the  CAMSHIFT method was that the user would have to either sample their own  face by dragging a box across it, or else be tricked into doing it by  having them line their face up with an on-screen prompt.</p>
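<p>The heart of CAMSHIFT is a mean-shift step: slide the search window to the centroid of a color-probability map until it settles.  Here&#8217;s a simplified sketch in Python with numpy; the real algorithm also adapts the window size and orientation each frame, which this skips, and the back-projection step (turning a frame into a probability map via the sampled color histogram) is assumed to have already happened:</p>

```python
import numpy as np

def mean_shift(prob, window, n_iter=20):
    """One CAMSHIFT-style tracking step: move a search window to the
    centroid of a probability map until it stops moving.

    prob: (h, w) back-projection, i.e. each pixel's likelihood of
    belonging to the sampled color histogram.
    window: (x, y, w, h) current search window.
    """
    x, y, w, h = window
    for _ in range(n_iter):
        roi = prob[y:y + h, x:x + w]
        m00 = roi.sum()                    # zeroth moment (total mass)
        if m00 == 0:
            break                          # lost the blob entirely
        ys, xs = np.mgrid[0:h, 0:w]
        cx = int((xs * roi).sum() / m00)   # centroid inside the window
        cy = int((ys * roi).sum() / m00)
        # Re-center the window on the centroid, clamped to the frame.
        nx = min(max(x + cx - w // 2, 0), prob.shape[1] - w)
        ny = min(max(y + cy - h // 2, 0), prob.shape[0] - h)
        if (nx, ny) == (x, y):
            break                          # converged
        x, y = nx, ny
    return x, y, w, h
```

Because each frame's window starts where the last one converged, the per-frame cost is a handful of cheap moment sums, which is why it runs so much faster than re-running a Haar cascade every frame.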
<p>We chose the latter, and created a &#8220;Calibration&#8221; stage, just after the loader, that prompted the user to center their face inside an on-screen oval, which would then snap the sample and start tracking right away.  The client looked at the demos we sent over and approved the project, and a few months later DroidDoes.com was launched.  Over time, the effect was severely diminished and at times removed, at the clients&#8217; request, so at this point all that&#8217;s left is a subtle twist when you lean back and forth.</p>
<p>This past winter we also found <a href="http://www.attraction-lemanga.fr/site/index.php">a French anti-smoking website/game/anime thing</a> that seems to use the same CAMSHIFT library we used, only with less subtlety about it.  In wrapping up the project, I put together a combined library that used Marilena to identify the face, then passed that rectangle on to the CAMSHIFT side to take over tracking it, which helped reduce some of the performance issues.  I&#8217;ve used it from time to time, but I&#8217;m still looking for a really killer application of the idea.  Shoot me a note if you want to take a look at the library and offer any thoughts.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitologist.com/2011/03/face-tracking-research-droiddoes-com/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building a Social Media Listening Platform from scratch</title>
		<link>http://digitologist.com/2011/03/building-a-social-media-listening-platform-from-scratch/</link>
		<comments>http://digitologist.com/2011/03/building-a-social-media-listening-platform-from-scratch/#comments</comments>
		<pubDate>Wed, 09 Mar 2011 06:59:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://digitologist.com/?p=116</guid>
		<description><![CDATA[[click above to view a short video] The brief from management was to build a system capable of collecting brand mentions from all over the web, organizing and analyzing them, and then displaying them on an interface that we could distribute to our clients and account managers.  Mcgarrybowen needed its very own Social Media Listening [...]]]></description>
			<content:encoded><![CDATA[<p>[click above to view a short video]</p>
<p>The brief from management was to build a system capable of collecting  brand mentions from all over the web, organizing and analyzing them,  and then displaying them on an interface that we could distribute to our  clients and account managers.  Mcgarrybowen needed its very own Social  Media Listening Platform.</p>
<p><span id="more-116"></span></p>
<p>With some further questioning, more requirements were established.   The system should be a peace-of-mind application for distribution to our  clients, as a monitor of their real-time brand reputation on the web.   It should be optimized for quick-glance reviews, with broad but shallow  content, but with the ability to drill down from top-level reports into  granular metrics reporting.  It should be capable of tracking the  reputation not just of the brand itself, but of its key competition as  well. It would need to be accessible from the web or the iPad, so HTML5  was a must.</p>
<p>With a mammoth assignment like this, our first step was to break it down into manageable, workable chunks.</p>
<p><strong>Finding the data sources:</strong></p>
<p>Our first challenge: Where on the Internet is the brand being mentioned, and how do we collect that data?</p>
<p>We divided our focus into three buckets:</p>
<ol>
<li><em>News &amp; Headlines</em> – What is the mainstream news media saying about the brand?  What press releases have been published that reference the brand?</li>
<li><em>Online Authorities and In-Market sources</em> – What sites are  consumers likely to visit when they’re searching for information about  the brand or the industry?  What ratings and reviews are they likely to  view that mention the brand?</li>
<li><em>Buzz</em> – What are consumers likely to come across on the web  when they are not actively in-market?  What are bloggers, tweeters, and  Diggers saying that might affect a consumer’s opinion of the brand?</li>
</ol>
<p>We knew that a robust solution would eventually take us down the  route of building screen scrapers that could collect and organize data  from any site we pointed it at, but for the initial prototypes, we  decided to focus strictly on sources with well-established APIs.  We  picked 30 sources, checked documentation and ran test queries, organized  the returned data and did a gap analysis to figure out how we would  organize the data across multiple sources.</p>
<p>This is where we ran into our first set of issues:</p>
<ol>
<li><span style="text-decoration: underline;">How often are we querying?</span> Some sources might only need to  update once a day, but others, like Twitter, would require near-constant  monitoring to keep up with the sheer volume of results.</li>
<li><span style="text-decoration: underline;">What format are the returns in? </span>We realized that our pulls would need to be capable of parsing XML, JSON, or CSV depending on the source.</li>
<li><span style="text-decoration: underline;">Are we violating anyone’s Terms of Service? </span> We knew we  wanted to store everything in a centralized database, but several  sources had specific prohibitions against storing their data in external  frameworks.</li>
<li>But the biggest question turned out to be:  <span style="text-decoration: underline;">What, exactly, are we searching for? </span>We  quickly realized that our pulls were going to have to be  keyword-driven, submitting a given term to the API and logging what the  returns were.</li>
</ol>
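<p>To make the format problem concrete, here&#8217;s a sketch in Python of a normalizer that turns a JSON, XML, or CSV payload into common mention records.  The field names (&#8220;text&#8221;, &#8220;date&#8221;) are illustrative; in practice every real source needs its own field mapping on top of this:</p>

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def normalize(fmt, payload):
    """Normalize one API response into a common list of mention records.

    Each source declares its format up front ('json', 'xml', or 'csv'),
    so the puller can store everything in one shared schema.
    """
    if fmt == "json":
        items = json.loads(payload)
    elif fmt == "xml":
        # Assume a flat <item> list; real feeds need per-source XPath.
        root = ET.fromstring(payload)
        items = [{child.tag: child.text for child in item}
                 for item in root.findall("item")]
    elif fmt == "csv":
        items = list(csv.DictReader(io.StringIO(payload)))
    else:
        raise ValueError(f"unknown format: {fmt}")
    return [{"text": it["text"], "date": it["date"]} for it in items]
```

Once every pull funnels through a normalizer like this, the query-frequency question becomes a per-source scheduler setting rather than per-source code.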
<p>A keyword search strategy would need to be established.  We started  out by searching only with the brand’s name, but realized that even  subtle misspellings would be lost in this search, so we created the  “Brand-words” category.   One of our test cases was Marriott, which  meant also including “Marriot”, “Mariott”, and “Mariot”.  Sub-brand  terms like “Courtyard”, “Renaissance”, and “Residence Inn” filled out  this category.</p>
<p>Our second grouping was “Competitor-words”, which included search  terms with the names of the top competition within the brand’s industry  (in the case of Marriott, we used “Hilton”, “Intercontinental”, and  “Four Seasons”).  The final grouping was “Industry-words”, hoping to  capture conversations about more general topics within the industry.</p>
<p>Once we’d worked to satisfy all of these issues, we had a clean and  stable database, pulling each source regularly, and with an API for  getting the data back out.   We now could move on to the next most  pressing concern:</p>
<p><strong>How to analyze the data:</strong></p>
<p>Since reputation management is all about keeping people’s opinions  more positive than negative about your brand, our first priority for the  prototype was to set up a sentiment analysis engine capable of reading  through each of our database items and appending an evaluation of how  positive or negative they were.  Even the smallest amount of research  revealed this to be a huge task, but we were up for the challenge.</p>
<p>We looked at a number of different approaches, both custom and  off-the-shelf, and determined that we’d get the most value out of  building and training our own Naïve Bayesian classifier, a  well-documented method of extracting sentiment from unstructured text.   Given a number of sample text snippets, each with a manually-supplied  categorization, the system should, in time, be able to recognize which  of the categories any new text snippet should belong in.  Anything  that’s noticed as mis-scored can be resubmitted to the system with a  correction, gradually increasing the tool’s accuracy over time.</p>
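<p>As a concrete illustration, here&#8217;s a tiny multinomial Naïve Bayes classifier in Python with add-one smoothing.  The training snippets below are made up, and a production version would also need real tokenization and stop-word handling, but the retraining loop is exactly the correction workflow described above: mis-scored items just get trained in again with the right label:</p>

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Tiny multinomial Naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> document count
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)   # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                # Add-one smoothing keeps unseen words from zeroing a label.
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best
```

Nothing here is specific to sentiment: the six categories (Positive, Negative, Neutral, NSFW, Non-Applicable, Spam) are just six labels to this class.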
<p>We knew that our system should be capable of recognizing “Positive”  vs. “Negative” vs. “Neutral”, but after looking at the data we were  accumulating, we were again surprised by how much of the data we were  grabbing wasn’t right for the system.  We included a “NSFW” category to  weed out the more colorful entries, a “Non-Applicable” category (hadn’t  realized that most “Hilton” searches would result in the latest Paris  Hilton scandals), and a “Spam” category to filter out the surprising  volume of Tweet-spam featuring bogus vacation offers.</p>
<p>A handful of lucky interns were tasked with poring over 20,000 of our  database entries and manually scoring each one as one of the six categories <em>twice</em>,  each entry being scored again by someone else as confirmation.  If the two  scores differed, the entry was flagged for further administrator review.   After amassing this much data, testing confirmed that new items were  being scored with up to 72% accuracy.</p>
<p><strong>Presenting the Data:</strong></p>
<p>With our back-end in order, our attention shifted to the interface:  How are our clients going to view the data?  Our only limitation was the  desire to have it viewable on the iPad, so a pure HTML5 Canvas  application was needed.</p>
<p>A designer was brought in to prepare the visualization scheme, based  around a radial &#8220;health&#8221; metric that compared the total number of  brand mentions to how many of those were positive or negative.  We ended  up with three data views:</p>
<ul>
<li>Up-to-the-minute, live-streaming brand mentions, monitoring a single day&#8217;s brand health as they are picked up by the system</li>
<li>An at-a-glance historical review of brand performance for recent pre-set time periods</li>
<li>A deep analytics toolkit to monitor the brand&#8217;s sentiment over time, keyword-by-keyword and for any selected date range</li>
</ul>
<p>The combination of these three techniques satisfied all our original  requirements and created a platform for future data viz designers to  pick up where we left off.</p>
<p><strong>What we learned:</strong></p>
<p>Building a Social Media Listening Platform is not easy, but once the  fundamentals are in place, they start to click together like puzzle  pieces.  Broken down by challenge:</p>
<ol>
<li>Intake: for each source you plan to monitor, you will need custom  scripts to call for new data, clean it up, and store it to your  database.  Make sure you know each API&#8217;s Terms of Service  and whether it limits how often and how much data you can fetch.   Your data storage will need to grow as you add new keywords, at a rate that  depends on how general or specific your terms are.</li>
<li>Processing: find a sentiment-analysis algorithm you like and train  the heck out of it.  The smarter you can make your system, the more  accurate and valuable it&#8217;ll be.  Unfortunately, this cannot (in most  cases) be automated, so account for the training time in your planning.   Bribe your interns to do it with pizza and iTunes gift cards and you&#8217;ll  have great success.</li>
<li>Display: compelling data visualization is at least as important as  the data itself.  Get a designer who knows what they&#8217;re doing, and it&#8217;ll  be a simple matter of hooking the right data pipes to the right display  outputs.</li>
</ol>
<p>Get these three areas right and it&#8217;ll be a piece of cake&#8211;or just ask nicely and I&#8217;ll help you make one; I should be pretty good at it by now.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitologist.com/2011/03/building-a-social-media-listening-platform-from-scratch/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
