An Obsession With Programming: May 2011

Monday, May 9, 2011

More With Google's WebGL Globe: Legen - Wait For It - Dary

In my last post, I talked about Google's WebGL data visualization globe and briefly mentioned the "legend" format for the data array.

I finally got a chance to do something with it, and it is both simple and powerful. In legend mode, your big data array holds four values per point, not three. The first two are latitude and longitude. The third is the magnitude of the line that will be drawn (divided by 200 to compensate for their code multiplying it by 200). So what is the fourth value?

Anything you want. When you specify legend mode, the color function you pass in to the globe's constructor gets that fourth value for each point on the globe. That in turn allows you to define the color of the line based on something other than magnitude (the default).
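As a rough illustration of that layout, here is a minimal sketch of walking a legend-format array four values at a time and handing the fourth value to a color function. This is not globe.js's actual code: `legendData`, `colorFn`, and `readLegendPoints` are made-up names, the sample numbers are invented, and the color function returns a plain hex string rather than whatever color type the real globe expects.

```javascript
// Hypothetical sample: two points in legend format, four values per point.
// Layout: latitude, longitude, magnitude, legend value, then repeat.
const legendData = [
  40.7, -74.0, 0.5, 2,   // legend value 2 is an invented category id
  48.9, 2.35, 0.25, 7,
];

// Stand-in for the color function passed to the globe's constructor.
// It only ever sees the fourth value of each point.
function colorFn(legendValue) {
  return legendValue < 5 ? "#0000ff" : "#00ff00";
}

// Walk the flat array in steps of four, as legend mode is described above.
function readLegendPoints(data, colorFor) {
  const points = [];
  for (let i = 0; i + 3 < data.length; i += 4) {
    points.push({
      lat: data[i],
      lon: data[i + 1],
      height: data[i + 2] * 200,    // the post notes the magnitude gets multiplied by 200
      color: colorFor(data[i + 3]), // color depends only on the fourth value
    });
  }
  return points;
}

console.log(readLegendPoints(legendData, colorFn));
```

The point of the sketch is that height and color come from two independent values, which is where the extra dimension comes from.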

What that really means is that you get an extra dimension in your data. Height is always height, but legend mode allows you to add non-height information based on the color of the line.

Google's initial example, search language by volume, shows this well. The color of the line comes from the dominant language of the area. Looking at their globe, you can immediately figure out where the high search volume comes from (big cities, mainly) just by looking for the tall lines. You can also see which languages dominate search in a given area. English (blue) covers the United States and the United Kingdom. French (light green) covers France and also Quebec. Portuguese (dark green) dominates Portugal and Brazil, and also Madeira. Likewise, the Canary Islands are yellow, because they're a part of Spain.

For a new visualization I built with the globe at work (as usual, NDAs make me cautious about giving more exact details), people at my studio wanted the height of the line to represent the number of events at a given location. But they wanted the color to represent whether the average data from that spot reflected a "good" or "bad" user experience. Little red lines would be unfortunate, but might not get flagged as high priority. Big red lines would be a problem, because they would mean that a lot of players were having a bad experience. Fortunately, there weren't any of those, but there were some long yellow lines, which suggest areas where we could improve the player's interaction with our game.

To get that view, I set the globe to legend mode and set the magnitude field of each point to the sample size. Then I set the fourth data point to a very heterogeneous number that represented the user experience. My color function looks at that number and puts it into a bucket. Good returns green, bad returns red, and a middle ground returns yellow.
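A minimal sketch of that kind of bucketing follows. The post gives neither the real cutoffs nor the score scale, so the thresholds, the assumption that higher scores are better, and the names `experienceColor` and `legendPoint` are all hypothetical. The colors are plain RGB numbers; wiring them into the color type the globe expects is left out.

```javascript
// Hypothetical thresholds; the real cutoffs and score scale are not given in the post.
const BAD_THRESHOLD = 0.4;
const GOOD_THRESHOLD = 0.7;

// Map a user-experience score (the fourth value of each point) to a bucket color.
// Assumes higher scores mean a better experience.
function experienceColor(score) {
  if (score >= GOOD_THRESHOLD) return 0x00ff00; // good: green
  if (score < BAD_THRESHOLD) return 0xff0000;   // bad: red
  return 0xffff00;                              // middle ground: yellow
}

// Build one legend-format point. The magnitude carries the sample size,
// divided by 200 to offset the multiplication the post describes.
function legendPoint(lat, lon, sampleSize, score) {
  return [lat, lon, sampleSize / 200, score];
}

const point = legendPoint(47.6, -122.3, 400, 0.55);
console.log(point, experienceColor(point[3]).toString(16));
```

Keeping the bucketing in the color function means the data array stores the raw score, so the thresholds can change without regenerating the data.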

That means each line on my globe gives three pieces of information: location on the planet, sample size from that location, and quality of user experience from that location. It's easy to spin the globe and look for red hot spots. It's also just a pleasure to play with the visual, even though you can find the worst spots pretty quickly.

One person saw my work and suggested adding a calculation that would make the line more or less red, for instance, depending on how far over the bad threshold it went. Bright red would mean a really bad experience; dim red would be right at the line. I guess that would give us three and a half data points. Before, we knew it was bad. Now, we'll know how bad it is.
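One way that suggestion might be sketched: scale the red channel by how far the score falls past the bad threshold. The scale (0 as the worst score, `BAD_LIMIT` as the threshold), the dim and bright red values, and the name `badnessRed` are assumptions made for illustration, not details from the actual visualization.

```javascript
// Hypothetical scale: scores run from 0 (worst) up to BAD_LIMIT (right at the threshold).
const BAD_LIMIT = 0.4;
const DIM_RED = 0x60;    // red channel for a score right at the threshold
const BRIGHT_RED = 0xff; // red channel for the worst possible score

// Return a 24-bit RGB number whose red channel brightens as the score gets worse.
function badnessRed(score) {
  const clamped = Math.min(Math.max(score, 0), BAD_LIMIT);
  const severity = 1 - clamped / BAD_LIMIT; // 0 at the threshold, 1 at the worst score
  const red = Math.round(DIM_RED + severity * (BRIGHT_RED - DIM_RED));
  return red << 16; // green and blue channels stay at 0
}

console.log(badnessRed(0.4).toString(16), badnessRed(0).toString(16));
```

The bucket still says "bad"; the brightness adds the "how bad" half of a data point.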

Legend mode makes this all possible.

Thursday, May 5, 2011

Working With Google's WebGL Globe

Today, Google released a data visualization that shows search volume by language across the globe. It uses WebGL, so you'll need a recent version of Chrome and decent video drivers to see it.

It's different from the usual "things on a globe" visualizations you see in, for instance, Google Earth, because it incorporates height as an additional dimension. Google Maps and Google Earth give flat perspectives: You can only guess magnitude based on clusters of the familiar red, upside-down teardrops.

By itself, the visualization would be a five-minute folderol to enjoy on a lunch break. But Google's Data Arts team released all the source. Take your own data, muck with it a bit, and you too can have an interactive globe dripping in an oh-so-modern, shades-of-black aesthetic.

I took the bait.

Over the course of today, I took some of our game analytics data and built a local WebGL globe visualization. We have an established workflow for creating latitude/longitude tables from our data — this isn't the first map visualization I've done — and I built on top of that. Once I extracted the information I wanted, I wrote a quick Ruby script that converts the exported data into the format their code looks for.

I can't share a link with you — it's on our internal network — and I can't tell you what I mapped. But I can tell you that it got lots of oohs and aahs in the office.

I can also tell you that it wasn't a simple "replace their data with our data" exercise. If you're planning on doing something with the tool, here are a few things I figured out by trial and error.

Don't rely on their README: It's wrong. At least, the JSON format is. (Note: This has since been fixed.) It's much better to read the globe.js file to see what it wants, though that requires you to know JavaScript. Rather than a complex set of nested arrays, the code prefers one long array with latitude, longitude, and magnitude strung together like beads on a string. (There's an optional "legend" mode that requires a fourth value per point, but I haven't played with it. I assume it lets you define different data series. Their addData method takes the data block and a set of options, and one of those options is format: The default is magnitude, with three values per point, and you can specify legend, with four.)

A portion of their code multiplies the magnitude by 200 to ensure the small numbers in their data (percentages, I suppose) become big enough for bars that reach high into the sky. For our particular data set, I had to divide each number by 200 so the bars come out at the right height once their code multiplies it by 200.

The default coloring of the lines relates to the height, but it looks like you can pass a custom "get the color" function for different logic.

Their source code relies on a top-level directory on your website named /globe for the location of the map image that wraps onto the sphere. You can change that easily enough. Search for "imgDir" in the source.
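To make the first two points concrete, here is a minimal sketch of flattening rows into the default three-value layout while dividing by 200. The conversion described above was done with a Ruby script that isn't shown; this JavaScript version is only an illustration, and the row shape, the sample numbers, and the names `rows`, `GLOBE_SCALE`, and `toMagnitudeArray` are hypothetical.

```javascript
// Hypothetical input rows; the real export format is not shown in the post.
const rows = [
  { lat: 47.6, lon: -122.3, count: 1200 },
  { lat: 51.5, lon: -0.1, count: 800 },
];

// The post notes the globe code multiplies each magnitude by 200.
const GLOBE_SCALE = 200;

// Flatten rows into one long array: lat, lon, magnitude, lat, lon, magnitude, ...
function toMagnitudeArray(points) {
  const flat = [];
  for (const p of points) {
    // Divide up front so the later multiplication restores the intended value.
    flat.push(p.lat, p.lon, p.count / GLOBE_SCALE);
  }
  return flat;
}

console.log(JSON.stringify(toMagnitudeArray(rows)));
```

With these sample rows, a count of 1200 is stored as 6 and a count of 800 as 4, so multiplying by 200 gives back the original counts.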