Friday, August 31, 2012

metricsmaw

Web applications have always been heterogeneous environments -- Apache or Nginx talking to PHP or Java, which in turn talks to MySQL or Oracle. But these days, it seems like the number of potential components in a system has exploded. And those components, often drawn from the front lines of new tech, come with different levels of maturity.

That realization happened again as I evaluated node.js at work and wanted to capture metrics about the system. How many of X could node.js do versus a Java version? And it's occurred to me at home when working with Ruby, Erlang, or Scala. I can't always get the metrics I want from the environment I want in a consistent way.

So I wrote a program to fix that. For lack of a better name, I called it metricsmaw, and I checked a very early version into GitHub.

metricsmaw is an Erlang server that does one thing: it collects incoming data into metrics and relays those metrics to other places. Currently, the other places are CSV files and a Graphite server. The metrics it supports right now are counters, gauges, and a by-the-minute rate meter that provides rates for the last minute, the last five minutes, and the last 15 minutes. Metrics and reporters are set up as Erlang behaviours so I can add new ones easily.
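
To make that shape concrete, here is a minimal sketch in Python rather than Erlang. It is not metricsmaw's code: the class names (Counter, Gauge, Meter, CsvReporter, Registry), the method names, and the way the meter averages per-minute buckets are all assumptions made for illustration. The idea it tries to show is that every metric exposes the same small interface (update and snapshot), and every reporter exposes one (report), which is roughly the role a behaviour plays in Erlang.

```python
import csv
import io
from collections import deque


class Counter:
    """Accumulates every value it receives."""
    def __init__(self):
        self.value = 0

    def update(self, amount=1):
        self.value += amount

    def snapshot(self):
        return {"count": self.value}


class Gauge:
    """Remembers only the most recent value."""
    def __init__(self):
        self.value = None

    def update(self, value):
        self.value = value

    def snapshot(self):
        return {"value": self.value}


class Meter:
    """Per-minute rates over the last 1, 5, and 15 completed minutes.

    Assumption: a rate is the sum of the per-minute buckets in the window
    divided by the window length, so minutes with no data count as zero.
    """
    def __init__(self):
        self.current = 0
        self.minutes = deque(maxlen=15)  # newest completed minute first

    def update(self, amount=1):
        self.current += amount

    def tick(self):
        # Called once a minute: close the current bucket and start a new one.
        self.minutes.appendleft(self.current)
        self.current = 0

    def _rate(self, window):
        return sum(list(self.minutes)[:window]) / window

    def snapshot(self):
        return {"m1": self._rate(1), "m5": self._rate(5), "m15": self._rate(15)}


class CsvReporter:
    """Writes one row per metric field: name, field, value."""
    def __init__(self, stream):
        self.writer = csv.writer(stream)

    def report(self, name, snapshot):
        for field, value in snapshot.items():
            self.writer.writerow([name, field, value])


class Registry:
    """Routes incoming data to named metrics and snapshots to reporters."""
    def __init__(self, reporters):
        self.metrics = {}
        self.reporters = reporters

    def record(self, name, kind, value):
        if name not in self.metrics:
            self.metrics[name] = kind()
        self.metrics[name].update(value)

    def tick(self):
        for metric in self.metrics.values():
            if hasattr(metric, "tick"):
                metric.tick()

    def report(self):
        for name, metric in sorted(self.metrics.items()):
            for reporter in self.reporters:
                reporter.report(name, metric.snapshot())


# Hypothetical usage: record a few values, close one minute, write CSV rows.
out = io.StringIO()
registry = Registry([CsvReporter(out)])
registry.record("requests", Counter, 3)
registry.record("queue_depth", Gauge, 7)
registry.record("hits", Meter, 30)
registry.tick()
registry.report()
print(out.getvalue())
```

Adding a new metric or reporter in this sketch means writing one more class with the same methods. A Graphite reporter, for instance, would be another class with a report method that sends the values over a socket instead of writing rows.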

While this solves a problem I often run into, it's also a good chance to flex my Erlang muscles and dig more into a language and environment I'm really enjoying. Erlang seems like a good fit here, as it excels at long-lived software that needs to be fault-tolerant and highly concurrent.

And since I first thought of this in the context of node.js, I added a node.js library for talking to it. I haven't really dug into the node.js idioms, so it doesn't handle, say, disconnected sockets very well, but it solves the basic problem.