Sunday, November 29, 2009

Knowledge Creates Possibilities

When I first left college 16 years ago — a bio major with a head for computers — I worked for Bank of America. I helped design software for a lucrative trading desk. One of the strongest memories I have from that time is showing a piece of software to the head of the trading desk, a tall, nervous, white-haired man who stuttered when his thoughts outpaced his ability to voice them. My boss and I had taken his requirements and created software that met them. He looked at the software and said, "That's nice. I wonder: Could it do this, too?"

I don't remember what the "this" was, but it wasn't a feature he had originally mentioned. This happened again. And again. And again. Couldn't he just tell us what he wanted the first time around? It took me a year or so to realize what was happening: He couldn't define the features he wanted because he didn't know what was possible. Once he saw one feature, it expanded his horizon of what was doable, and he could then make the mental leap to bigger, and ultimately better, feature ideas.

It may be the most important thing I learned on that job.

It's certainly one reason that Agile programming made sense to me when I first learned about it: The mindset urges you to get functional software in front of customers as early as possible so that they can give feedback. The stated reason for this idea is that customers can tell you what's important and get what they want. I have always added a mental note: It also lets them know what's possible.

But there's no line between users and developers here. I think everyone works this way. Well, everyone except those few people who are truly visionaries.

A co-worker and I are both prototyping features that require APIs. Being modern web programmers, we're building them as RESTful APIs. REST simply means that you present data organized hierarchically by "nouns," or resources (compare flickr.com/photos and flickr.com/photos/melissanicole, for instance). You typically present it as XML or JSON via HTTP so that even inexperienced programmers can retrieve the data and parse it.
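(For instance, a GET of /photos/melissanicole in a JSON-flavored API might return something like {"user": "melissanicole", "photos": [{"id": 1234, "title": "Sunset"}]}. The shape is invented, but that's the general idea.)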

He showed me a page he had set up at the top level of his API. It didn't reference any particular resource. Instead, it presented a list of all the calls in the API. I thought it was a good idea: Rather than have the documentation in some separate place, he had put it right in front of the user. At the time, he was updating this manually. I've maintained API documents before, and it's a pain in the ass.

But seeing his page, I realized that I could generate a similar page by having the system itself analyze its own internals. Look at all the request mappings you've set up, see which ones are under the path for your API calls, and add them to the page. Voila! Documentation that keeps itself up to date. (See below if you want some of the details.)

It took some effort. I dug around in Spring code, read through tips on forums, and yelled at my computer. But by the end of the day, I had a page of API documentation, in the normal format of the API, that I would never have to maintain or remember to update. New API calls would get added without me thinking about it. A simple annotation on the method that handles the request gives a human-readable definition of what the call does. (I also added an exclude capability for calls that we don't want to expose.) You could argue that the time I spent on this feature was greater than the time I would spend maintaining the documentation. That may be true.

But a week later, I had an epiphany.

As I've said, I'm experimenting with the idea of having load testing scripts that run early and often for new features. To that end, I had a JMeter script where I manually added the API calls as I added them to the system.

See the problem?

What if I could make a load testing script that would hit that top-level page and use it to figure out all the other calls that were possible? I'd have a script that would, with no maintenance on my part, load test every API call as I added it.

Or when someone else added one. They wouldn't even have to know about the load testing script, and their call would get load tested.

It took a bit of digging in JMeter's sparse documentation, but I got it working after a few hours. I add an API call, it shows up in the generated documentation page, and then the load testing script picks it up. Once the initial excitement of seeing the flow work wore off, I never had to think about it again. (Again, see below for some tips.)

I felt like the head of the trading desk from all those years ago. I didn't start with a vision of a load testing script that would dynamically adjust itself. But when I implemented one feature, that script came into view. And now I will always see it as a possibility, and that knowledge will create new possibilities I can't see today.

A Self-Documenting API

I'm using Spring 3.0 as a framework, and that's what made this work. If you're in the same boat, there are a few things that will frustrate you along the way.

You'll want a reference to the DefaultAnnotationHandlerMapping object that Spring maintains. However, you can't autowire a reference into your class, because Spring doesn't, by default, register the DefaultAnnotationHandlerMapping as a bean. But you can create a bean of your own for that class, and then it will work. (Note that you'll probably want to Autowire for an array of AbstractUrlHandlerMapping objects, in case you add more down the road.)
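Here's a sketch of that wiring, assuming XML configuration; the controller and field names are my own, not anything Spring requires:

// In the servlet context XML, register the mapping explicitly so that
// it becomes an injectable bean:
//
//   <bean class="org.springframework.web.servlet.mvc.annotation.DefaultAnnotationHandlerMapping"/>
//
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.servlet.handler.AbstractUrlHandlerMapping;

@Controller
public class ApiDocumentationController {

    // Autowiring an array picks up any additional AbstractUrlHandlerMapping
    // beans you register down the road.
    @Autowired
    private AbstractUrlHandlerMapping[] handlerMappings;
}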

DefaultAnnotationHandlerMapping will give you a map of all the request mappings in your system, but the values in that map are controllers, not methods. You'll have to use reflection to go through all the methods in the class and figure out which ones actually have the @RequestMapping annotation for the given request mapping.

On that note, keep in mind that you may actually be getting a CGLIB proxy for the controller, so you'll need to get the actual class name in order to use reflection.
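To make that concrete, here's roughly what the discovery pass can look like. This is a sketch under my own assumptions: the handlerMappings array comes from the wiring above, every API call lives under /api/, the map values are controller instances rather than bean names, and I'm glossing over the fact that a method-level @RequestMapping path is relative to any class-level mapping:

import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.springframework.util.ClassUtils;
import org.springframework.web.bind.annotation.RequestMapping;

private List<String> discoverApiCalls() {
    List<String> apiUrls = new ArrayList<String>();
    for (AbstractUrlHandlerMapping mapping : handlerMappings) {
        for (Map.Entry<String, Object> entry : mapping.getHandlerMap().entrySet()) {
            String url = entry.getKey();
            if (!url.startsWith("/api/")) {
                continue; // assumption: all API calls live under /api/
            }
            // ClassUtils.getUserClass unwraps a CGLIB proxy class so that
            // reflection sees the real controller and its annotations.
            Class<?> controllerClass =
                    ClassUtils.getUserClass(entry.getValue().getClass());
            for (Method method : controllerClass.getMethods()) {
                RequestMapping rm = method.getAnnotation(RequestMapping.class);
                if (rm != null && Arrays.asList(rm.value()).contains(url)) {
                    // record the url, plus any descriptive annotation on
                    // the method, for the documentation page
                    apiUrls.add(url);
                }
            }
        }
    }
    return apiUrls;
}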

A Self-Discovering Load Testing Script

Because my API documentation page returns its results in XML, I was able to use JMeter's XPath Extractor to derive all the URLs to hit. Couple that with a ForEach controller, and you've got a loop that will iterate over each URL in the document. Put an Http Request Sampler in that loop, set the path field to the iteration variable created by the controller, and you're hitting whatever the current URL is. That's the basic format.
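To make that concrete, suppose (purely for illustration) the documentation page comes back shaped like this:

<api>
  <call>
    <url>/api/photos</url>
    <description>List all photos</description>
  </call>
  <call>
    <url>/api/photos/{id}</url>
    <description>Fetch a single photo</description>
  </call>
</api>

An XPath Extractor with a query of //call/url and a reference name of, say, url will pull out the URL list; point the ForEach controller's input variable prefix at url, name its output variable something like currentUrl, and the loop runs once per call.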

Spring's request mapping syntax looks something like /foo/bar/{id}, where {id} gets replaced at request time by whatever's in that space. (This is, incidentally, an awesome feature.) But /foo/bar/{id} is not a useful URL. I wrote a preprocessor for the Http Request Sampler that replaces {id} with some other value. You may be clever enough to do it with a Regex preprocessor; I did it with a BeanShell script that takes the current path — /foo/bar/{id} — and transforms it to /foo/bar/1234. Then it puts that value into a new variable called correctedPath.
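Here's the gist of my BeanShell approach as a sketch; it assumes the ForEach controller's output variable is named currentUrl, as in the example above, and 1234 is an arbitrary stand-in value:

// JMeter exposes the thread's variables to BeanShell as "vars".
String path = vars.get("currentUrl");
// Replace every {placeholder} in the path with a fixed test value.
String corrected = path.replaceAll("\\{[^}]*\\}", "1234");
vars.put("correctedPath", corrected);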

JMeter allows you to use variables in the name field for a component. This is very useful. If you leave the sampler's name as Http Request Sampler, every single URL will get put in the "Http Request Sampler" test name, which will make your load testing reports useless. If you set the sampler's name to — using the above example — ${correctedPath}, the test name will get set to /foo/bar/1234, and each discovered URL will get its own line item in your report.

Sunday, November 15, 2009

Learning JMeter

At work recently, I set up automated load testing for some new features I'm working on. There are teams within EA that do large-scale load testing, but I have a need they can't meet. I want to know, every single day, if our performance metrics have changed in a marked way even against a development server. If they have, the change probably came from code that was checked in that day. If performance went down, our team can find the fix quickly or acknowledge the new baseline. If performance went up, we can figure out why and see if the fix can be applied in other places.

I downloaded JMeter and started giving myself a crash course on the software. I've used it in the past, but it was much simpler in those days. So were my requirements.

A simple case was pretty easy: hit a couple of URLs that don't require parameters but do return dynamic data. I set up a thread group (the required container that controls how many threads run), added HTTP requests for those URLs to the thread group, and then added a "listener" to collect and display the responses. That gave me basic familiarity with the tool and let me set up an automatic, nightly run of any checked-in JMeter tests. We use Hudson as a build system, and it has a nice little JMeter plugin that will graph response times across runs and also give you a simple per-run breakdown of performance.

But then I wanted a test suite that would hit a URL, parse the response, and run subsequent tests based on the data (an ID) that came back in the first response. There are plenty of hints that JMeter can do this kind of thing, but it took a lot of muddling to get it right.

I often say that I should contribute to open-source projects not as a professional programmer but as a professional writer. Documentation is often sparse and unclear, even on an established project like JMeter. Where is the "cookbook" section with "you want to do this common thing; here's how" entries? Where is the list of best practices?

You read their manual, you read the FAQ and the wiki, and you still spend a lot of time bumbling about, poking this and prodding that to see if the errors clear up. And I've barely touched the more powerful features.

Here are a few of the things I learned along the way, which I'm posting mostly for my own benefit.


  • JMeter can't do anything with a response — even show it to you — without knowing the MIME type. On the one hand, that forced me to return a clean response. On the other hand, I feel like JMeter should fall back to plain text if it has no other information.


  • Extract data from a response with a post-processor. If you want to use the response data from one http request in subsequent calls, you make the post-processor a child of that http request. (You can make a post-processor a sibling, in which case it runs after every request in the thread group.) I used a regular expression post-processor that matched the entire response. Even though there was no other data, I still needed to enclose the regex in (). I also needed the $1$ default template, even though I wasn't doing anything with multiple values.


  • It seems like every JMeter tutorial suggests that you add a Graph Results listener to your test plan. It shows you a graph of your response time. Ooh. Aah. Pretty. Also? Useless for debugging. I moved through errors at a rapid pace after I added a View Results Tree listener to the thread group. With that listener, you can drill down on the request/response for each and every server call. I also found the Summary Report listener to be more useful than Graph Results. Keep the pretty picture in to show your boss — it doesn't add time — but add the others to make your life easier.


  • Learn to love the JMeter functions. They're buried in Chapter 19 of the JMeter manual, but they're essential for making scripts that can be re-used in multiple places. I sprinkled the ${__P(name)} function throughout the text fields of my scripts so that I could fetch command-line properties such as TARGET_SERVER and TARGET_PORT. That means that running my scripts against a different environment will require nothing more than a different set of command-line arguments (prefaced with -J).
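To make that concrete: put ${__P(TARGET_SERVER)} in the sampler's Server Name field and ${__P(TARGET_PORT)} in its Port field, then kick off a run like this (the test plan name is made up):

jmeter -n -t api-tests.jmx -JTARGET_SERVER=staging.example.com -JTARGET_PORT=8080

(-n runs JMeter without the GUI; -t names the test plan.)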

Friday, November 13, 2009

Profiling With Spring's AOP

Recently, at work, I wanted to set up an easy way to get lightweight profiling information about method calls in our system. There are lots of heavyweight profilers out there that give you great features at the cost of bogging down the machine they're profiling. Fine if it's a development box; less so if it's a production machine your users are using.

Because we're using Spring, we have access to a pretty decent aspect-oriented-programming system. (AOP is basically a way to add system-wide concerns to objects without those objects having to be aware of those concerns in any way: no subclassing, no wrapping.) Since profiling is a classic cross-cutting concern (something you need throughout the system but don't want any individual object to know anything about), Spring's mechanism works pretty well.

In my final system, a programmer only needs to add a @Profile annotation to a method, and it will generate profiling information in the logs. I didn't want to profile every method all the time because of the noise it would generate and the performance hit it would cause. The programmer can refine the output of the profiling system a bit, but the default is to show the method name, args, return value, and timing information. (Spring's AOP framework is limited to non-private methods that are proxied by the Spring system, so private methods and intra-object calls don't get tracked. If that becomes a problem, I'll add in true AspectJ support, but that shouldn't require any real changes in my code, since Spring's support uses AspectJ syntax.)

Here are the key parts.

First, the profiling method, with a pointcut/advice combination that says "run this code around any method annotated with @Profile":


@Around("@annotation(profileAnnotation)")
public Object profileMethod(ProceedingJoinPoint pjp,
        Profile profileAnnotation) throws Throwable {

    StringBuilder buf = new StringBuilder("method: " +
            pjp.getSignature().getName());
    if (profileArgs(profileAnnotation)) {
        buf.append("; args: ");
        if (pjp.getArgs() != null && pjp.getArgs().length > 0) {
            buf.append("[");
            for (Object o : pjp.getArgs()) {
                // String.valueOf handles null args without an NPE
                buf.append(String.valueOf(o)).append(",");
            }
            buf.append("]");
        } else {
            buf.append("(no args)");
        }
    }

    long currentTime = System.currentTimeMillis();
    Object retVal = pjp.proceed();
    if (profileReturnValue(profileAnnotation)) {
        // valueOf prevents an NPE on a null return value
        buf.append("; returnValue: " + String.valueOf(retVal));
    }

    if (profileTiming(profileAnnotation)) {
        buf.append("; timing: " + (System.currentTimeMillis() - currentTime));
    }
    log.info(buf.toString());
    return retVal;
}



The Profile annotation looks like this:


/** Annotation to mark methods that should be profiled. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface Profile {
    ProfilingType[] value() default { ProfilingType.ALL };
}



And the ProfilingType enum looks like this:
public enum ProfilingType {ARGS,RETURN,TIMING,ALL}
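Using it looks like this; the method and the DAO it calls are hypothetical:

// Only timing output is wanted here; a bare @Profile logs everything.
@Profile(ProfilingType.TIMING)
public Account lookupAccount(String accountId) {
    return accountDao.findById(accountId);
}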

The default output (ProfilingType.ALL) looks something like this:
method: [methodName]; args: [each arg as String]; returnValue: [returnValue]; timing: [some number of milliseconds]
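One loose end from the advice above: it calls profileArgs(), profileReturnValue(), and profileTiming(), which I didn't show. The versions below are my reconstruction rather than the original code; each just checks whether the annotation asks for that piece of output. (The aspect itself is an ordinary Spring bean annotated with @Aspect, with <aop:aspectj-autoproxy/> switched on in the configuration.)

// Reconstruction: does the annotation request this kind of output?
private boolean requests(Profile p, ProfilingType wanted) {
    for (ProfilingType t : p.value()) {
        if (t == wanted || t == ProfilingType.ALL) {
            return true;
        }
    }
    return false;
}

private boolean profileArgs(Profile p) { return requests(p, ProfilingType.ARGS); }
private boolean profileReturnValue(Profile p) { return requests(p, ProfilingType.RETURN); }
private boolean profileTiming(Profile p) { return requests(p, ProfilingType.TIMING); }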

Thursday, November 5, 2009

Sad, But Probably True

Reading the website for Apache JMeter, I noticed this line: "JMeter's target audience is the testing community, which is generally not the hangout of developers or technical people."

Except for those developers who understand that the sooner you catch an issue, the cheaper it is to fix. Which, granted, is not as many as there should be.