Saturday, October 2, 2010

Integration Tests

I've been a fan of unit tests for a couple of years now. Once I buckled down and wrote some, I realized how powerful they could be. They give you more stability out of the gate and they provide good regression tests. They allow me, the developer, to keep moving forward in tasks and minimize the time I spend going backward to debug or test older functions.

But I've always held the stance that unit tests shouldn't test things such as interactions with Hibernate or Spring. Those are well-tested frameworks with strong community support. Writing unit tests that touch those layers always struck me as a waste of time.

That said, most server-side Java code ends up moving into those layers. Virtually everything I write ends up talking to a database. And the server code itself is a loosely knit tapestry of services that chat with each other. That code can certainly have bugs — incorrect queries, edge conditions, bad assumptions, whatever. So how do I get tests against it?

Integration tests. Like unit tests, integration tests demonstrate that function A returns result X given inputs Y and Z. The difference is that in an integration test, you are testing the interaction between systems versus the simple in and out of a self-contained function.

I finally decided that I wasn't getting the test coverage I wanted with just unit tests. I was finding subtle bugs tucked away in database calls and service-to-service communication. Mock objects — code that presents the expected interface to a layer without providing the full functionality — only get you part of the way. If your mock object is keeping information in a local map instead of hitting the database, you're not testing that the query to the database returns the right thing.
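To make that limitation concrete, here's a minimal sketch of the kind of mock I mean. All the names here are invented for illustration, not from my actual code: the mock satisfies the interface, but a test that passes against it proves nothing about the real query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DAO interface, invented for illustration.
interface UserDao {
    void save(long id, String name);
    String findName(long id);
}

// Mock that keeps data in a local map. Tests against it exercise
// the map, not the SQL the production DAO would actually run.
class MockUserDao implements UserDao {
    private final Map<Long, String> rows = new HashMap<Long, String>();

    public void save(long id, String name) {
        rows.put(id, name);
    }

    public String findName(long id) {
        return rows.get(id);
    }
}
```

A round-trip through this mock passes whether or not the real DAO's query is correct, which is exactly the coverage gap.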

Once I decided to incorporate integration tests, however, I went down a bit of a rabbit hole. Your services need to talk to each other, so you need to wire them up to each other. But in the real, running code, I use Spring to manage all that. Fortunately, I can use Spring in the test environment, too. And Spring allows me to have a supplemental bean configuration file that overrides the production-code config file. So, for instance, I can have a bean named "telemetryService" that overrides the bean of the same name in the main config file. The test version doesn't actually send telemetry information; it effectively becomes a mock object. (Though in that particular case, it's done with a simple boolean.) I have an S3 service layer in the test config file that points to a test S3 bucket instead of our development one. Any beans that aren't overridden are pulled from the main file.
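As a sketch of what that supplemental file looks like: the "telemetryService" bean name is from my config, but the class names, property names, and the "s3Service" bean are invented placeholders here. The test file only redefines what it needs to override.

```xml
<!-- test-context.xml: loaded after the production config, so any bean
     defined here replaces the production bean with the same id.
     Class and property names below are illustrative, not real. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- Overrides the production telemetryService; the boolean disables sends. -->
    <bean id="telemetryService" class="com.example.TelemetryService">
        <property name="enabled" value="false"/>
    </bean>

    <!-- Overrides the S3 layer to point at a test bucket. -->
    <bean id="s3Service" class="com.example.S3Service">
        <property name="bucketName" value="test-bucket"/>
    </bean>
</beans>
```

When both files are handed to the same application context (production config first, test config second), Spring's default bean-definition overriding means the later definition wins; everything not redefined comes from the main file.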

My integration tests do have to call a method to set up that configuration, however. Since the Spring config doesn't change at runtime, I have a static utility method that checks whether the configuration is set up and, if it isn't, sets it up. This violates the principle that a test should not rely on state left behind by another test, but the configuration load is too time-consuming to repeat for every test.
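That utility is nothing exotic. A minimal sketch, with the actual Spring loading stubbed out and all names invented for illustration:

```java
// Sketch of a once-only test bootstrap. loadSpringConfig() stands in
// for building the application context from production + test files.
class TestBootstrap {
    private static boolean initialized = false;
    static int loadCount = 0; // exposed only to demonstrate once-only behavior

    // synchronized so parallel test runners can't double-initialize
    static synchronized void ensureInitialized() {
        if (!initialized) {
            loadSpringConfig();
            initialized = true;
        }
    }

    private static void loadSpringConfig() {
        // The real version would be roughly:
        // new ClassPathXmlApplicationContext("app-context.xml", "test-context.xml");
        loadCount++;
    }
}
```

Every integration test calls `ensureInitialized()` first; only the first call in a JVM pays the configuration-load cost.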

But services don't just talk to each other. They also talk to the database. You don't want them to point to a database you actually use, so you need a test database. And that test database needs to a) be kept up to date, b) have some test data, and c) not let data linger (to ensure that no test ends up passing because of the coincidence of some other test running before it).

For the last two requirements, I incorporated DBUnit. DBUnit can drive unit tests against the database directly, but I don't really need that part of it. Because I'm using Spring to set up my app in test mode, all my services can continue to use Hibernate for database work. (They can also use my JRuby-based query files.) But DBUnit offers two key services: it can load the database from a data file of test data, and it can wipe any desired tables clean between test runs. When a test suite starts, it calls a method that ensures the Spring configuration has been read (if it hasn't already) and wipes and reloads the test database. So when a suite starts, I know that there is predictable data in the database and that it's the only data in the database.
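The seed data lives in DBUnit's flat XML format: each element names a table and each attribute a column. The table and column names below are invented examples. A CLEAN_INSERT operation (DBUnit's `DatabaseOperation.CLEAN_INSERT`) deletes everything in the listed tables and then inserts exactly these rows.

```xml
<!-- test-data.xml: one element per row; element name = table,
     attributes = columns. Names here are illustrative only. -->
<dataset>
    <users id="1" name="testuser" email="test@example.com"/>
    <users id="2" name="otheruser" email="other@example.com"/>
    <accounts id="10" user_id="1" balance="0"/>
</dataset>
```

Because the wipe and the reload come from the same file, the database contents at suite start are exactly what the file says and nothing more.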

What about between tests, though? I could do a wipe and reload around every test, but that's time-consuming. Instead, I have a setup method that opens a transaction and a teardown method that rolls that transaction back. This does mean developers have to remember these steps when writing an integration test, which I dislike, but I hope to integrate some tools that will automate it.

My test database also needs to be current with the latest schema. For that, I incorporated Autopatch, which will automatically get your database up to date when a code base runs against it.

So far, so good. But integration tests take a while to run, and I worry that developers, faced with progressively longer test runs, might disable tests on local builds. To keep our tests spry, I separate out the integration tests so that they're only run if you tell Buildr you're in the "longtests" environment. (That environment also specifies the test database properties.) The build system, which runs within minutes of any check-in, always runs them, however, so even a developer who gets hasty has automatic monitoring.
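In the buildfile, that gating can look roughly like this. This is a sketch, not my actual buildfile: the project name and the naming convention for integration-test classes are invented, and I'm assuming Buildr's `-e`/`--environment` flag populating `Buildr.environment`.

```ruby
# buildfile (Buildr): integration tests only run when invoked as
#   buildr -e longtests test
# The *IT class-name pattern is an example convention, not a requirement.
define 'myapp' do
  test.exclude '*IT' unless Buildr.environment == 'longtests'
end
```

Local builds without the flag skip the slow tests; the continuous build always passes the flag, so nothing goes untested for long.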

All of this took three solid days to get working properly, but now that it's running, I have exactly what I wanted: the ability to have much deeper test coverage. The other day, I wrote a test that called service methods that altered the database and moved files around on S3. At each step in the chain, I could check the state of each component and verify that it had worked successfully. And I knew that it was executing real queries and running real code. No mock objects.

With that deeper coverage available, I've set myself the goal of actually writing those tests. Every feature I work on now has tests against it, even if it requires database work or multiple services. And when I find a bug in my existing code, I force myself to write a test that reproduces it and then work on the code until the test passes. That way I get free regression testing whenever a check-in happens.

I have noticed one negative aspect, however. Unit tests, by their very nature, force you to write smaller and smaller methods that are easier to test. With no limit on the amount of activity you can do in an integration test, my methods aren't channeled into smaller pieces. I try to break them up anyway, because I know that ultimately produces more maintainable code, but my integration test system doesn't give the almost automatic push that unit tests do.

Big methods or not, I still get to run a complete suite of extensive, realistic tests with a single command. That's pretty powerful.