Saturday, January 30, 2010

JAXB -> JSON

I'm working on a new project at work, and it has a by-now-standard RESTful API web service layer. And, of course, like all such layers, it needs to support output in XML or JSON.

Supporting XML is easy in Java, thanks to a technology called JAXB. Among its many capabilities is one which lets you annotate Java objects and generate an XML document from those annotations.

For instance, you could write an object with the following:

@XmlElement(name="name")
private String name = "Joe";


Pass that to the JAXB marshaller, and you'd get this XML:


<name>Joe</name>



And the reverse is true: Make JAXB parse the XML, and you'd end up with an object whose name instance variable was set to Joe.

So our XML version was done in no time. But JSON support is a less-entrenched technology, and thus there's no elegant built-in solution. My initial idea was to just use one of the JSON libraries out there and have it construct a JSON object from the XML document generated by JAXB: Dump the XML into a buffer, convert it, and print that text to the HTTP response output stream. That works (though it's not very efficient), except in this particular case:

<things>
<thing>
<name>Joe</name>
</thing>
</things>


Your schema might define things as a collection with zero or more items, but the XML-to-JSON converter doesn't see the schema, and so when there is only one item it creates a things object instead of a things collection. Once there are two items in the collection, everything works fine. (It's possible that if we had an actual schema doc somewhere, the converter could know about it. But since the annotations define the output, we don't have an xsd file around at the moment.)
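
To make the pitfall concrete, here is a minimal sketch of a schema-blind converter. It is not any particular JSON library; NaiveXmlToJson is a hypothetical class written only for illustration, using the DOM parser that ships with the JDK. It decides between "object" and "list" purely by counting sibling elements with the same name, which is all a converter can do without the schema.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical schema-blind converter; no string escaping, illustration only.
class NaiveXmlToJson {

    static String convert(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{\"" + root.getTagName() + "\":" + toJson(root) + "}";
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static String toJson(Element element) {
        // Group child elements by tag name, keeping document order.
        Map<String, List<Element>> children = new LinkedHashMap<>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element child = (Element) n;
                children.computeIfAbsent(child.getTagName(), k -> new ArrayList<>()).add(child);
            }
        }
        if (children.isEmpty()) {
            return "\"" + element.getTextContent().trim() + "\"";
        }
        StringJoiner object = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, List<Element>> entry : children.entrySet()) {
            List<Element> sameName = entry.getValue();
            String value;
            if (sameName.size() == 1) {
                // A lone child is indistinguishable from a plain nested object.
                value = toJson(sameName.get(0));
            } else {
                // Only repeated names reveal that this was a collection.
                StringJoiner array = new StringJoiner(",", "[", "]");
                for (Element item : sameName) {
                    array.add(toJson(item));
                }
                value = array.toString();
            }
            object.add("\"" + entry.getKey() + "\":" + value);
        }
        return object.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("<things><thing><name>Joe</name></thing></things>"));
        System.out.println(convert(
                "<things><thing><name>Joe</name></thing><thing><name>Ann</name></thing></things>"));
    }
}
```

Running it prints {"things":{"thing":{"name":"Joe"}}} for the one-item document and {"things":{"thing":[{"name":"Joe"},{"name":"Ann"}]}} for the two-item one, so a client has to handle two different shapes for the same field.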

I looked at various options for mapping objects, but most felt like duplicate work: I'd have to maintain a JSON mapping in sync with the XML mapping, a setup that's guaranteed to get messed up in some subtle way someday. What I really wanted to do was use the JAXB annotations as the definition for the JSON output.

Jersey looks like it's trying to accomplish the same thing, but it also seemed (when I first looked at it) to be not yet ready for prime time.

So I came up with my own solution. While I can't post the code, I can give you a good sense of how it worked.

I created a concrete object, JAXBBridgeParser. I also created an interface, JAXBBridgeParserListener. You throw a JAXB-annotated object and an implementation of JAXBBridgeParserListener at the parser, and it uses reflection to find the annotated fields and methods in that object. For each annotated field or method, it calls the appropriate method on the listener.

In addition to the ubiquitous startParsing and finishParsing messages, the parser fires special-purpose messages at the listener. The easiest scenario is what I call the "simple type" field or method: a String, a Java primitive, a Date, etc. In that case, the parser says to the listener, "Here's the name of this field and here's the value." In the JSON scenario, this translates to a key-value pair.

Next up is what I call the "complex type" field or method. This is a value that is itself an annotated object. In that scenario, the parser first tells the listener that it's beginning a complex object with a given name; then it recurses into a processObject method with that new object. That will in turn trigger its own "simple type" or "complex type" messages. When it comes out of the recursive call, the parser tells the listener that it's done with the complex object of the given name. This corresponds to a JSON key-value pair where the value is an object.

Finally, I have to worry about collections. These could contain simple types or complex types. When the parser sees an annotated field or method that is a collection, it tells the listener that it's starting a collection via the beginCollection method in the interface. For each item in the collection, it sends a message to the listener telling it that it's processing an object inside a collection. There are separate methods for simple types and complex types in collections. When it's done with the collection, it tells the listener that it's finished. In JSON, that corresponds to a list that might look like this: ["a","b",{"c":"c","d":"d"}].
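
The real code can't be posted, so what follows is only a rough sketch of the shape described above, not the actual implementation. Several things in it are assumptions: @Exposed is a hypothetical stand-in for @XmlElement so the example runs without JAXB on the classpath; every listener method name other than startParsing, finishParsing and beginCollection is invented; only fields are scanned (not methods); nulls are skipped; and the JSON listener does no string escaping.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for JAXB's @XmlElement, so the sketch runs without JAXB.
@Retention(RetentionPolicy.RUNTIME)
@interface Exposed { String name(); }

// Only startParsing, finishParsing and beginCollection are named in the post; the rest are assumed.
interface JAXBBridgeParserListener {
    void startParsing();
    void finishParsing();
    void simpleValue(String name, Object value);
    void beginComplex(String name);
    void endComplex(String name);
    void beginCollection(String name);
    void simpleItem(Object value);
    void beginComplexItem();
    void endComplexItem();
    void endCollection(String name);
}

class JAXBBridgeParser {
    void parse(Object root, JAXBBridgeParserListener listener) {
        listener.startParsing();
        processObject(root, listener);
        listener.finishParsing();
    }
    private void processObject(Object obj, JAXBBridgeParserListener l) {
        for (Field f : obj.getClass().getDeclaredFields()) {
            Exposed a = f.getAnnotation(Exposed.class);
            if (a == null) continue;                 // not annotated: ignore
            Object value = read(f, obj);
            if (value == null) continue;             // assumption: nulls are skipped
            if (value instanceof Collection) {
                l.beginCollection(a.name());
                for (Object item : (Collection<?>) value) {
                    if (isComplex(item)) { l.beginComplexItem(); processObject(item, l); l.endComplexItem(); }
                    else l.simpleItem(item);
                }
                l.endCollection(a.name());
            } else if (isComplex(value)) {
                l.beginComplex(a.name());
                processObject(value, l);             // recurse into the nested object
                l.endComplex(a.name());
            } else {
                l.simpleValue(a.name(), value);
            }
        }
    }
    private static Object read(Field f, Object owner) {
        try { f.setAccessible(true); return f.get(owner); }
        catch (IllegalAccessException e) { throw new IllegalStateException(e); }
    }
    // "Complex" here means: the class declares at least one annotated field.
    private static boolean isComplex(Object o) {
        if (o == null) return false;
        for (Field f : o.getClass().getDeclaredFields())
            if (f.isAnnotationPresent(Exposed.class)) return true;
        return false;
    }
}

// Turns the messages into JSON text. No string escaping: illustration only.
class JsonListener implements JAXBBridgeParserListener {
    private final StringBuilder out = new StringBuilder();
    private void sep() {                             // comma unless first inside { or [
        if ("{[".indexOf(out.charAt(out.length() - 1)) < 0) out.append(',');
    }
    private void key(String name) { sep(); out.append('"').append(name).append("\":"); }
    private static String lit(Object v) {
        return (v == null || v instanceof Number || v instanceof Boolean) ? String.valueOf(v) : "\"" + v + "\"";
    }
    public void startParsing() { out.append('{'); }
    public void finishParsing() { out.append('}'); }
    public void simpleValue(String n, Object v) { key(n); out.append(lit(v)); }
    public void beginComplex(String n) { key(n); out.append('{'); }
    public void endComplex(String n) { out.append('}'); }
    public void beginCollection(String n) { key(n); out.append('['); }
    public void simpleItem(Object v) { sep(); out.append(lit(v)); }
    public void beginComplexItem() { sep(); out.append('{'); }
    public void endComplexItem() { out.append('}'); }
    public void endCollection(String n) { out.append(']'); }
    @Override public String toString() { return out.toString(); }
}

class Pair { @Exposed(name = "c") String c = "c"; @Exposed(name = "d") String d = "d"; }
class BridgeDemo {
    @Exposed(name = "things") List<Object> things = Arrays.asList("a", "b", new Pair());
    public static void main(String[] args) {
        JsonListener json = new JsonListener();
        new JAXBBridgeParser().parse(new BridgeDemo(), json);
        System.out.println(json);
    }
}
```

On a typical JVM this prints {"things":["a","b",{"c":"c","d":"d"}]}. The key order inside the nested object follows getDeclaredFields, whose order the Java spec does not guarantee. Because the parser sees the Java type and not the XML, a one-item list still comes out as a list.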

The end result works like a charm: My JSON objects line up perfectly with our XML documents, and I don't have to do anything to get them there. The JSON listener is about 20 lines of code. Any object that can be converted to XML can also be converted to JSON simply by passing it through the system. (A custom Spring ViewResolver figures out the right converter to use based on the request, so anything that can be served up as XML also automatically gets a JSON version.)

And I can support new formats pretty easily as they rise in prominence in the web world. I've thought about wiring up an HTML formatter that would give default browser versions of the data. HTML is a little trickier because I'd want some of the items — object IDs, for instance — to be attributes and some of the items — names, for instance — to be page content. But it should be doable. I've also thought of wiring up YAML support, which should be brain-dead simple, just because I can.

My particular scenario is pretty simple: We don't use some of the deeper features of JAXB, so I don't have to worry about handling them. I do, however, support @XmlJavaTypeAdapter, running the referenced adapter on a field before the parser sends the value to the listener. And my version has the downside that I, and not some larger open-source community, have to support and extend it. Still, it was a pleasant little exercise, and it's working well.

If you go this route, I encourage you not only to have lots of unit tests to catch subtle edge cases, but also to set up a more behavior-focused test. In my case, I made a test listener that simply counts the number of messages it gets from the parser. I set up an object with a fixed set of annotations, and then passed it and the listener to the parser. My "unit test" then checks the counts for each message in the listener.
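
One way to build such a counting listener without hand-writing every method is a JDK dynamic proxy. This is a swapped-in technique, not necessarily how the original test listener was written. In the sketch below, CallCounter and MiniListener are hypothetical names, MiniListener is a trimmed-down listener with an invented simpleValue method, and the demo fires the messages by hand where the real parser would.

```java
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.TreeMap;

class CallCounter {
    // Returns an implementation of iface whose methods only bump a counter
    // keyed by method name. Suitable for interfaces with void methods.
    static <T> T counting(Class<T> iface, Map<String, Integer> counts) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                (self, method, args) -> {
                    counts.merge(method.getName(), 1, Integer::sum);
                    return null; // nothing to return for void listener methods
                });
        return iface.cast(proxy);
    }
}

// Hypothetical, trimmed-down listener; simpleValue is an assumed name.
interface MiniListener {
    void startParsing();
    void simpleValue(String name, Object value);
    void finishParsing();
}

class CallCounterDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> counts = new TreeMap<>();
        MiniListener listener = CallCounter.counting(MiniListener.class, counts);
        // Stand-in for the parser walking an object with two simple fields.
        listener.startParsing();
        listener.simpleValue("name", "Joe");
        listener.simpleValue("age", 42);
        listener.finishParsing();
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This prints {finishParsing=1, simpleValue=2, startParsing=1}. A test would pass the proxy to the real parser along with the fixed object and compare the resulting map against the expected counts.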

5 comments:

  1. what would be the added value of JSON output other than client side scripting? JSON seems to focus on a much narrower set of problems... sounds like one more thing to learn but with little or no technical merit over XML for what it does ...

  2. Hm, I responded to this a while back, but I don't know why my comment didn't show up.

    Anyway, JSON is very definitely for client-side scripting. It's faster to parse and work with within JavaScript. And since my API layer will be used by browser-based clients, I want to support that.

    But system-to-system exchanges still seem to be largely XML based.

  3. JAXB can marshal to JSON in the exact same way it handles XML. You might want to consider revisiting the APIs because you are adding a lot of unnecessary complexity.

  4. That's good to hear. It would certainly reduce complexity. However, by this point I've added a number of custom annotations that let our parser do things JAXB can't (even in XML), so I'd have to see if I could make those work.

  5. Derrick,

    How is it scaling up? I have a very similar requirement, where I need to publish a REST API for the world but fetch the data using SOAP calls.

    Regards
    Balaji Setty
