Saturday, May 12, 2012

Functional Programming For The Imperative Mind

I've been learning erlang on the side lately, largely because of its increasing role in the universe of scalable systems. I know of a number of highly-scalable pieces of software that are written in erlang. But I also think that shifting programming paradigms every now and then is a good mental exercise.

I've mentioned before that I often write word puzzle solvers (for puzzles you might find in the National Puzzlers' League newsletter) when learning a new language. It's a problem I understand, it doesn't involve a lot of error handling or user interface, and you have to think a little about scale because wordlists can be reasonably big. I'm doing the same with erlang, but I'm focusing on writing a cryptogram solver.

Getting the gist of the language wasn't too difficult, but I struggle with thinking in functional terms. How do I take a problem that I know how to solve in an imperative language such as Java and translate it into functional patterns? It's fine that one can code a Fibonnaci sequence so quickly, but how often do you need to do that in a production system?

I asked some functional programming fans at work, and they gave me some ideas.  "Functional Programming for the Rest of Us" is a six-year-old post that recently made the rounds on Hacker News. It trumpets the benefits of these languages but didn't help me with my problem. Various books on algorithm design were also suggested, because functional programming is really derived from lambda calculus. While I'm looking at those, that obviously increases the friction to learning. It's not as if I'm writing business-critical software in erlang yet, so increasing the work to learn it seems like a good way to let it fall by the wayside.

I still haven't found a good single resource for my not-so-unique situation, but I've begun to piece together a strategy based on bits here and there. The paper "Why Functional Programming?" got me focused on decomposing programs into smaller and smaller bits. That's a good strategy in imperative programming, too, but I've found with functional programming that I need to brunoise my ideas, not just dice them. O'Reilly's Erlang Programming articulated something that I had to begun to come to myself: accumulators. It's hard for me to think of building up a list from the end out, but it turns out I can easily picture how to do it by passing values down through the recursive calls and then returning that list of values when I'm at the end of the chain.

That technique has a benefit: You often naturally end up with tail-recursive code. But I worry it's becoming too much of a crutch for me. Because I can easily think in those terms, it's the technique I often use. I write a "pretty" function for clients of the code, and that largely calls down to a function that uses accumulators to compose a return value.

Are there other good resources for programmers in my situation?

5 comments:

  1. This doesn't answer your question, I just wanted to point out that I think Erlang's focus is in high-availability services, rather than scalable ones. Though obviously these can be the same, Erlang's obtuse design is geared towards hot-swapping code and the VM is optimized for long runtimes, rather than speed or throughput.

    I admire the willingness to brave Erlang's insane syntax in order to learn more about functional systems, but maybe your efforts would be better rewarded learning more about Go or Scala or something? :-)

    ReplyDelete
  2. The availability bits are cool, but it seems to have a lot of scalability aspects as well. Facebook's chat supposedly runs on it (presumably via ejabberd or a modification thereof), and both RabbitMQ and CouchDb are written in it (when Linden Labs analyzed RabbitMQ, the numbers they produced were pretty phenomenal).

    I actually thought about Scala, but I figured if I could learn Erlang, Scala would be a snap to go back to. (Of course I stopped before the other end, pure ML or Haskell, because Erlang actually is used in the server world :) )

    ReplyDelete
  3. This is my rudimentary understanding, which you're probably are well of, but I figured I'd share anyway. :-)

    Say you're baking a bunch of cakes... You have a bunch of ingredients to gather and mix, then you put the batter in the oven, wait a while and out pops a cake. You're using the baking time to mix the next batter, but eventually you realize you're not making cakes fast enough, so you decide to use another rack and a second oven. Great! You've quadrupled your output, but soon you find you can't mix the ingredients together fast enough. So now you get a friend or two to help out. The problem is, your kitchen is too small, so you'll be constantly bumping into each other. You decide to go use another bigger kitchen so you can all work together.

    The first is adding scalability (increasing output), the second is adding concurrency (increasing things done at the same time). Erlang is focused on concurrency - which is definitely a part of scaling any system, but not always. In the example above, say you change it to baking cookies, which don't take nearly as long to cook - you still need help mixing the ingredients, but the number of ovens isn't as important. Or the reverse, where you're cooking a 3-hour turkey, which doesn't take as long to prepare, but you do need lots of ovens.

    Erlang excels at managing lots and lots of tiny connections (for a long time, without ever dropping any) like chat, MQ-style messaging and routing DB requests. But it doesn't do so well when it has to actually process the data itself or manage gigabytes of streaming data, etc.

    My main point is that developer productivity is important as well. Is the system going to be developed and then left alone for a long time? If so, Erlang might be fine - it's cryptic, but hey, you don't actually have to look at it much. Is the system going to be constantly tweaked? Maybe some other language that actually makes sense would be better.

    I just happened to see this in my reader today:

    http://radar.oreilly.com/m/2012/05/four-short-links-15-may-2012.html

    The two middle links are relevant - akka and pykka - they're libraries that (theoretically) bring Erlang-style share-nothing message passing to Java/Scala and Python.

    None of this is actually helping you with your problem, but it does bring up a question - is your problem the right one to solve with Erlang? :-D

    -Russ

    ReplyDelete
  4. That's a good analogy. I used a similar at work to explain scalability to the rest of the team, except that I used an analogy with Residential, Commercial, and Industrial zones in SimCity. Because, you know.

    (At one point I pondered modeling our network architecture using our GlassBox simulator engine: As more people show up, we spawn more servers, and so forth. But then I got busy with my real work. Plus, I couldn't figure out how to model scaling along more than just horizontal axes [partitioning and optimizing based on functionality, and so forth])

    Functional languages are actually good candidates for puzzle solvers, because those so often reduce down to recursive logic. (Whether this one is the right one to solve with Erlang is a good question, though the real problem space is "solve a 'real' problem with Erlang to learn it better".).

    I know of akka, for sure. But I thought Scala had built-in support for actor models? It's been a while since I've looked at the language.

    ReplyDelete
  5. Oh! And I should add that the end goal is not a command-line program but an actual server that you can ping and say "solve this cryptogram." And after that the task is to see how much load it can handle as a server. i.e., I want to see what deploying and load testing an Erlang program is like, not just learn the language. "Cryptogram solver" was a bit misleading in the original post. :)

    So the ability to handle lots of connections is very relevant (though mostly for fun; even if this became the most popular cryptogram solver on the entire Internet, I'd guess -- maybe -- 500 concurrent users ever,).

    ReplyDelete