Tuesday, September 28, 2010

Config Files As Programs

While working on a few recent tasks, I've hit upon a small but powerful technique: My configuration files are actually programs in their own right.

Expression languages of some form are de rigueur in modern frameworks. Especially in server code, where you need to deploy the same code base to different servers and have them point to different databases, mail servers, or whatever. In Spring, for instance, you can write something like ${project.database.url} and it will assume you mean to use not the literal string but the value of the project.database.url property, which is probably different in your dev environment and your production environment.

But my technique is vastly more powerful. I'm sure others have thought of it, too, but it sort of snuck up on me.

The first time I used this trick was when I put my work's SQL into Ruby scripts. I did that for readability's sake, but as I added more queries, I began to do the things one does does with programs. I refactored two similar queries into a method that took a differentiating argument and returned the query. I declared constants to hold common values. I started to ponder a Ruby module that I could import for various utility functions. The end result from the Java code's view was a set of query definitions. But within the query definition itself, I had normal programming tools at my disposal.

On a personal project (which I hope to write about soon), I decided to use YAML for the config files. I've gotten used to the format for config files, thanks to my work with AppEngine, and Ruby can easily work with it. My original idea for the config file was just fixed dates and times when I wanted my script to wake up and record an Internet radio stream. But then I realized that I wanted to record some streams on every weekday or on the first Sunday of the month.

So I thought, what if, before I feed the file to the YAML parser, I first run it through ERB, Ruby's powerful templating system? As long as the YAML parser sees something that looks like YAML, it doesn't matter how it gets there, right?

Right. I defined a method in my config file — in my config file! — that would return the next weekday after the given date. Then, when defining the value for a field, I called that method to calculate the value. My "YAML" file looks something like: start_date: <%next_weekday.strftime('%Y-%m-%d')%> 18:00. Which won't give you anything useful from the YAML parser. But run it through the ERB parser, and the YAML parser sees start_date: 2010-09-28 18:00, which is perfectly valid YAML syntax and is what you intend for the value to be.

Friday, September 3, 2010

Words With Friends Bingos, Redux

In my last post, I described writing a program to find the five-letter combinations most likely to yield a bingo in Words With Friends.

Then I realized there was a subtle bug.

Consider the word abreast. When you look at how Ruby's combination method splits it up, you'll see two entries equivalent to STARE, because there are two a's. This is correct from a programming view, but from a player's perspective, it doesn't matter. Having STARE is sufficient to get to abreast. You don't view the a's as different.

Here is the new word list with the corrected code. TEARS and TIERS are still your best bet, but the list has shifted a bit and the numbers are different.

aerst 1295
eirst 1237
aeirs 988
aeist 984
einst 983
einrs 972
eorst 949
aelrs 940
eerst 920
aeins 854
aeirt 816
aenrs 805
aenst 799
eeirs 792
aelst 788
eilst 782
aeint 765
aeils 765
ainst 751
eilrs 748


And here's the corrected script.

# find the 5-letter combos most represented in 7- or 8-letter words

wordlist = ARGV[0]

combos_to_count = Hash.new(1)

File.open(wordlist,"r") do |file|
line = file.gets
while(line)
line = file.gets
if line then
line.chomp!
next if line.length != 7 && line.length != 8
chars = line.split(//)
unique_combos = []
chars.combination(5) do |combo|
combo.sort! {|a,b| a <=> b}
unique_combos << combo.join
end
unique_combos.uniq!
unique_combos.each {|combo| combos_to_count[combo] = combos_to_count[combo] + 1}
end
end
end

sorted_combos = combos_to_count.sort {|a,b| b[1] <=> a[1]}
(0...20).each {|num| puts sorted_combos[num][0] + " " + sorted_combos[num][1].to_s}

Optimize For Bingos In Words With Friends

See the updated version here.

Somewhere in my library of books is one called Everything Scrabble, a guide to improving your Scrabble game. I read through it a few years back and have probably forgotten half its contents, but I do remember the author's comment that highly ranked Scrabble players often swap tiles in an effort to increase their chances of hitting a bingo (a word that uses all seven letters on your tray).

I've been playing Words With Friends (screen name: linechef) and have been thinking about that Scrabble strategy. Bingoing is less of an advantage in WWF. They're only worth 35 points instead of 50, and they often open up large swaths of bonus tiles for your opponent to gobble up. But they are usually worth it if you can hit a bonus tile yourself.

Everything Scrabble provides the scoop on which five tiles you should keep in your tray to maximize bingoing, but my mind let that knowledge go into the abyss. So I decided to write a program to rediscover it.

You can imagine how the program works: Go through every seven- and eight-letter word in the Words With Friends dictionary, the ENABLE wordlist, and find every unique combination of five letters in it. Keep track of how many copies of that combination you've seen throughout the dictionary, and then sort.

Let's start with the answers first. Here are the top 20 combinations, along with the number of times they occurred (the ENABLE list has 172,823 words):

aerst 2413
eirst 2199
einst 1780
eorst 1698
eerst 1646
einrs 1596
aeist 1587
aeirs 1484
aelrs 1468
aelst 1350
aenst 1346
eilst 1338
eiprs 1286
aeirt 1273
aeprs 1266
aenrs 1258
aeins 1245
einrt 1238
deirs 1233
ainst 1232



And here's the Ruby code. Ruby 1.8.7 added a combination method to the Array class, which makes this program tiny. You tell it how big you want each subset to be, and you pass it a block which takes each subarray as an argument.


wordlist = ARGV[0]

combos_to_count = Hash.new(1)

File.open(wordlist,"r") do |file|
line = file.gets
while(line)
line = file.gets
if line then
line.chomp!
next if line.length != 7 && line.length != 8
chars = line.split(//)
chars.combination(5) do |combo|
combo.sort! {|a,b| a <=> b}
combo_str = combo.join
combos_to_count[combo_str] = combos_to_count[combo_str] + 1
end
end
end
end

sorted_combos = combos_to_count.sort {|a,b| b[1] <=> a[1]}
(0...20).each {|num| puts sorted_combos[num][0] + " " + sorted_combos[num][1].to_s}



Once you have the code, you can start asking other questions. Is the list different for words with exactly seven letters? A little bit:

aerst 609
eirst 493
eorst 450
aelrs 413
eerst 404
einst 385
aeprs 379
einrs 348
aelst 346
eilst 335
acers 331
aeirs 330
aenst 327
aeist 322
deirs 320
aders 318
eiprs 317
eilrs 297
aerss 296
deers 295


Note that there are only 676 permutations of the other two letters you could fit in your tray, so with STARE you can bingo with almost any combination. I leave learning those 609 words as an exercise for the reader.

What about a different dictionary? Here's the same run I first did but with the Scrabble Official Dictionary, Third Edition.

aerst 2395
eirst 2182
einst 1759
eorst 1670
eerst 1628
einrs 1581
aeist 1562
aeirs 1473
aelrs 1458
aelst 1335
aenst 1327
eilst 1317
eiprs 1278
aeprs 1258
aeirt 1257
aenrs 1239
aeins 1232
ainst 1224
einrt 1219
deirs 1218


Of course, some of this is common sense: EST gives you the superlative of many words. Likewise, ER gives you the comparative along with the prefix RE and the "person who does" suffix. Those four letters plus vowels other than U and Y form top contenders in many of the runs.

If you're playing WWF (or Scrabble), work to keep STARE in your tray, playing or swapping other tiles, until you get a bingo. Placing it on the board, of course, is a different matter.