Sunday, October 13, 2013

Some observations on performance

When I was at the barbers' the other day, the puzzle section of the newspaper had a word game that involved taking a 9 letter word and using the letters to fill in a degenerate crossword box with 1 Across being 5 letters, and 2 Down also 5 letters, starting at the 3rd letter of 1 Across. So naturally, I thought "I can write a script for that", so I did.

My first take used Set operations because the target word had no duplicate letters; so I thought, "let's generalise this, and tidy up the scruffy looking bits" so I could get all the possible variations.

Which left me with this

which ran like a slow thing in SlowTown -- it took a few minutes to scan 97 words beginning with "ab" and yield 9 words with a unique decomposition.

As my original take hadn't seemed to run anywhere near as sluggishly, I reverted all the Seq.choose and match to Array.filter and Option.isSome and got an order of magnitude speed-up.

So how much faster could I make it? I tried a lot of things -- everything as a seq, cache the sequence form of the words, adding a look-up table so given one five-letter word, I could look up a precomputed list of ones starting with its third letter, -- but the times all stuck around the 10 second mark for the "ab" words (or got worse, like when I cached the result of taking the five letter word from the 9-letter one).

The only thing that made any positive difference was replacing the immutable structures in the test for whether a word was a subset of another, like

That got about a factor of 3 improvement, and worked through all 10803 9-letter words and 4717 5-letter ones to yield 1942 candidates in about eight and a half minutes.

So I'm baffled about what is making the slow case so slow, but I'm making this note about it here in case I hit it again.

This was all done on my '07 vintage laptop using F#3.0, with the word list I got through Stack Overflow, for those who want to try out timings. A British word list, of course, for a puzzle in a British newspaper.

No comments :