Friday, April 15, 2011

Applying IEnumerable to chunked IEnumerable in F# -- Cipher Feedback mode

Having developed the supporting infrastructure to pull data from any IEnumerable into a sequence fixed length buckets, we can now apply it to some practical use. One such is to perform encryption of an input byte sequence in a block-cipher feeback (CFB) mode.

CFB mode is so simple, it almost feels like cheating. Given an initialization vector (IV) -- an arbitrary block of bytes -- as the 0-th ciphertext block, the iterative definition is that the n-th ciphertext block is the encrypted value of the n-1-th block XOR'd with the n-th plaintext block; and by the symmetry of XOR, the n-th plaintext block is recovered from the n-th ciphertext block by the same XOR.

So, let's implement it, and do a simple test that we can easily work out the cipherstream for -- and doesn't actually involve real encryption. So let's take 128-bit blocks, all zeros for the IV, and the identity transformation for our "encryption", and have the integers 0 to 41 as input.

The first ciphertext block will be all zeroes (IV encrypted) XOR'd with 0-15, i.e. 0-15. That XORs with 16-31 to give a second block of all 16s; the final block is 32-41 XORd with 16s, i.e. 48-57.

So, armed with expectations we can write the implementation with appropriate self-test code, thus --

where I've renamed the previous private Ratchet class to the public StreamingSequence type -- a more self-explanatory name for possible re-users; and exposed the Dispose method of the constant (reference, mutable internal state) IEnumerator for end-of-enumeration use.

Here, Seq.collect is being used to flatten the sequence after the block-level operation has been done.

CFB mode is so simple that -- having already done the hard work of breaking any sequence into blocks -- the only thing that stopped the test running through first time was forgetting to re-initialize the CFB operation for the decryption pass.

Actually, the decryption operation shouldn't be working -- but somehow it does, even when I push in random input, and replace the identity encryption operation with a one-way hash! Something weird is going on here.


Carsten said...

just some short comments.

If you have something like an uncurry operator (see for example you can make (^) much easier by replacing "(fun x -> (fst x ^^^ snd x))" with "(uncurry (^^^)".

I think it might be Fsharper ;) if you would use tail-recursion instead of ref-cells in CfbEncrypt.

Steve Gilham said...

The cfb operation could be cast as tail-recursive byte[] -> byte[]; but at the cost of accumulating all the input in memory and having to reverse it (or doing O(n^2) appends-to-end) within the function call.

Hiding state via closure inside the function returned from CfbEncrypt -- making it no longer a pure function -- preserves the pipeline behaviour : you can pull one block at a time through the process, requiring constant space whatever the sequence length.

Steve Gilham said...

That should be

"tail-recursive byte[] seq -> byte[] seq"

of course.