String processing as a fold
I recently needed to ensure that non-ASCII (high-bit set) characters in some XML/HTML text, which contained no control codes aside from line breaks, were presented as character references. The obvious algorithm in C#, using a StringBuilder, sb, was
foreach (char ch in text)
{
    if (ch < 127) { sb.Append(ch); }
    else { sb.AppendFormat("&#x{0:X4};", (int) ch); } // note the closing ';' of the character reference
}
In F#, though, the obvious direct translation using Seq.iter ends up needing to |> ignore the results of the append operations. Since this is really an accumulation into the StringBuilder, the better functional representation is a fold, more like
let sb =
    Seq.fold
        (fun (b: StringBuilder) (c: char) ->
            let ic = int c
            if ic >= 127 then b.AppendFormat("&#x{0:X4};", ic) else b.Append(c))
        (StringBuilder(text.Length + extra)) // estimate the expansion up front
        text
which lets the StringBuilder flow naturally through the process, rather than closing over it and having to discard the value of the if expression. This could be done in C#, too, along the lines of
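var sb = text.Aggregate(new StringBuilder(), (b, c) =>
{
    if (c >= 127) { return b.AppendFormat("&#x{0:X4};", (int) c); }
    else { return b.Append(c); }
});
only here the lambda needs a block body, the returns have to be explicit, and c is cast to int so that the X4 format applies to the character code rather than to the character itself.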
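For comparison, the direct Seq.iter translation mentioned above might look something like the following. This is only a sketch: the function name escapeWithIter is made up for illustration, and the StringBuilder is sized to the input length with no attempt to estimate the expansion.

```fsharp
open System.Text

// Hypothetical helper: closes over sb and has to throw away
// the StringBuilder that each append call returns.
let escapeWithIter (text: string) =
    let sb = StringBuilder(text.Length)
    text
    |> Seq.iter (fun c ->
        let ic = int c
        // Both branches yield a StringBuilder, so the whole if expression is ignored.
        (if ic >= 127 then sb.AppendFormat("&#x{0:X4};", ic) else sb.Append(c))
        |> ignore)
    sb.ToString()
```

With this, escapeWithIter "café" should give "caf&#x00E9;", the same text the fold produces; the difference is purely that the builder is captured by the closure rather than threaded through as the accumulator.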