String processing as a fold
I recently had occasion to ensure that text in XML/HTML containing non-ASCII (high-bit set) characters, but no control codes aside from line breaks, presented those characters as numeric character references. The obvious algorithm in C#, using a StringBuilder, sb, was
foreach( char ch in text )
{
    if ( ch < 127 )
    { sb.Append(ch); }
    else
    { sb.AppendFormat( "&#x{0:X4};", (int) ch ); } // the trailing ';' terminates the character reference
}
In F#, though, the obvious direct Seq.iter translation ends up needing to |> ignore the results of the append operations. Since this is really an accumulation into the StringBuilder, the better functional representation would be more like
let sb =
    Seq.fold
        (fun (b: StringBuilder) (c: char) ->
            let ic = int c
            if ic >= 127 then b.AppendFormat("&#x{0:X4};", ic)
            else b.Append(c))
        (StringBuilder(text.Length + extra)) // estimate the expansion up front
        text
which lets the StringBuilder flow naturally through the process, rather than closing over it and having to discard the value of the if expression. This could be done in C# too, along the lines of
var sb = text.Aggregate(new StringBuilder(), (b, c) =>
{
    if (c >= 127)
    { return b.AppendFormat("&#x{0:X4};", (int) c); } // cast: a char ignores the X4 format
    else
    { return b.Append(c); }
});
only here the returns have to be explicit.
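For comparison, here is a minimal sketch of the direct Seq.iter translation mentioned earlier, next to the fold. The wrapper names encodeWithIter and encodeWithFold are made up for illustration, and the capacity estimate is simplified to text.Length rather than anything tuned.

```fsharp
open System.Text

// Hypothetical wrapper: the imperative translation, closing over sb
// and discarding the StringBuilder returned by each append.
let encodeWithIter (text: string) =
    let sb = StringBuilder(text.Length)
    text
    |> Seq.iter (fun c ->
        let ic = int c
        (if ic >= 127 then sb.AppendFormat("&#x{0:X4};", ic) else sb.Append(c)) |> ignore)
    sb.ToString()

// Hypothetical wrapper: the fold, where the StringBuilder is the accumulator
// and the value of the if expression is what flows to the next step.
let encodeWithFold (text: string) =
    let sb =
        Seq.fold
            (fun (b: StringBuilder) (c: char) ->
                let ic = int c
                if ic >= 127 then b.AppendFormat("&#x{0:X4};", ic)
                else b.Append(c))
            (StringBuilder(text.Length)) // simplified capacity guess (assumption)
            text
    sb.ToString()

printfn "%s" (encodeWithIter "caf\u00E9") // caf&#x00E9;
printfn "%s" (encodeWithFold "caf\u00E9") // caf&#x00E9;
```

Both produce the same text; the difference is only whether the builder is captured state or the value being threaded through.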