Thursday, May 14, 2015

F# under the covers XV -- functions, types, and what you see isn't what you get

Consider this innocent example


When run in a new interactive session, it yields the following (after trimming the SOAP baggage, and noting that _x002B_ is the encoding of + and _x0040_ that of @):

FSI_0002+clo@20
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_clo_x0040_20 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_clo_x0040_20>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+clo@20-1
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_clo_x0040_20-1 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_clo_x0040_20-1>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+add2@11
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_add2_x0040_11 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_add2_x0040_11>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+clo@20-2
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_clo_x0040_20-2 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_clo_x0040_20-2>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

-------------------
FSI_0002+it@25
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_it_x0040_25 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_it_x0040_25>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+it@25-1
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_it_x0040_25-1 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_it_x0040_25-1>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+add2@11
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_add2_x0040_11 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_add2_x0040_11>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

FSI_0002+it@25-2
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:FSI_0002_x002B_it_x0040_25-2 id="ref-1" xmlns:a1=...>
</a1:FSI_0002_x002B_it_x0040_25-2>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>


val mhead : m:System.Object list -> System.Object list
val addN : n:int -> x:int -> int
val add2 : (int -> int)
val emitString : x:'a -> unit
val it : unit = ()

Here we observe that, of the names in the user's input, only add2 survives as part of a serialized type name; and that the second instance of what looked like the same function is actually an instance of a different type.
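To read the escaped element names in the output above without decoding them by eye, something like the following sketch should do. It assumes the SOAP element names use the ordinary _xHHHH_ XML name escaping, which is what System.Xml.XmlConvert.DecodeName undoes; decodeTypeName is a hypothetical helper name, not part of the original example.

```fsharp
// On older .NET Framework FSI sessions a `#r "System.Xml"` line may be needed first.
open System.Xml

// Hypothetical helper: undo the _xHHHH_ escaping of an XML element name.
let decodeTypeName (escaped : string) : string = XmlConvert.DecodeName escaped

// Element name copied from the first envelope in the output above.
let decoded = decodeTypeName "FSI_0002_x002B_clo_x0040_20"
printfn "%s" decoded   // prints FSI_0002+clo@20
```

The decoded string matches the type name printed just before each envelope, which is a quick check that the two lines describe the same hidden class.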

Running the same code as a compiled .exe, we get:

Program+clo@20
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_20 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_20>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+clo@20-1
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_20-1 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_20-1>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+add2@11
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_add2_x0040_11 id="ref-1" xmlns:a1=...>
<n>2</n>
</a1:Program_x002B_add2_x0040_11>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+clo@20-2
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_20-2 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_20-2>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

-------------------
Program+clo@25-4
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_25-4 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_25-4>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+clo@25-5
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_25-5 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_25-5>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+add2@11
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_add2_x0040_11 id="ref-1" xmlns:a1=...>
<n>2</n>
</a1:Program_x002B_add2_x0040_11>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Program+clo@25-6
<SOAP-ENV:...>
<SOAP-ENV:Body>
<a1:Program_x002B_clo_x0040_25-6 id="ref-1" xmlns:a1=...>
</a1:Program_x002B_clo_x0040_25-6>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Here the names are similar -- but now the closure add2 actually carries that 2 as part of its serialized body (the <n>2</n> element).

So, what's going on? Well, let's look at what we get when we decompile.


This is the debug version; the release version just moves the main program locals to be static members of the Program class, with no other relevant changes.

This shows us that our named functions are compiled as normal functions, as we would expect in C#, but the first-class function objects we pass around are actually indirections to those functions. There is a different one at each site, even when they close over nothing and so ought to be identical, and this remains true in the release build.
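The per-site behaviour can be checked without any serializer by asking each function value for its runtime type. This is only a minimal sketch: addN and add2 follow the signatures printed earlier, but the body of addN is assumed, and increment and runtimeTypeName are hypothetical names introduced for illustration.

```fsharp
// Names taken from the signatures in the post; the body of addN is an assumption.
let addN (n : int) (x : int) : int = x + n
let add2 : int -> int = addN 2

// Hypothetical helpers, not from the original listing.
let increment (x : int) : int = x + 1
let runtimeTypeName (fn : int -> int) : string = (box fn).GetType().FullName

// A named function used as a value gets wrapped in a closure object at each use site,
// so these two names would be expected to differ.
let firstSite = runtimeTypeName increment
let secondSite = runtimeTypeName increment

// add2 is already a function value, created once, so both reads see the same object.
let add2First = runtimeTypeName add2
let add2Second = runtimeTypeName add2

printfn "%s\n%s" firstSite secondSite
printfn "%s\n%s" add2First add2Second
printfn "increment sites share a type: %b" (firstSite = secondSite)
printfn "add2 sites share a type: %b" (add2First = add2Second)
```

The exact generated names depend on the compiler version, the enclosing module and the source line numbers, so they are best treated as something to observe rather than rely on.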

So, what does this mean?

Well, for one thing, serialization methods that need to know their types up front, like DataContract, aren't going to work well with functions (including lambdas inside methods of innocent-looking types), because the types are unique to each site where a function value is created. And deserialization is only going to be possible in the same binary that the serialization came from (or at least one that has the same functions on the same types, methods and line numbers), unless you do a fair bit of decompilation to provide a translation binding.

The latter is a potential pitfall -- if you implement a Windows Workflow in F#, and have objects persisted, things could work just fine within the first working version of your program; but processes that are hibernating mid-task while you're rolling out a new version can break when they wake because these secret types are no longer there by the same names.

