Wednesday, December 18, 2013

An introduction to Functional Programming in F# -- Part 3: Interpreters

Last time, we developed a parser for a simple LISP style language; now we develop from the abstract syntax tree the mechanism whereby we can actually evaluate expressions (programs) written in it.

Topics covered in the series


A simple language

We define the behaviour of our language on top of the abstract syntax thus:


Expr =Example
(LET list_of_dotted_pairs EXPR) |(Let ((x . 1) (y. 2)) (Add x y))
(IF EXPR EXPR EXPR) |(IF (EQ x y) x (CONS x y))
(LAMBDA list_of_variables EXPR) |(LAMBDA (x y) (ADD x y 1))
(REC variable list_of_variables EXPR) |(REC F (N) (IF (EQ x 0) 1 (MULT N (F (SUB N 1)))))
(QUOTE EXPR) |(QUOTE (1 2 3))
(EVAL EXPR) |(EVAL (QUOTE (ADD 1 2)))
(EVAL EXPR environment_expression) |(EVAL (QUOTE (ADD X 1)) (QUOTE ((X . 2))))
(EXPR EXPR*) |(ADD (LENGTH L) (LENGTH J) 3)
SYMBOL |X
NUMBER 3

where the coloured background denotes special forms (keyword related constructs) in the language, and the rest are call-by-value expressions. The lower-case descriptive names represent the following expansions

list_of_dotted_pairs = ((EXPR . EXPR)*)
list_of_variables = ( SYMBOL*)
variable = SYMBOL
environment_expression = EXPR

In code, we extend the algebraic type we used last time with the notion of a function

in the form of the COMP type; and modify the value of a SYMBOL to be a string.

Aside: the value of a COMP type is a record, a series of named fields gathered together. In this case there is one field, and that has a value that is a function mapping one SEXPR to another.

The Structure of the Evaluator

In pseudocode terms, given an expression as above, an interpreter would evaluate it with a recursive function eval that looks like this:

To eval an expression e in an environment (a construct where symbol definitions live) env

  • if e is NIL then NIL
  • else if e is a number n then n
  • else if e is a SYMBOL then look it up in the environment
  • else if e is a special form then evaluate special form
  • else if e is of form (expr1 expr2 ... exprN) then (eval expr1 env) (eval expr2 env) ... (eval exprN env)
  • else ERROR

The Structure of the Environment

We introduced the notion of an environment as where symbol definitions live; we express it as a look-up table from symbol to value like this (where we continue to use the list as our standard functional data-structure):

so that structurally the environment is an SEXPR which itself is a list of pairs of SYMBOL * SEXPR.

In code, the evaluator looks like this


Note: Unlike our previous examples this doesn't just paste straight into the interpreter and work, as it has dangling references. The full set of new source appears at the end of this post.

The function starts out with the call by value cases from our pseudo-code; their behaviour should be obvious, given a suitable definition of lookup that walks the environment and returns the matching SEXPR or a suitable default if not found.


The Special forms

Our special form IF works like

where we have made the design decision that all values are "truthy" except NIL. We have to match on our predicate because, as an implementation detail, we can't test functions for equality.

Our special form LET introduces functions applySnd which takes the list of dotted pairs represented as SEXPR * SEXPR, and recursively evaluates the second of each pair, as in

and APPEND which takes a list of SEXPR * SEXPR and returns that list merges that with the input environment (another list of SEXPR * SEXPR).

Having generated a new environment by merging all the evaluated bindings, i.e. (APPEND pairs e), it then evaluates the body of the expression in that new environment.

Our special form LAMBDA behaves similarly, creating a function that when executed makes local bindings from the formal parameters to the corresponding values (ZIP being the standard operation to map a pair of lists to a list of pairs), and uses that to define a new environment for evaluating the body.

The special form REC returns a self-referencing computation, but is otherwise similar to the LAMBDA form. Note that f is not itself a function; this use emphasises that the F# rec keyword merely brings a variable into scope inside its own definition. so the definition can be recursive.

The special form QUOTE returns the input expression unevaluated.


The "Extras", or "code is data is code"

The evaluator provides access to the environment from within the language; symbol ENV yields the environment as a list of dotted pairs.

The symbol EVAL with one argument evaluates an expression within the current environment for the interpreter, and with two arguments, allows a different environment to be specified. This is how data becomes code -- it is able to cause itself self to execute.


Remaining cases -- Functions and errors

Function application is modelled by taking a fully evaluated list, whose first element is expected to be a COMPUTATION and applying it to the remainder of the list of arguments.

In the code

MAP applies the evaluator to each member of the list b to achieve this fully evaluated state; and apply looks like


Built-in functions are represented in this model by providing a standard environment with computations defined.

Anything else that is superficially syntactically valid but matches none of the expected forms results in an error.


Exercises

  • Modify the parser from last time to work with the new definition of SEXPR
  • Add Boolean operators AND, OR, NOT, EQUIV, IMPLIES to evaluator
  • Add built in functions to map between SYMBOLS and List of Characters and back again.
  • Add QUASIQUOTE (QQ), UNQUOTE with the meaning that (QQ (ADD (UNQUOTE X) (UNQUOTE Y) 23)) in an environment where X and Y are 123 and Z respectively evaluates to the S-expression (ADD 123 Z 23)
  • (Advanced): Change the evaluator to allow deal with interpreted code in the environment e.g. F is bound to (INTERPRET E) in env and (F X) is (eval E env)(eval X env). Alternatively extend the type of SEXPR to include an INTERPRETED constructor and do the same
  • (Advanced) Change the evaluator to allow special forms to be placed in the environment.

Putting it all together

I haven't copied over the parser from last time (that's the first exercise); but all the missing new bits are here.

No comments :