An introduction to Functional Programming in F# -- Part 3: Interpreters
Last time, we developed a parser for a simple LISP style language; now we develop from the abstract syntax tree the mechanism whereby we can actually evaluate expressions (programs) written in it.
Topics covered in the series
- Introduction to Functional Programming via circuits
- Lexing and Parsing
- Interpreters (this part)
- Simple ‘Functional’ Distributed/Concurrent Programming
- Non-Functional bits - Assignment, Exceptions, Objects, Libraries, etc.
- Advanced stuff - Continuations, Monads, Generative Programming (Language Based programming), etc.
A simple language
We define the behaviour of our language on top of the abstract syntax thus:
Expr = Example
(LET list_of_dotted_pairs EXPR) | (Let ((x . 1) (y. 2)) (Add x y))
(IF EXPR EXPR EXPR) | (IF (EQ x y) x (CONS x y))
(LAMBDA list_of_variables EXPR) | (LAMBDA (x y) (ADD x y 1))
(REC variable list_of_variables EXPR) | (REC F (N) (IF (EQ x 0) 1 (MULT N (F (SUB N 1)))))
(QUOTE EXPR) | (QUOTE (1 2 3))
(EVAL EXPR) | (EVAL (QUOTE (ADD 1 2)))
(EVAL EXPR environment_expression) | (EVAL (QUOTE (ADD X 1)) (QUOTE ((X . 2))))
(EXPR EXPR*) | (ADD (LENGTH L) (LENGTH J) 3)
SYMBOL | X
NUMBER 3
where the coloured background denotes special forms (keyword related constructs) in the language, and the rest are call-by-value expressions. The lower-case descriptive names represent the following expansions
list_of_dotted_pairs = ((EXPR . EXPR)*)
list_of_variables = ( SYMBOL*)
variable = SYMBOL
environment_expression = EXPR
In code, we extend the algebraic type we used last time with the notion of a function
in the form of the COMP
type; and modify the value of a SYMBOL
to be a string.
Aside: the value of a COMP
type is a record, a series of named fields gathered together. In this case there is one field, and that has a value that is a function mapping one SEXPR
to another.
The Structure of the Evaluator
In pseudocode terms, given an expression as above, an interpreter would evaluate it with a recursive function eval
that looks like this:
To eval
an expression e
in an environment (a construct where symbol definitions live) env
- if
e
isNIL
thenNIL
- else if
e
is a numbern
thenn
- else if
e
is aSYMBOL
then look it up in the environment - else if
e
is a special form then evaluate special form - else if
e
is of form(expr1 expr2 ... exprN)
then(eval expr1 env) (eval expr2 env) ... (eval exprN env)
- else
ERROR
The Structure of the Environment
We introduced the notion of an environment as where symbol definitions live; we express it as a look-up table from symbol to value like this (where we continue to use the list as our standard functional data-structure):
so that structurally the environment is an SEXPR which itself is a list of pairs of SYMBOL * SEXPR.
In code, the evaluator looks like this
Note: Unlike our previous examples this doesn't just paste straight into the interpreter and work, as it has dangling references. The full set of new source appears at the end of this post.
The function starts out with the call by value cases from our pseudo-code; their behaviour should be obvious, given a suitable definition of lookup
that walks the environment and returns the matching SEXPR
or a suitable default if not found.
The Special forms
Our special form IF
works like
where we have made the design decision that all values are "truthy" except NIL
. We have to match on our predicate because, as an implementation detail, we can't test functions for equality.
Our special form LET
introduces functions applySnd
which takes the list of dotted pairs represented as SEXPR * SEXPR
, and recursively evaluates the second of each pair, as in
and APPEND
which takes a list of SEXPR * SEXPR
and returns that list merges that with the input environment (another list of SEXPR * SEXPR
).
Having generated a new environment by merging all the evaluated bindings, i.e. (APPEND pairs e)
, it then evaluates the body of the expression in that new environment.
Our special form LAMBDA
behaves similarly, creating a function that when executed makes local bindings from the formal parameters to the corresponding values (ZIP
being the standard operation to map a pair of lists to a list of pairs), and uses that to define a new environment for evaluating the body.
The special form REC
returns a self-referencing computation, but is otherwise similar to the LAMBDA
form. Note that f
is not itself a function; this use emphasises that the F# rec
keyword merely brings a variable into scope inside its own definition. so the definition can be recursive.
The special form QUOTE
returns the input expression unevaluated.
The "Extras", or "code is data is code"
The evaluator provides access to the environment from within the language; symbol ENV
yields the environment as a list of dotted pairs.
The symbol EVAL
with one argument evaluates an expression within the current environment for the interpreter, and with two arguments, allows a different environment to be specified. This is how data becomes code -- it is able to cause itself self to execute.
Remaining cases -- Functions and errors
Function application is modelled by taking a fully evaluated list, whose first element is expected to be a COMPUTATION
and applying it to the remainder of the list of arguments.
In the code
MAP
applies the evaluator to each member of the list b
to achieve this fully evaluated state; and apply
looks like
Built-in functions are represented in this model by providing a standard environment with computations defined.
Anything else that is superficially syntactically valid but matches none of the expected forms results in an error.
Exercises
- Modify the parser from last time to work with the new definition of
SEXPR
- Add Boolean operators
AND, OR, NOT, EQUIV, IMPLIES
to evaluator - Add built in functions to map between SYMBOLS and List of Characters and back again.
- Add
QUASIQUOTE (QQ), UNQUOTE
with the meaning that(QQ (ADD (UNQUOTE X) (UNQUOTE Y) 23))
in an environment whereX
andY
are123
andZ
respectively evaluates to the S-expression(ADD 123 Z 23)
- (Advanced): Change the evaluator to allow deal with interpreted code in the environment e.g.
F
is bound to(INTERPRET E)
inenv
and(F X)
is(eval E env)(eval X env)
. Alternatively extend the type of SEXPR to include anINTERPRETED
constructor and do the same - (Advanced) Change the evaluator to allow special forms to be placed in the environment.
Putting it all together
I haven't copied over the parser from last time (that's the first exercise); but all the missing new bits are here.
No comments :
Post a Comment