Chai – Functions with Applications

As usual, we invite the reader to follow along with the source code from this blog’s Google Code page, and to test programs with the interactive online interpreter.

One interpretation of a "Chai function."

Reserved Words, Values, and Ifs

In our last version of Chai we noticed a few things that needed fixing, so we’ll address those before moving on. First, consider the following chai program:

(with true false (or true true))

The user, being as odd as users will be, wants to bind the value “false” to the variable whose name is “true,” in a dirty attempt to trick somebody. The last version of Chai (see, Chai – Environments and Variables) interpreted this expression as “true,” and rightfully so in this mathematician’s opinion, since truth is not something which can be changed!

On the other hand, something just doesn’t feel right about this program. We can either allow such programs to be written (and wag a stern finger at all those who do!), or forbid the program from continuing once we find such a transgression. The latter makes more sense, and it introduces reserved words into Chai. As of the end of this post, our list of reserved words contains:

true false with if end pair fun

And we will add to this list as necessary. It’s ambiguous whether we should include things like +, -, *, /, etc. Someone may want to redefine what “plus” means, and that might not be an awful thing, say, once every two thousand years. For now, we will leave it alone. At least, we may stop someone from writing the appalling program:

(with with 1 (with with (with with with with) with))

All this requires to our code is that during parsing, whenever we inspect a variable binding (or list of variable bindings, or list of function arguments), we must ensure none are reserved words. This is straightforward enough, and the interested reader may inspect the source code.

Next, we realized ahead of time the need to discriminate between the input to our interpreter and its output. This is a direct result of our desire to add support for functions and basic lists. Now we have a new EBNF for “Chai values”:

chai-value = number
           | boolean
           | string
           | (pair chai-value chai-value)
           | end

One might notice that we have added strings, pairs (lists are nested series of pairs), and end, a null-value, named so to signify the terminating value of a list. In addition, we’ve added primitives for a whole host of operations, including: =, !=, <=, >=, <, >, pair (pair construction), end (list termination), first and rest (pair extraction), and end? (testing if a list is empty). We will certainly add more in the future, but for now this is enough.

In Racket code, we have the following new data type:

(define-type chai-value
  [num-v (val number?)]
  [bool-v (val boolean?)]
  [str-v (val string?)]
  [pair-v (head chai-value?) (tail chai-value?)]
  [end-v])

The new discrimination between Chai expressions and Chai values affects every branch of our interpreter and in every primitive operation, but only in how we accept and package up our data types. The interested reader can check out the changes in the source code; we will not discuss them here. (But we will mention that none of the changes occur in the type-checker. Nice.)

Finally, we’ve added “if” expressions for control flow. In the EBNF, an if looks like:

expr = ...
     | (if expr expr expr)

Where the first expression is the “test” expression, and the second is the “then” expression, and the third is the “else” expression. This translates directly to Racket as:

(define-type chai-expr
   ...
   [if? (test chai-expr?) (then-branch chai-expr?) (else-branch chai-expr?)])

Because the symbol ‘if’ conflicts with the Racket language form, we rename it ‘if?’; note that in Chai programs, the language form is still “if”. The parser is straightforward for if expressions, so we’ll jump straight to the interpreter. The new clause is:

(define (interp input-expr env)
  (type-case chai-expr input-expr
    ...
    [if? (test-expr then-branch else-branch)
         (let ([test-result (interp test-expr env)])
           (type-case chai-value test-result
             [bool-v (test-value) (interp (if test-value then-branch else-branch) env)]
             [else (error 'interp "test expression was not a boolean: ~a"
                          (chai-expr->string test-expr))]))]))

Note that we need to ensure the result of evaluating the test-expression is a boolean, and we ignore the unevaluated branch. For instance, the following program will run in Chai:

(if true 7 (/ 1 0))

even when the else branch will throw an error as a standalone program.

Functions

Finally, we’ve reached a very important milestone in Chai: figuring out what to do with functions. Let’s start with the basic syntax. In EBNF, a function has the form

expr = ...
     | (fun (vars ...) expr)

Where “vars …” indicates the presence of zero or more identifiers representing input variables, which are collectively referred to as “vars”. For instance, here is a Chai program which evaluates to the identity function:

(fun (x) x)

In Chai, we will treat functions the same way we treat any other primitive value. One can pass them around as if they were a number. For instance, here is a function which accepts one argument x, and returns a function which accepts one argument y and adds x to y:

(fun (x) (fun (y) (+ x y)))

Applying this function to a number returns a new function. In some languages, functions are required to be defined “at the top level,” i.e., given special names and put in special places. Since Chai expressions still consist of a single expression, we don’t have anywhere else to put them. So if one wants to “name” a function one can, but as it is most functions will be “anonymous.”

Speaking of function applications, they look like:

expr = ...
     | (expr expr ...)

Where the first expression should result in a function, and the remaining zero or more results should result in arguments to the function. Note that syntactically, these “shoulds” are irrelevant.

So here is a program which applies the previous example, which we name “make+=”, to some values:

(with make+= (fun (x) (fun (y) (+ x y)))
   (with +=7 (make+= 7)
     (pair (+=7 9) (pair (+=7 14) end))))

The advanced readers will recognize this as an explicit form of “currying” (accepting a partial list of arguments). We may in the future decide to have Chai support implicit currying, so that we may rewrite the above program as something like:

(with +=7 (+ 7)
  (pair (+=7 9) (pair (+=7 14) end)))

But for now let’s focus on the implementation.

The Parser

In the parser, our new branches look like:

[(and (equal? 'fun first-thing)
      (equal? 3 list-len)
      (list? (second input-expr)))
 (let ([vars (check-vars (second input-expr))]
       [body (parse-expr (third input-expr))])
   (function vars body))]
[else
 (app (parse-expr first-thing)
      (map parse-expr (rest input-expr)))]))] 

To elaborate, these two branches are wrapped within the condition that the expression is a list. If the first element of the list is not a reserved keyword like “with”, “if”, or “fun”, we assume the expression is a function application. Indeed, we could have some weird expressions that evaluate to functions, so we can’t discriminate any further here.

For functions, we have to ensure some things about the list of input variables. For instance, we can’t allow the user to repeat the same variable name twice, or use any reserved words, so we encapsulate both of those checks into a function called “check-vars”:

;; check-vars: listof symbol -> listof symbol
;; check a list of variable names for invalidity
(define (check-vars names)
  (begin
    (map (λ (name) (when (reserved? name) (parse:bad-identifier name))) names)
    (when (has-duplicates? names) (error 'parse "list of variables has duplicates: ~a" names))
    names))

We leave the implementation of has-duplicates? as an exercise to the reader (with the solution in chai-utils.rkt), and reserved? is just a membership test in a list of symbols (in chai-ast.rkt).

While it has its details, the parser is the easy part of implementing functions. We didn’t actually have to make any important decisions. We’ll see the semantics are a bit more complicated.

Scope, and Closures

In general, the scope of a variable x refers to the places in the program where we can inspect the value of x. For instance, in the following program x is only in scope within the “with” expression.

(with y (+ x y)
  (with x 7) y)

So, in fact, the reference to x before the nested with statement is invalid, and this program will raise an error upon evaluation. Note that y is in scope in the nested with. Recalling how our interpreter works, this is obvious, because we only augment the environment with new variables after encountering their definition in a with.

To illustrate the confusion arising in discussions about scope, we have to investigate functions. In particular, the definition of a function can refer to a variable defined somewhere else, but the body of the function isn’t evaluated until it’s actually called. Since functions can be passed around like numbers, it raises ambiguity as to which variable we should refer to during a function application.

For instance, we could write the following program and ask what it should do:

(with my-function (with x 33
                     (fun (y) (+ x y)))
   (with x 44 (my-function 55)))

Should this program evaluate to 88 or 99? Before continuing, make a guess as to what will happen (and try to argue why your choice is the best option), and verify it by running the program through our interactive interpreter.

In fact, there is no completely “correct” answer to this question, and it highlights the differences between lexical and dynamic scope. In the latter, a variable will always refer to the most recent binding which is in scope. In other words, if Chai had dynamic binding, the program above would evaluate to 99. On the other hand, if we implemented lexical scope, “my-function” would retain the information about what variables were in scope during its definition, and so the resulting call would evaluate to 88.

The engaged reader will have by now verified that Chai has lexical scope, and not dynamic scope (in the future, we may provide an alternative version of Chai which has dynamic scope). Most contemporary programming languages implement lexical scope, because it just so happens that otherwise the logic flow in a program is harder to determine just by reading it. However, some popular languages originally had dynamic scope, and later decided to tack on support for lexical scoping. Sometimes lexical scope is referred to as static scope.

In order to support lexical scope in Chai, we have to now discriminate between a function definition, and the function definition combined with the variables that are in scope during definition. In other words, we need the result of interpreting a function definition to contain both pieces of information. In particular, we call this piece of data a closure.

Now the chai-value datatype has an additional branch:

chai-value = number
           | boolean
           | string
           | (pair chai-value chai-value)
           | end
           | (closure environment (vars...) chai-expr)

Where the environment is our datatype for variable bindings, the vars… is the same list of function variables from our function definitions, and the chai-expr is the body expression, in unevaluated form. (Indeed, we can’t evaluate the body until it’s applied to some arguments!)

So this adds one more branch to our type-case:

(define-type chai-value
   ...
   [closure-v (env environment?) (vars (listof symbol?)) (body chai-expr?)])

Now we are ready to interpret functions, and see what we can do with them.

The Interpreter

In fact, interpreting function definitions is easy! Since we have already been keeping track of the environment all along, we can package it up in a closure in a single line:

(define (interp input-expr env)
  (type-case chai-expr input-expr
     ...
    [function (vars body) (closure-v env vars body)]))

Applying a function is quite a bit more work. In particular, we have two things we need to check before believing that the user knows how to write programs:

  • Is the first thing in a function application a function? (Does it evaluate to a closure?)
  • Does the number of provided arguments match the number of accepted arguments?

If neither of these are the case, we need to raise an error. Assuming these requirements are met, we can perform the actual function application as follows:

(let* ([interped-args (map (λ (arg) (interp arg env)) args)]
       [new-env (foldl add-binding closure-env vars interped-args)])
  (interp body new-env))

where args, vars, closure-env, and body are all the expected parts of our function application and closure. Note that we are very specific to augment the closure environment with the new bindings for the function arguments. This is the key of lexical scope, whereas dynamic scope would use the standard environment.

With all of the case-checking, the branch in our interpreter is a bit longer:

(define (interp input-expr env)
  (type-case chai-expr input-expr
    ...
   [app (clo args)
         (let ([expected-clo (interp clo env)])
           (type-case chai-value expected-clo
              [closure-v (closure-env vars body)
                         (let ([arg-count (length args)]
                               [var-count (length vars)])
                           (when (not (eq? arg-count var-count))
                             (error 'interp "expected ~a args for a function application, but got ~a" var-count arg-count))
                           (let* ([interped-args (map (λ (arg) (interp arg env)) args)]
                                  [new-env (foldl add-binding closure-env vars interped-args)])
                             (interp body new-env)))]
              [else (error 'interp "invalid function in function application: ~a" (chai-expr->string expected-clo))]))]))

Try to follow the logic above. Even for the author, determining what someone else’s code does is a nontrivial process, so don’t worry if it’s overwhelming at first. Just remember that most of it is error-checking, and the key part is the three lines where we actually perform the function application. Of course, to get an even better understanding, try to run programs in our interactive interpreter which hit every error case, and compare that with the flow of logic in the code above.

Recursion, But Not Quite Easy Enough

The experienced reader might be asking herself “Why haven’t we implemented any loops yet? We can’t really do anything interesting without loops.” And she is correct! A programming language cannot be Turing-complete (informally, have the true power of a computer) if it can’t branch out, or make programs which run indefinitely.

As of now, we couldn’t, say, write the function map (see our functional programming primer, A Taste of Racket, for a definition). If we try, we get a nasty error:

(with map (fun (f list)
            (if (end? list)
                end
                (pair (f (first list)) (map f (rest list)))))
  (map (fun (x) (+ 1 x)) (pair 7 (pair 6 (pair 5 end)))))

It complains that “map” is not bound in the current environment, which is a shame, because map is a really great function to have around, and we can’t build it yet (well, that’s not quite true, but it wouldn’t be as straightforward as what we have above).

It turns out that even though we can’t quite do this kind of recursion with names, we can still achieve infinite loops in Chai! We challenge the reader to figure out how to do it, and see what our interactive interpreter does in response. As a hint, the shortest solution the author has found is a mere 33 characters long, including spaces and parentheses, and it only requires functions and applications.

Some Further Issues to Address

We recall saying we would like the user to be able to redefine what “+” means via a with expression. However, upon trying to evaluate the following program which “adds” two functions, we get an error:

(with + (fun (f g) (fun (arg) (+ (f arg) (g arg))))
  ((+ (fun (x) (+ x 1)) (fun (y) (* y  2))) 7))

In order to fix this, we need to stop distinguishing between primitive operations and functions which are bound to symbols in our environment. In other words, we want to pre-populate the environment with bindings for primitive functions, which can be overwritten by the programmer. This will simplify our code significantly, since we can move even more of the logic outside of the interpreter and parser, and think of everything as a function application. This is something we have to look forward to in the future.

As long as we’re thinking of everything as a function application, why don’t we simplify Chai even further so that we only have functions and applications? No numbers, no primitives, no conditionals. What sort of language would we get? We’ll see just what happens with a little intermezzo next time. Of course, we will only do this temporarily to explore its theoretical properties, and then we’ll return to the civilized world.

Until then!

A Taste of Racket

or, How I Learned to Love Functional Programming

We recognize that not every reader has an appreciation for functional programming. Yet here on this blog, we’ve done most of our work in languages teeming with functional paradigms. It’s time for us to take a stand and shout from the digital mountaintops, “I love functional programming!” In fact, functional programming was part of this author’s inspiration for Math ∩ Programming.

And so, to help the reader discover the joys of functional programming, we present an introduction to programming in Racket, with a focus on why functional programming is amazing, and a functional solution to a problem on Project Euler. As usual, the entire source code for the examples in this post is available on this blog’s Github page.

Lists

Lists are the datatype of choice for functional programmers. In particular, a list is either the empty list or a pair of objects:

list = empty
     | (x list)

Here () denotes a pair, the first thing in the pair, “x”, is any object, and the second thing is another list. In Racket, all lists have this form, and they are constructed with a special function called “cons”. For instance, the following program outputs the list containing 7 and 8, in that order.

(cons 7 (cons 8 empty))

The reader will soon get used to the syntax: every function application looks like (function arguments…), including the arithmetic operators.

This paradigm is familiar; in imperative (“normal”) programming, this is called a linked list, and is generally perceived as slower than lists based on array structures. We will see why shortly, but in general this is only true for some uncommon operations.

Of course, Racket has a shorthand for constructing lists, which doesn’t require one to write “cons” a hundred times:

(list 7 9)

gives us the same list as before, without the nested parentheses. Now, if we wanted to add a single element to the front of this list, “cons” is still the best tool for the job. If the variable “my-list” is bound to a list, we would call

(cons 6 my-list)

to add the new element. For lists of things which are just numbers, symbols, or strings , there is an additional shorthand, using the “quote” macro:

'(1 2 3 4 "hello" my-symbol!)

Note that Racket does not require such lists are homogeneous, and it automatically converts the proper types for us.

To access the elements of a list, we only need two functions: first and rest. The “first” function accepts a list and returns the first element in it, while “rest” returns the tail of the list excluding the first element. This naturally fits with the “cons” structure of a list. For instance, if “my-list” is a variable containing (cons 8 (cons 9 empty)), then first and rest act as follows:

> (first my-list)
8
> (rest my-list)
'(9) 
> (first (rest my-list))
9

In particular, we can access any element of a list with sufficiently many calls to first and rest. But for most problems this is unnecessary. We are about to discover far more elegant methods for working with lists. This is where functional programming truly shines.

Double-List, Triple-List, and Sub-List

Suppose we want to take a list of numbers and double each number. If we just have what we know now about lists, we can write a function to do this. The general function definition looks like:

(define (function-name arg1 arg2 ...)
body-expression)

To be completely clear, the return value of a Racket function is whatever the body expression evaluates to, and we are allowed to recursively call the function. Indeed, this is the only way we will ever loop in Racket (although it has some looping constructs, we frown upon their use).

And so the definition for doubling a list is naturally:

(define (double-list my-list)
  (if (empty? my-list) empty
      (cons (* 2 (first my-list))
            (double-list (rest my-list)))))

In words, we have two cases: if my-list is empty, then there is nothing to double, so we return a new empty list (well, all empty lists are equal, so it’s the same empty list). This is often referred to as the “base case.” If my-list is nonempty, we construct a new list by doubling the first element, and then recursively  doubling the remaining list. Eventually this algorithm will hit the end of the list, and ultimately it will return a new list with each element of my-list doubled.

Of course, the mathematicians will immediately recognize induction at work. If the program handles the base case and the induction step correctly, then it will correctly operate on a list of any length!

Indeed, we may test double-list:

> (double-list empty)
'()
> (double-list '(1 2 3 5))
'(2 4 6 10)

And we are confident that it works. Now say we wanted to instead triple the elements of the list. We could rewrite this function with but a single change:

(define (triple-list my-list)
  (if (empty? my-list) empty
      (cons (* 3 (first my-list))
            (triple-list (rest my-list)))))

It’s painfully obvious that coding up two such functions is a waste of time. In fact, at 136 characters, I’m repeating more than 93% of the code (eight characters changing “doub” to “trip” and one character changing 2 to 3). What a waste! We should instead refactor this code to accept a new argument: the number we want to multiply by.

(define (multiply-list my-list n)
  (if (empty? my-list) empty
      (cons (* n (first my-list))
            (multiply-list (rest my-list) n))))

This is much better, but now instead of multiplying the elements of our list by some fixed number, we want to subtract 7 from each element (arbitrary, I know, but we’re going somewhere). Now I have to write a whole new function to subtract things!

(define (sub-list my-list n)
  (if (empty? my-list) empty
      (cons (- (first my-list) n)
            (sub-list (rest my-list) n))))

Of course, we have the insight to make it generic and accept any subtraction argument, but this is the problem we tried to avoid by writing multiply-list! We obviously need to step things up a notch.

Map

In all of this work above, we’ve only been changing one thing: the operation applied to each element of the list. Let’s create a new function, which accepts as input a list and a function which operates on each element of the list. This special operation is called map, and it is only possible to make because Racket treats functions like any other kind of value (they can be passed as arguments to functions, and returned as values; functions are first class objects).

The implementation of map should look very familiar by now:

(define (map f my-list)
  (if (empty? my-list) empty
      (cons (f (first my-list))
            (map f (rest my-list)))))

In particular, we may now define all of our old functions in terms of map! For instance,

(define (double x) (* 2 x))
(define (double-list2 my-list) (map double my-list))

Or, even better, we may take advantage of Racket’s anonymous functions, which are also called “lambda expressions,” to implement the body of double-list in a single line:

(map (λ (x) (* 2 x)) my-list)

Here the λ character has special binding in the Dr. Racket programming environment (Alt-\), and one can alternatively write the string “lambda” in its place.

With map we have opened quite a large door. Given any function at all, we may extend that function to operate on a list of values using map. Consider the imperative equivalent:

for (i = 0; i < list.length; i++):
   list.set(i, f(list.get(i)))

Every time we want to loop over a list, we have to deal with all of this indexing nonsense (not to mention the extra code needed to make a new list if we don’t want to mutate the existing list, and that iterator shortcuts prohibit mutation). And the Racket haters will have to concede, the imperative method has just as many parentheses 🙂

Of course, we must note that map always creates a new list. In fact, in languages that are so-called “purely” functional, it is impossible to change the value of a variable (i.e., there is no such thing as mutation). The advantages and disadvantages of this approach are beyond the scope of this post, but we will likely cover them in the future.

Fold and Filter

Of course, map is just one kind of loop we might want. For instance, what if we wanted to sum up all of the numbers in a list, or pick out the positive ones? This is where fold and filter come into play.

Fold is a function which reduces a list to a single value. To do this, it accepts a list, an initial value, and a function which accepts precisely two arguments and outputs a single value. It then uses this to combine the elements of a list. It’s implementation is not that different from map:

(define (fold f val my-list)
  (if (empty? my-list) val
      (fold f
            (f val (first my-list))
            (rest my-list))))

Here the base case is to simply return “val”, while the induction step is to combine “val” with the first element of the list using “f”, and then recursively apply fold to the remainder of the list. Now we may implement our desired summing function as

(define (sum my-list) (fold + 0 my-list))

And similarly, make a multiplying function:

(define (prod my-list) (fold * 1 my-list))

Notice now that we’ve extracted the essential pieces of the problem: what operation to apply, and what the base value is. In fact, this is the only relevant information to the summing and multiplying functions. In other words, we couldn’t possibly make our code any simpler!

Finally, filter is a function which selects specific values from a list. It accepts a selection function, which accepts one argument and returns true or false, and the list to select from. It’s implementation is again straightforward induction:

(define (filter select? my-list)
  (if (empty? my-list) empty
      (let ([elt (first my-list)]
            [the-rest (rest my-list)])
        (if (select? elt)
            (cons elt (filter select? the-rest))
            (filter select? the-rest)))))

To avoid superfluous calls to “first” and “rest”, we use Racket’s “let” form to bind some variables. Logically, the base case is again to return an empty list, while the inductive step depends on the result of applying “select?” to the first element in our list. If the result is true, we include it in the resulting list, recursively calling filter to look for other elements. If the result is false, we simply skip it, recursively calling filter to continue our search.

Now, to find the positive numbers in a list, we may simply use filter:

(filter (λ (x) (> x 0)) my-list)

Again, the only relevant pieces of this algorithm are the selection function and the list to search through.

There is one important variant of fold, in particular, the function we’re using to fold may depend on the order in which it’s applied to elements of the list. We might require that folding begin at the end of a list instead of the beginning. Fold functions are usually distinguished as left- or right-folds. Of course, Racket has map, fold, and filter built in, but the fold functions are renamed “foldl” and “foldr”. We have implemented foldl, and we leave foldr as an exercise to the reader.

A Brighter, More Productive World

Any loop we ever want to implement can be done with the appropriate calls to map, fold, and filter. We will illustrate this by solving a Project Euler problem, specifically problem 67. For those too lazy to click a link, the problem is to find the maximal sum of paths from the apex to the base of a triangular grid of numbers. For example, consider the following triangle:

   3
  7 4
 2 4 6
8 5 9 3

Here the maximal path is 3,7,4,9, whose sum is 23. In the problem, we are provided with a file containing a triangle with 100 rows (2^{99} paths!) and are asked to find the maximal path.

First, we recognize a trick. Suppose that by travelling the optimal route in the triangle above, we arrived in the second to last row at the number 2. Then we would know precisely how to continue: simply choose the larger of the two next values. We may reduce this triangle to the following:

      3
   7     4
 10   13   15

where we replace the second-to-last row with the sum of the entries and the larger of the two possible subsequent steps. Now, performing the reduction again on this reduced triangle, we get

   3
20   19

And performing the reduction one last time, we arrive at our final, maximal answer of 23.

All we need to do now is translate this into a sequence of maps, folds, and filters.

First, we need to be able to select the “maximum of pairs” of a given row. To do this, we convert a row into a list of successive pairs. Considering the intended audience, this is a rather complicated fold operation, and certainly the hardest part of the whole problem. We will let the reader investigate the code to understand it.

;; row->pairs: list of numbers -> list of successive pairs of numbers
(define (row->pairs row)
  (if (equal? (length row) 1)
      (list row)
      (let ([first-pair (list (list (first row) (second row)))]
            [remaining-row (rest (rest row))]
            [fold-function (λ (so-far next)
                             (let ([prev-pair (first so-far)])
                               (cons (list (second prev-pair) next) so-far)))])
        (reverse (fold fold-function first-pair remaining-row)))))

All we will say is that we need to change the base case so that it is the first pair we want, and then apply the fold to the remaining elements of the row. This turns a list like ‘(1 2 3 4) into ‘((1 2) (2 3) (3 4)).

Next, we need to be able to determine which of these pairs in a given row are maximal. This is a simple map, which we extend to work not just with pairs, but with lists of any size:

;; maxes: list of lists of numbers -> list of maxes of each list
(define (maxes pairs)
  (map (λ (lst) (apply max lst)) pairs))

Next, we combine the two operations into a “reduce” function:

;; reduce: combine maxes with row->pairs
(define (reduce row) (maxes (row->pairs row)))

Finally, we have the bulk of the algorithm. We need a helper “zip” function:

;; zip: list of lists -> list
;; turn something like '((1 2 3) (5 6 7)) into '((1 5) (2 6) (3 7))
(define (zip lists)
  (if (empty? (first lists)) empty
      (cons (map first lists)
            (zip (map rest lists))))) 

and the function which computes the maximal path, which is just a nice fold. The main bit of logic is highlighted.

;; max-path: list of rows -> number
;; takes the upside-down triangle and returns the max path
(define (max-path triangle)
  (fold (λ (cumulative-maxes new-row)
          (reduce (map sum (zip (list new-row cumulative-maxes)))))
        (make-list (length (first triangle)) 0)
        triangle))

In particular, given the previously computed maxima, and a new row, we combine the two rows by adding the two lists together element-wise (mapping sum over a zipped list), and then we reduce the result. The initial value provided to fold is an appropriately-sized list of zeros, and the rest falls through.

With an extra bit of code to read in the big input file, we compute the answer to be 7273, and eat a cookie.

Of course, we split this problem up into much smaller pieces just to show how map and fold are used. Functions like zip are usually built in to languages (though I haven’t found an analogue in Racket through the docs), and the maxes function would likely not be separated from the rest. The point is: the code is short without sacrificing modularity, readability, or extensibility. In fact, most of the algorithm we came up with translates directly to code! If we were to try the same thing in an imperative language, we would likely store everything in an array with pointers floating around, and our heads would spin with index shenanigans. Yet none of that has anything to do with the actual algorithm!

And that is why functional programming is beautiful.

As usual, the entire source code for the examples in this post is available on this blog’s Github page.

Until next time!

Addendum: Consider, if you will, the following solutions from others who solved the same problem on Project Euler:

C/C++:

#define numlines 100

int main()
{

  	int** lines = new int*[numlines];
  	int linelen[numlines];

  	for(int i=0; i<numlines; i++) linelen[i] = i+1;

   	ifstream in_triangle("triangle.txt");

   	// read in the textfile
   	for (int i=0;i<numlines;i++)
   	{
   		lines[i] = new int[i+1];
   		linelen[i] = i+1;
   		for (int j=0; j<i+1; j++)
   		{
       		in_triangle>>lines[i][j];
       	}
       	in_triangle.ignore(1,'\n');
   	}

   	int routes1[numlines];
   	int routes2[numlines];

   	for (int i=0;i<numlines;i++) routes1[i] = lines[numlines-1][i];

   	//find the best way
   	for (int i=numlines-2;i>=0;i--)
   	{
     	for(int j=0;j<i+1;j++)
   		{
   			if(routes1[j] > routes1[j+1])
   			{
      			routes2[j] = routes1[j] + lines[i][j];
            } else
            {
            	routes2[j] = routes1[j+1] + lines[i][j];
            }
     	}
     	for (int i=0;i<numlines;i++) routes1[i] = routes2[i];
   	}
   	cout<<"the sum is: "<<routes1[0]<<endl;
}

PHP:

<?php
 $cont = file_get_contents("triangle.txt");
 $lines = explode("\n",$cont);
 $bRow = explode(" ",$lines[count($lines)-1]);
 for($i=count($lines)-1; $i>0; $i--)
 {
   $tRow = explode(" ",$lines[$i-1]);
   for($j=0; $j<count($tRow); $j++)
   {
      if($bRow[$j] > $bRow[$j+1])
         $tRow[$j] += $bRow[$j];
      else
         $tRow[$j] += $bRow[$j+1];
   }
   if($i==1)
    echo $tRow[0];
   $bRow = $tRow;
 }
?>

J (We admit to have no idea what is going in programs written in J):

{. (+ 0: ,~ 2: >./\ ])/ m

Python:

import sys

l = [[0] + [int(x) for x in line.split()] + [0] for line in sys.stdin]

for i in range(1, len(l)):
    for j in range(1, i+2):
        l[i][j] += max(l[i-1][j-1], l[i-1][j])
print max(l[-1])

Haskell:

 module Main where
import IO

main = do
  triangle <- openFile "triangle.txt" ReadMode
              >>= hGetContents
  print . foldr1 reduce .
      map ((map read) . words) . lines $ triangle
    where reduce a b = zipWith (+) a (zipWith max b (tail b)) 

Ah, foldr! map! zip! Haskell is clearly functional 🙂 Now there is a lot more going on here (currying, function composition, IO monads) that is far beyond the scope of this post, and admittedly it could be written more clearly, but the algorithm is essentially the same as what we have.

Chai – Environments and Variables

As usual, we encourage the reader to follow along with our online interpreter for this version of Chai. In addition, the source code for this version of Chai is available on this blog’s Google Code page. So go ahead: experiment, discover, and play!

The Dr. Racket programming environment (not to scale, color, font, or positioning) 🙂

Variables, Variables, Variables!

A land without variables would be quite ugly. Runtimes would be sluggish, superfluous arithmetic expressions would litter every corner, and the architects (we coders) would be quite melancholy. Want to reuse a value or function? Forget about it. Want to inspect a randomly generated number more than once? Yeah right. Want to have code that someone else can actually read? Dream on.

Luckily, we’re on an exodus to a better place. A land of loops, first-class functions, and all sorts of other neat language forms. Let’s begin by implementing variables and variable references.

Substitution, Schmubstitution

The first naive approach to variable references would go something like this: whenever the programmer binds a variable, say “x” is bound to 7, the interpreter goes ahead and replaces every occurrence of the symbol “x” with the value 7. So a (pseudocode) program like

bind x to 7
bind y to 8
x + y

would first evaluate to

bind y to 8
7 + y

and then to

7 + 8

and finally result in 15. Of course, this is fine for simple arithmetic expressions, but what if we want to write a program like

bind x to 1
print x
bind x to (x + 1)
print x
...

This kind of thing happens all the time in (procedural) programs, specifically when we’re looping over a counting variable. With our current substitution method, this program would evaluate (with a grain of sense) to something counter-intuitive:

print 1
bind x to (1 + 1)
print 1
...

This is obviously not what the programmer has in mind. So what we really want with a variable is to capture the current value of a variable at the time it is referenced. In other words, we want to keep track of the scope of our variables. Plainly speaking, a variable’s scope is the part of the program where its values may be accessed. Usually, and for the remainder of this post, such “parts of programs” will be nested expressions. Note that we will go into more depth about scope in future posts, in particular the difference between dynamic scope and lexical scope, but for now the difference is moot, because our programs consist of but a single expression.

Of course there are ways to avoid conflicting substitutions, but they are more work than the solution we will arrive at anyway. So let’s skip banal direct substitution method and go straight to a more civilized discussion.

Environments

Environments will be our alternative to substitution. An environment is simply a mapping from variable names to values which is updated as a program runs and new variable bindings are introduced. In other words, as our program evaluates it has a new piece of data floating around with it. Returning to our problematic program above, it might evaluate as follows:

bind x to 1
print x
bind x to (x + 1)
print x

In the first step, we add “x” to the environment and give it a value of one:

[environment: x -> 1]
print x
bind x to (x+1)
print x

When the print statement is reached, the interpreter looks up the appropriate value of x in the environment, and uses that value. After the subsequent bind statement, the evaluation looks like

[environment: x -> 2, x -> 1]
print x

In the environment, we simply use the first binding of x as our replacement value. The astute reader might ask: “Why bother keeping the first binding of x around at all?” This is a reasonable question in the simple pseudocode language we have here, but once we implement our syntax form for Chai, we will find virtue in this peculiar decision. In general, the argument in favor of environments will become stronger and stronger as we add more features to Chai. So without further ado, let us design the syntax form for variable bindings and references.

New Syntax: with and ref

Variable references are easy. We simply add a new language that consists of any valid identifier (here, a symbol as judged by Racket). Our EBNF syntax tree now includes the line:

expr = ...
     | variable-reference

where a variable reference is a string of characters which is recognized as an identifier by Racket. Note that for our purposes we don’t actually care about these kinds of details, and the Racket specification is fairly liberal with identifiers. This means we can use all of the nice descriptive variable names like “number->string”. The specification is more explicit when we implement the additional line of code in our AST:

[ref (id symbol?)]

For binding new variables, we will make variable scope explicit. In particular, we want the programmer to be responsible for telling the interpreter where variables are in scope. This extra burden on the programmer is not so bad. In most languages, the scope of a variable is determined by a set of rules, and here we just trade memorization for explicitness and ease of writing the interpreter. So now we introduce the “with” form, which binds a single variable and describes its scope. Its EBNF:

expr = ...
     | (with variable-name expr expr)

The first sub-expression corresponds to the value of the new variable, and the second sub-expression is the “body” of the with. In particular, the result of the “with” expression is the result of the body expression, after adding these new bindings. In our interpreter, the body will be the only place that references to this particular variable are granted. For example,

(with x (+ 3 4) (* x x))

evaluates to 49, but the following program, while well-formed, might sensibly raise an error upon evaluation:

(with x 2 (with y 3 z))

Since “z” is a free variable. Here the environment only contains “x” and “y”, and “z” is unbound. Another language designer (say, a designer of Mathematica) might reasonably wish to run programs with free variables evaluating trivially to themselves. Unfortunately, doing this right requires a lot of overhead which we are simply not equipped to handle yet. To a mathematician, for instance, the program “(+ 1 (+ 2 z))” should simplify to “(+ 3 z)”, but our primitive expressions are not set up to allow for simplification. There’s a reason Mathematica is so expensive! So we will take the easy route and raise an error whenever we encounter an unbound variable.

In Racket, our abstract syntax tree now includes the line:

[with (var symbol?) (bound-expr chai-expr?) (body chai-expr?)]

Note that as our language becomes more complex, we can make these language forms more convenient for the programmer. For instance, we may want to bind any number of expressions in a single “with,” instead of nesting an expression for each (way too much typing). This will come in due time. But first, let’s translate all of our ideas into code.

The Parser

Our entire abstract syntax tree now looks like this:

;; a chai expression, as defined in our blog post
(define-type chai-expr
  [num (val number?)]
  [bool (val boolean?)]
  [prim (id symbol?) (args list?)]
  [ref (id symbol?)]
  [with (var symbol?) (bound-expr chai-expr?) (body chai-expr?)])

Implementing this in our parser is straightforward, but we have some structural changes from last time:

;; parse:unsupported: s-expr -> error
;; raise an error for unsupported language forms
(define (parse:unsupported input-expr)
  (error (string-append "parse: unsupported language form: "
                        (s-expr->string input-expr))))

;; parse-expr: s-expr -> chai-expr
;; parse an s-expression into a chai expression, or
;;  throw an error if it is not well-formed
(define (parse-expr input-expr)
  (cond [(number? input-expr) (num input-expr)]
        [(eq? 'true input-expr) (bool #t)]
        [(eq? 'false input-expr) (bool #f)]
        [(symbol? input-expr) (ref input-expr)]
        [(list? input-expr)
         (let ([first-sym (first input-expr)]
               [list-len (length input-expr)])
           (cond [(primitive-symbol? first-sym)
                  (prim first-sym (map parse-expr (rest input-expr)))]
                 [(and (equal? 'with first-sym) (equal? list-len 4))
                  (let ([id (second input-expr)]
                        [bound-expr (parse-expr (third input-expr))]
                        [body-expr (parse-expr (fourth input-expr))])
                    (if (symbol? id) (with id bound-expr body-expr)
                        (error 'parse "bad variable name: ~a" id)))]
                 [else (parse:unsupported input-expr)]))]
        [else (parse:unsupported input-expr)]))

Where the highlighted lines are new. Since we tire of writing (list? input-expr) and (first input-expr), etc., we extract those pieces first (anticipating we may need them for future languagge forms), and then do our usual check for a primitive operation. Finally, we recursively parse the “third” and “fourth” expressions, the binding expression and body expression, respectively, and pack them into a “with” type, raising an error if the variable which we’re trying to bind is not a symbol. For instance, we certainly wouldn’t want to allow the user to rebind things like numbers or parentheses. Though that’s not actually possible with this implementation, we should sensibly signal an error in the parser, since such an attempt is not well-formed.

With the relevant, added test cases, we see our parser is correct, and we turn to the interpreter.

The Interpreter

Now that we need to carry around an environment, we’d like to change our interpreter’s signature to:

;; interp: chai-expr environment -> chai-expr
;; interpret a chai-expression
(define (interp input-expr environment)
   ... )

This raises the obvious question: how are we going to represent an environment? There are many ways, but in particular we don’t want to bind ourselves to a particular implementation (no pun intended). So before we get to the interpreter, let’s develop a “chai-environment.rkt” module which provides an interface for manipulating environments.

This interface should provide functions to create a new set of bindings, add a binding to a list of bindings, and lookup a binding by its identifier. This corresponds neatly to three functions:

;; new-environment: -> environment
;; create a new, empty environment
(define (new-environment) ... )

;; add-binding: environment symbol any -> environment
;; add a new binding to the environment
(define (add-binding old-env id val) ... )

;; lookup-binding: environment id -> any
;; lookup a binding, raising an error if the requested binding is not found.
(define (lookup-binding envrionment sought-id) ... )

We will again use define-type to create a data-type for bindings, and type-case to access them.

;; the bindings type, a linked list of binding values
(define-type environment
  [binding (id symbol?) (value any/c) (rest environment?)]
  [empty-env])

So an “environment” is just the head of a list, containing both a key/value pair, and a list of successive bindings. Implementing the above functions is now easy. For the first two, we just have

;; new-environment: -> environment
;; create a new, empty environment
(define (new-environment)
  (empty-env))

;; add-binding: environment symbol any -> environment
;; add a new binding to the environment
(define (add-binding old-env id val)
  (binding id val old-env))

We just invoke the appropriate constructor, and in the future if we decide to change our environment structure (say, look things up with hash-tables), the rest of our code is unaffected by the change.

Finally, the look-up function requires actually requires a little bit of logic:

;; lookup-binding: environment id -> any
;; lookup a binding, throwing an error if the requested
;;  binding is not found.
(define (lookup-binding env sought-id)
  (type-case environment env
    [empty-env () (error 'lookup-binding
                         "id ~a is unbound in the environment"
                         sought-id)]
    [binding (id val rest-env) (if (eq? id sought-id) val
                               (lookup-binding rest-env sought-id))]))

In particular, we recursively search through the list of bindings, comparing each id to the sought-id, and returning the associated value if it is found. Otherwise, we raise an error.

Bringing this back to our interpreter, we need to alter our interp function to accept a new parameter:

;; interp: chai-expr environment -> chai-expr
;; interpret a chai-expression
(define (interp input-expr env) ... )

Now, when we see a variable reference, we just use our lookup function to extract the correct value. In the type-case, this adds the new line:

[ref (id) (lookup-binding env id)]

And finally, the with case looks like

[with (var bound-expr body-expr)
      (let ([value (interp bound-expr env)])
        (interp body-expr (add-binding env var value)))]

where we recursively interpret the new binding, and then interpret the body expression with an augmented environment containing the new variable reference.

So, our entire interp function is as follows:

;; interp: chai-expr environment -> chai-expr
;; interpret a chai-expression
(define (interp input-expr env)
  (type-case chai-expr input-expr
    [num (val) val]
    [bool (val) val]
    [ref (id) (lookup-binding env id)]
    [with (var bound-expr body-expr)
          (let ([value (interp bound-expr env)])
            (interp body-expr (add-binding env var value)))]
    [prim (id args)
          (let* ([interpreted-args (map (λ (arg) (interp arg env)) args)]
                 [operation (type-check/get-op id interpreted-args)])
            (operation interpreted-args))]))

Note that in order to maintain our other language forms (in particular, prim), we need to alter our recursive calls to interp to pass along the environment appropriately.

And that’s all there is to it! As usual, we have a host of test cases which prove the correctness of the program, and this code is available on this blog’s Google Code page. And, of course, the reader is invited to evaluate programs at this blog’s interactive online interpreter. Happy coding!

Observations, and a Peek into the Future

Before we close, we note some interesting features of Chai so far. First, we can do some weird things with variables. In particular, while we can’t yet overwrite bindings of primitive operations, we can use primitive symbols as variable names. Here is a bizarre program that actually runs:

(with + 7 (+ + +))

Technically, we haven’t “overwritten” the plus primitive operator, since it still functions as we expect it to. In the future, we may experiment with allowing the programmer to locally overshadow primitive operations with their own functions (if only just for fun).

Second, we recognize a very important choice we made in designing the with form. Specifically, in the interpreter, we first interpret the bound expression, and then bind it to the variable. The astute reader might ask, why bother? Why don’t we just leave the expression unevaluated until something references it? For instance, this would allow us to write programs like:

(with x (/ 1 0) (+ 2 3))

Which would run without error. It turns out that this choice has a name! It’s called “laziness,” and it’s used as the default in some significant programming languages, including Haskell. Haskell has a number of other very interesting features, especially pertaining to the study of logic and semantics. For instance, the Haskell compiler refuses to run a program unless it can prove that the program outputs the correct type at each step, and hence guarantee that the program cannot perform certain misbehaviors. We find this all quite fascinating, and add it to our lengthy list of Wonderful Things to Learn and Implement.

Until then!

Chai – Arithmetic and Organization

As you read along, why not try writing some programs and seeing how they evaluate in this version of Chai? You can do so at our live web server.

Organization

Last time, we implemented a simple interpreter for a language which exclusively computed sums of numbers. Before we continue with more interesting language forms, we need to give ourselves some meat to work with.

Specifically, we need to add more primitive operations (+, -, *, /, and, or, not, …), and modularize our files so we can keep track of things better. After we cover the new syntax forms, this post will be admittedly be technical; we’ll spend most of our efforts here designing look-up tables, and implementing simple type-checking for primitive operations. Afterward, we will be able to add new primitive operations and data types without modifying our interpreter, and only slightly modifying our parser. The point of all this is to hide the messy code which handles primitives, so that we can better focus on more interesting language forms. For those readers without a neurotic desire to organize code (and hence no desire to read the entire post), just remember that in the future, we will add new primitives without detailing the implementation. For reference, the source code is available on this blog’s Google Code page.

The first change we’d like to make is to move our abstract syntax tree to a separate file, “chai-ast.rkt.” In order to have all the generated functions (constructors and field accessors) available for the rest of our files to use, we need a special “provide” command:

(provide (all-defined-out))
(require plai)

;; a chai-expression
(define-type chai-expr
  [num (val number?)]
  [sum (lhs chai-expr?) (rhs chai-expr?)]))

So (all-defined-out) is a special function which provides all functions defined in this file. Now, instead of having separate language form for each primitive operation (say, a sum, a diff, a mult, a div, etc), we can abstract this to a “primitive” form. In other words, our “sum” line changes to this:

 [prim (id symbol?) (args list?)] 

We also recognize that we will want to include unary primitives, and ternary primitives, so we have an identifying symbol and a list of arguments. In other words, our new EBNF syntax looks like:

expr = number
     | (prim expr expr...) 

prim = + | - | * | /

Where the “expr…” notation means we allow zero or more following things which are exprs. For now, we do not have any null-ary operations (with zero arguments). Finally, we’d like to include boolean-valued expressions and boolean algebra as well, so this extends our syntax to:

expr = number
     | true
     | false
     | (prim expr expr...) 

prim = + | - | * | / | and | or | not

Note that the syntax does not keep track of and information associated with the operations, like type or arity. Syntactically, it is perfectly okay to write the program: “(not 1 2 3)”. This is well-formed, because the first thing is a primitive symbol, and the other things are valid arguments. We will leave the semantics of these primitive operations to the interpreter.

Implementing all of this in our abstract syntax tree is straightforward:

;; a chai expression
(define-type chai-expr
  [num (val number?)]
  [bool (val boolean?)]
  [prim (id symbol?) (args list?)])

Now let us turn to the parser, which is about as simple as it was last time. Again, we assume that we have already converted an input string into an s-expression (via the utility functions in chai-utils.rkt).

;; parse-expr: s-expr -> chai-expr
;; parse an s-expression into a chai expression,
;;  or throw an error if it is not well-formed
(define (parse-expr input-expr)
  (cond [(number? input-expr) (num input-expr)]
        [(eq? 'true input-expr) (bool #t)]
        [(eq? 'false input-expr) (bool #f)]
        [(and (list? input-expr)
              (primitive-symbol? (first input-expr)))
         (prim (first input-expr)
               (map parse-expr (rest input-expr)))]
        [else (error (string-append
                       "parse: unsupported language form: "
                       (s-expr->string input-expr)))]))

The first three cases of the “cond” expression are trivial. In the fourth, we introduce a new function, which determines whether the first thing in our s-expression is a primitive symbol. Of course, we actually need to write this function, but before we do that let’s come up with a good way to structure a table with information about primitives.

We create a new file called “chai-primitives.rkt” just for this purpose. In it, we’ll have a look-up table with some information about primitives:

;; a helper function for the primitive operation table
(define (make-op op) (λ (arglist) (apply op arglist)))

;; primitive operation lookup-table
;; symbol -> (list op num-args (list type1? type2? ...))
(define primitives-table
  (hasheq '+ (list (make-op +) 2 (list number? number?))
          '- (list (make-op -) 2 (list number? number?))
          '* (list (make-op *) 2 (list number? number?))
          '/ (list (make-op /) 2
                   (list number?
                         (λ (x) (if (equal? 0 x)
                                    (error "division by zero!")
                                    (number? x)))))
          'or (list (λ (args) (ormap (λ (x) x) args))
                    2 (list boolean? boolean?))
          'and (list (λ (args) (andmap (λ (x) x) args))
                    2 (list boolean? boolean?))
          'not (list (make-op not) 1 (list boolean?))))

“hasheq” creates a new hash table where the “eq?” function is used to compare keys. In this case, the keys are Racket symbols, and “eq?” happens to work very well with symbols. Associated to each  symbol is a list of three things. First, the operation that will be applied to the arguments. Since each function has to be applied to a packaged list of values, we will construct the “operation” via make-op (with the exception of and and or, for quirky Racket reasons), which creates a function that applies the given operation to a list of arguments. Second, we have a number representing the allowed number of arguments; and finally, we have a list of functions. In this list, the nth function is used to determine whether the nth argument has the correct type (and they are applied in order). This allows us to package important information here, such as the condition that the second argument of division is never zero. This sort of ho-hum logic would severely clutter our interpreter.

But instead of allowing our other files to access the primitives table willy-nilly, we should create an extra layer of protection, which will make it easier for us to maintain our code in the future. Specifically, we want a function which determines if a given symbol is a key in the table, and a function to retrieve the contents of the table. We do this as follows:

;; primitive-symbol?: symbol -> boolean
;; return true if the given symbol is a primitive
(define (primitive-symbol? id)
  (hash-has-key? primitives-table id))

;; get-prim: symbol -> function number (list function)
;; lookup the data associated with a primitive
(define (get-prim id)
  (apply values (hash-ref primitives-table id)))

The bit with “(apply values … )” takes the items in the list resulting from the hash-lookup, and returns them as multiple values. Note that in Racket, functions can have multiple return values. As we will see, this decision makes our code shorter. Instead of writing code to extract the pieces of the list, we just require the caller to bind all three return values to variables at the same time. Our foresight tells us that when we are type-checking, we will need all of this information at the same time, so requiring us to bind them all is no hindrance. Providing these two functions gives us all we need to continue with our interpreter.

The actual interpreter is quite simple now. There are only three cases:

;; interp: chai-expr -> chai-expr
;; interpret a chai-expression
(define (interp input-expr)
  (type-case chai-expr input-expr
    [num (val) val]
    [bool (val) val]
    [prim (id args)
      (let* ([interped-args (map interp args)]
             [operation (type-check/get-op id interped-args)])
        (operation interpreted-args))]))

At this point, we note that at some point we will want to perform type-checking and retrieve the required primitive operation given its identifier. We also recognize that this code belongs elsewhere. So we defer the logic to a function called “type-check/get-op”, which accepts the primitive operation identifier, and the list of arguments (after they’ve themselves been interpreted!), and returns the function which may then be applied to a list of values, throwing an error if there is an incorrect number of arguments or if any has the wrong type. The rest of the logic that goes into the interpreter is quite clear.

Type Checking

We write a ghostly shell of a type-checker as follows:

;; type-check/get-op: symbol list -> function
;; check the types and number of the arguments for the primitive
;;  'id', returning the primitive operation on success, and
;;  throwing an error on failure
(define (type-check/get-op id args)
  (let-values ([(op num-args-allowed arg-types) (get-prim id)])
     op))

For starters, we just return the operation as it is. Note that let-values allows us to bind all three results of the call to “get-prim” simultaneously.

To fill in the gaps, we should count the number of arguments in the “args” variable, and check that against “num-args-allowed”:

;; type-check/get-op: symbol list -> function
(define (type-check/get-op id args)
  (let-values
      ([(num-args-received) (length args)]
       [(op num-args-allowed arg-types) (get-prim id)])
    (if (not (equal? num-args-received num-args-allowed))
        (error 'type-check
               "~a expected ~a args, but received ~a."
               id num-args-allowed num-args-received)
        op)))

Note that we need the additional parentheses around (num-args-received) in the binding clause, because let-values expects each clause to have a list of identifiers. In addition, the “error” function accepts a formatting string similar to the kind encountered in C’s printf function. However, here the “~a” variable allows one to write out whatever value one wants. The details of Racket’s flavor of formatted strings is in the documentation here. All we need to know here is that “~a” works for strings and numbers.

Now that we’ve checked that there are the correct number of arguments, let us ensure they have the right type. To do this, we simply apply each element of “arg-types” (a list of boolean-valued functions) to the corresponding element of “args”:

(andmap (λ (a b) (a b)) arg-types args)

Here “(λ (a b) (a b))” is an anonymous function which accepts two arguments, and applies the first argument as a function call on the second argument. If we “map” this lambda expression over the list of arg-types and args, we will get a list of booleans representing whether the arguments have the correct types! Going slightly further, “andmap” performs such a “map,” and returns true if and only if each value in the resulting map is true (probably short-cutting by terminating upon finding a single false value). Thus, this expression is true precisely when the arguments type-check!

Putting this back into our type-check/get-op function, we are finished:

;; type-check/get-op: symbol list -> function
(define (type-check/get-op id args)
  (let-values ([(num-args-received) (length args)]
               [(op num-args-allowed types) (get-prim id)])
    (if (not (equal? num-args-received num-args-allowed))
        (error 'type-check
               "~a expected ~a args, but received ~a."
               id num-args-allowed num-args-received)
        (let ([good-args? (andmap (λ (a b) (a b)) types args)])
          (if good-args?
              op
              (error 'type-check
      "one of the arguments to ~a has the wrong type: ~a"
      id (string-join (map chai-expr->string args) ", ")))))))

[We apologize for the sloppy indenting in the last two lines; it is better than forcing the reader to scroll sideways. For a more pleasant indentation scheme (with colors, parenthesis matching, and more!), view the file in DrRacket, the standard Racket IDE.]

So that’s it! Note now that we may add in new primitive operations without changing any of the code in the parser, type-checker, or interpreter! From now on, when we need a new primitive, we will simply add it and mention its name. We will not bother the reader with its implementation, because it is nothing more than another entry in the look-up table, which is out of sight and out of mind.

Adequately many test cases are provided in the source code, so that the user may verify the interpreter’s correctness. Of course, as we repeat time and time again, the full source code for this post is available on this blog’s Google Code page. Additionally, we have a server accepting and evaluating your every program! Just amble right on over to the Chai Interpreters page to give it a swing.

Next time,  we’ll get started with the first big consideration in our language: how to handle scope. We’ll introduce a language form that binds values to variables, and come up with a system for referencing them. Until then!