Programming using exceptions

The goal of this lecture note is to illustrate how to declare and to catch exceptions.

Resources

Catching exceptions in OCaml

By default, once it is raised, an exception bubbles all the way up to the toplevel. OCaml, however, offers a way to catch an exception on its way up, using a match-style expression with the keyword try:

<expression> ::= ...
               | try <expression> with
                 | name {name}* -> <expression>
                 | ...

When evaluating, e.g., try e with | Division_by_zero -> e', three things may happen:

  • evaluating e yields a value; then this value is the result of evaluating this try-expression;
  • evaluating e raises the Division_by_zero exception; then the result of evaluating this try-expression is the result of evaluating e';
  • evaluating e raises another exception; then evaluating this try-expression does not yield any result; instead, the other exception continues to be raised.

A first example

We can test whether a given string is empty by indexing its first character: if the result is a character, the string was nonempty, and if an exception is raised, the string was empty:

# String.get "abc" 0;;
- : char = 'a'
# String.get "" 0;;
Exception: Invalid_argument "index out of bounds".
#

Concretely:

let string_emptyp s =
  try let _ = String.get s 0
      in false
  with
  | Invalid_argument "index out of bounds" ->
     true;;

The accompanying file contains a unit-test function and two other solutions.

Exercise 16

The goal of this exercise is to implement three versions of an OCaml function singleton_stringp : string -> bool that tests whether its input string is a singleton string, i.e., contains exactly one character:

let test_singleton_stringp candidate =
  let b0 = (candidate "" = false)
  and b1 = (candidate "a" = true)
  and b2 = (candidate "ab" = false)
  and b3 = (candidate "abc" = false)
  in b0 && b1 && b2 && b3;;
  1. The first version should use String.length.
  2. The second version should use String.map.
  3. The third version should use the topic of the present chapter.

Exercise 17

The goal of this exercise is to implement three versions of an OCaml function two_characters_stringp : string -> bool that tests whether its input string contains exactly two characters:

let test_two_characters_stringp candidate =
  let b0 = (candidate "" = false)
  and b1 = (candidate "a" = false)
  and b2 = (candidate "ab" = true)
  and b3 = (candidate "abc" = false)
  and b4 = (candidate "abcd" = false)
  in b0 && b1 && b2 && b3 && b4;;
  1. The first version should use String.length.
  2. The second version should use String.map.
  3. The third version should use the topic of the present chapter.

Exercise 18

The present exercise generalizes the two previous exercises. Its goal is to implement several versions of an OCaml function string_lengthp : string -> int -> bool that, given a string and given an integer, tests whether this integer is the length of this string.

A second example

The following predicate tests whether applying a given zero-ary function raises an exception:

let raises_an_exception f =
  try let _ = f ()
      in false
  with
  | _ ->
    true;;

Given a zero-ary function f, this function applies it to the unit value:

  • if no exception is raised, the result of applying f is ignored and false is returned;
  • if an exception is raised, it is intercepted by the catch-all clause in the try-expression and true is returned.

To wit:

# raises_an_exception (fun () -> 42);;
- : bool = false
# raises_an_exception (fun () -> 42 / 0);;
- : bool = true
#

Interlude

Anton: How about we write a unit-test function for raises_an_exception?

Alfrothul: Uh oh.

Anton: Got cold feet?

Alfrothul (sighing): No, no. But at what type?

Anton: At what type?

Alfrothul: OCaml is not infinitely polymorphic. It will only let us test functions that have a monomorphic type. So we will have to write several unit-test functions, one for each type.

Anton (optimistic): Fine. Let’s say that the codomain of the candidate function is bool.

Alfrothul (sitting at the keyboard): Here we go:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : (unit -> bool) -> bool *)

  (* etc. *);;

Alfrothul: Now what?

Anton (grabbing the keyboard): Now we piggy-back on the examples above. Look:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : (unit -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  (* etc. *)
  in b0 && b1;;

Alfrothul: OK. Applying the candidate function to a function that does not raise an exception should yield false. Huh, can we try it?

Anton (generously): Of course. C-c C-e:

# let test_raises_an_exception candidate =
 (* test_raises_an_exception : (unit -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  (* etc. *)
  in b0 && b1;;
val test_raises_an_exception : ((unit -> bool) -> bool) -> bool = <fun>
#

Alfrothul: Wait a second.

Anton (surprised): Why?

Alfrothul: Did you see the type of test_raises_an_exception?

Anton: Oh.

Alfrothul: It is not (unit -> bool) -> bool at all.

Anton (after a moment of thought): And it shouldn’t be. The type of the candidate function is not unit -> bool: the candidate function is a function that expects a function and returns a Boolean.

Alfrothul: Right. And it is the expected function that has the type unit -> bool. That is why the domain of the candidate function is unit -> bool.

Anton: Yes. And then we test its result, which is why the codomain of the candidate function is bool. Let me fix the comment:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : ((unit -> bool) -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  (* etc. *)
  in b0 && b1;;

Alfrothul: Thanks.

Anton: Now let’s try it:

# test_raises_an_exception raises_an_exception;;
- : bool = true
#

Anton (beatifically): To see is to boolieve.

Alfrothul: That behooves, I guess. Anyway, whether the given function returns true or false, the test succeeds?

Anton: Yes. The point is that this function does not raise an exception.

Alfrothul: OK.

Anton (getting back to business): And applying the candidate function to a function that does raise an exception should yield true:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : ((unit -> bool) -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  and b2 = (candidate (fun () -> raise (Failure "is not an option")) = true)
  (* etc. *)
  in b0 && b1 && b2;;

Anton: Look:

# test_raises_an_exception raises_an_exception;;
- : bool = true
#

Alfrothul: Right. And applying the candidate function to a function that calls another function that raises an exception should also yield true.

Anton (typing busily): Yup:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : ((unit -> bool) -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  and b2 = (candidate (fun () -> raise (Failure "is not an option")) = true)
  and b3 = (candidate (fun () -> 1/0) = true)
  (* etc. *)
  in b0 && b1 && b2 && b3;;

Alfrothul: That one won’t work.

Anton: It won’t?

Alfrothul (soberly): Not the right type. The domain of the supplied function needs to be bool, and here it is int. OCaml is not infinitely polymorphic.

Anton (easy): Sure thing:

let test_raises_an_exception candidate =
 (* test_raises_an_exception : ((unit -> bool) -> bool) -> bool *)
  let b0 = (candidate (fun () -> true) = false)
  and b1 = (candidate (fun () -> false) = false)
  and b2 = (candidate (fun () -> raise (Failure "is not an option")) = true)
  and b3 = (candidate (fun () -> 1/0 = 0/1) = true)
  (* etc. *)
  in b0 && b1 && b2 && b3;;

Anton: Happy now?

Alfrothul: Such as one can be, yes. Can we test it?

Anton: Of course:

# test_raises_an_exception raises_an_exception;;
- : bool = true
#

Anton: See? That wasn’t so hard.

Alfrothul: But what if the candidate function itself raises an exception?

Anton: Uh oh.

Alfrothul (whisking the keyboard): Let me try:

# test_raises_an_exception (fun f -> raise (Failure "is not an option"));;
Exception: Failure "is not an option".
#

Anton: Oops.

Alfrothul: We just need to make the unit-test function robust. If the candidate function raises an exception, then the test should fail. Look:

let robust_test_raises_an_exception candidate =
 (* robust_test_raises_an_exception : ((unit -> bool) -> bool) -> bool *)
  let b0 = (try candidate (fun () -> true) = false with
            | _ ->
               false)
  and b1 = (try candidate (fun () -> false) = false with
            | _ ->
               false)
  and b2 = (try candidate (fun () -> raise (Failure "is not an option")) = true with
            | _ ->
               false)
  and b3 = (try candidate (fun () -> 1/0 = 0/1) = true with
            | _ ->
               false)
  (* etc. *)
  in b0 && b1 && b2 && b3;;

Anton: Right. We intercept any exception raised by the candidate function and then return false because the unit test should fail.

Alfrothul: Yes. And otherwise it is business as usual: if the candidate function yields a result, we verify whether it is the expected result:

# robust_test_raises_an_exception raises_an_exception;;
- : bool = true
# robust_test_raises_an_exception (fun f -> raise (Failure "is not an option"));;
- : bool = false
#

Jacques Clouseau: Ahh... That. Felt. Good.

Declaring new exceptions in OCaml

Just as we can define our own types in OCaml, we can define our own exceptions with the following declaration:

<declaration> ::= let <formal> = <expression>
                | type <name> = ...
                | exception <name> of <type>

There is a global data type of exceptions, and each new declaration of an exception adds a constructor to this global data type.

Application: multiplying all the integers in a binary tree

Let us revisit the data type of binary trees where the payloads are stored in the leaves:

type 'a binary_tree =
  | Leaf of 'a
  | Node of 'a binary_tree * 'a binary_tree;;

let binary_tree_fold leaf_case node_case t_given =
  let rec visit t =
    match t with
    | Leaf v ->
       leaf_case v
    | Node (t1, t2) ->
       node_case (visit t1) (visit t2)
  in visit t_given;;

The goal of this section is to implement a function that multiplies all the payloads in a binary tree of integers:

let test_mult candidate =
  let b0 = (candidate (Leaf 0)
            = 0)
  and b1 = (candidate (Leaf 1)
            = 1)
  and b2 = (candidate (Node (Leaf 1,
                             Leaf 2))
            = 2)
  and b3 = (candidate (Node (Leaf 1,
                             Node (Leaf 2,
                                   Leaf 3)))
            = 6)
  and b4 = (candidate (Node (Leaf 1,
                             Node (Leaf 2,
                                   Node (Leaf 3,
                                         Leaf 4))))
            = 24)
  and b5 = (candidate (Node (Leaf 1,
                             Node (Leaf 2,
                                   Node (Leaf 3,
                                         Node (Leaf 4,
                                               Leaf 5)))))
            = 120)
  and b6 = (candidate (Node (Leaf 1,
                             Node (Leaf 2,
                                   Node (Leaf 0,
                                         Node (Leaf 4,
                                               Leaf 5)))))
            = 0)
  (* etc. *)
  in b0 && b1 && b2 && b3 && b4 && b5 && b6;;

The implementation is that of a routine recursive descent in the given binary tree of integers:

let mult_v0 t_given =
  let rec visit t =
    match t with
    | Leaf n ->
       n
    | Node (t1, t2) ->
       visit t1 * visit t2
  in visit t_given;;

Here is the logic of visit:

  • For any tree t, the application of visit to t completes and yields an integer, the result of multiplying all integers in t.
  • So the initial application of visit to the given tree yields the result of multiplying all the integers in this tree.

Since this implementation is structurally recursive, it can be expressed using binary_tree_fold:

let mult_v1 t_given =
  binary_tree_fold (fun n -> n) ( * ) t_given;;

Both implementations pass the unit tests:

# test_mult mult_v0;;
- : bool = true
# test_mult mult_v1;;
- : bool = true
#

The straightforward recursive descent is visualized as follows, using the traced version of mult_v0 found in the accompanying file:

# traced_mult_v0 show_int (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5)));;
mult_v0 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
  visit (Node (Leaf 4, Leaf 5)) ->
    visit (Leaf 5) ->
    visit (Leaf 5) <- 5
    visit (Leaf 4) ->
    visit (Leaf 4) <- 4
  visit (Node (Leaf 4, Leaf 5)) <- 20
  visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) ->
    visit (Node (Leaf 3, Leaf 0)) ->
      visit (Leaf 0) ->
      visit (Leaf 0) <- 0
      visit (Leaf 3) ->
      visit (Leaf 3) <- 3
    visit (Node (Leaf 3, Leaf 0)) <- 0
    visit (Node (Leaf 1, Leaf 2)) ->
      visit (Leaf 2) ->
      visit (Leaf 2) <- 2
      visit (Leaf 1) ->
      visit (Leaf 1) <- 1
    visit (Node (Leaf 1, Leaf 2)) <- 2
  visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) <- 0
visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- 0
mult_v0 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- 0
- : int = 0
#

However, since 0 is absorbing for multiplication, if visit encounters 0 in a leaf – as is the case just above – there is no need to pursue the computation, and that is a job for the option type:

let mult_v2 t_given =
  let rec visit t =
    match t with
    | Leaf n ->
       if n = 0
       then None
       else Some n
    | Node (t1, t2) ->
       match visit t2 with
       | None ->
          None
       | Some p2 ->
          match visit t1 with
          | None ->
             None
          | Some p1 ->
             Some (p1 * p2)
  in match visit t_given with
     | None ->
        0
     | Some p ->
        p;;

Here is the logic of visit:

  • For any tree t, the application of visit to t yields Some n, where n is the result of multiplying all integers in t if none of them is 0. Otherwise, if t contains 0, the application of visit to t yields None.
  • At the outset, if the result of applying visit to the given tree is Some n, the result of applying mult_v2 to this tree is n, the result of multiplying all integers in the given tree. Otherwise, if the result of applying visit to the given tree is None, the result of applying mult_v2 to this tree is 0, the result of multiplying all integers in the given tree since 0 is absorbing for multiplication.

Since this implementation is structurally recursive, it can be expressed using binary_tree_fold:

let mult_v3 t =
  match binary_tree_fold (fun n ->
                           if n = 0
                           then None
                           else Some n)
                         (fun ih1 ih2 ->
                           match ih1 with
                           | None ->
                              None
                           | Some p1 ->
                              match ih2 with
                              | None ->
                                 None
                              | Some p2 ->
                                 Some (p1 * p2))
                         t_given
  with
  | None ->
     0
  | Some p ->
     p;;

Both implementations pass the unit tests:

# test_mult mult_v2;;
- : bool = true
# test_mult mult_v3;;
- : bool = true
#

That the traversal is interrupted once 0 is encountered is visualized as follows, using the traced version of mult_v2 found in the accompanying file:

# traced_mult_v2 show_int (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5)));;
mult_v1 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
  visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
    visit (Node (Leaf 4, Leaf 5)) ->
      visit (Leaf 5) ->
      visit (Leaf 5) <- Some 5
      visit (Leaf 4) ->
      visit (Leaf 4) <- Some 4
    visit (Node (Leaf 4, Leaf 5)) <- Some 20
    visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) ->
      visit (Node (Leaf 3, Leaf 0)) ->
        visit (Leaf 0) ->
        visit (Leaf 0) <- None
      visit (Node (Leaf 3, Leaf 0)) <- None
    visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) <- None
  visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- None
mult_v2 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- 0
- : int = 0
#

However, the option type gets in the way at every stage of the recursive computation. It can be bypassed by using a continuation – a function representing the rest of the computation – instead:

let mult_v4 t_given =
  let rec visit t k =
    match t with
    | Leaf n ->
       if n = 0
       then 0
       else k n
    | Node (t1, t2) ->
       visit t2 (fun p2 ->
                  visit t1 (fun p1 ->
                             k (p1 * p2)))
  in visit t_given (fun p -> p);;

Here is the logic of visit:

  • For any given tree t and for any continuation k, when visit is applied to t and k, k is only applied if t does not contain 0. Otherwise, 0 is returned, the result of multiplying all the integers in t since 0 is absorbing for multiplication.
  • So if the given tree contains 0, 0 is returned. And if the given tree does not contain 0, the initial continuation (i.e., the second argument in the initial call to visit) is applied to the non-zero result of multiplying all the integers in the given tree and returns this result.

To conform with OCaml’s right-to-left evaluation of subexpressions, each right subtree t2 is traversed (using a tail call to visit) before the corresponding left subtree t1 (which is traversed using another tail call to visit).

This implementation passes the unit tests:

# test_mult mult_v4;;
- : bool = true
#

That the traversal is carried out iteratively is visualized as follows, using the traced version of mult_v4 found in the accompanying file:

# traced_mult_v4 show_int (Node (Node (Leaf 2, Leaf 3), Leaf 4));;
mult_v4 (Node (Node (Leaf 2, Leaf 3), Leaf 4)) ->
visit (Node (Node (Leaf 2, Leaf 3), Leaf 4)) (fun p -> ...) ->
visit (Leaf 4) (fun p -> ...) ->
continuing with 4 after Leaf 4 ->
visit (Node (Leaf 2, Leaf 3)) (fun p -> ...) ->
visit (Leaf 3) (fun p -> ...) ->
continuing with 3 after Leaf 3 ->
visit (Leaf 2) (fun p -> ...) ->
continuing with 2 after Leaf 2 ->
continuing with 6 after Node (Leaf 2, Leaf 3) ->
and completing with 24 after Node (Node (Leaf 2, Leaf 3), Leaf 4) ->
mult_v4 (Node (Node (Leaf 2, Leaf 3), Leaf 4)) <- 24
- : int = 24
#

And that the iterative traversal is interrupted once 0 is encountered is visualized as follows:

# traced_mult_v4 show_int (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5)));;
mult_v4 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) (fun p -> ...) ->
visit (Node (Leaf 4, Leaf 5)) (fun p -> ...) ->
visit (Leaf 5) (fun p -> ...) ->
continuing with 5 after Leaf 5 ->
visit (Leaf 4) (fun p -> ...) ->
continuing with 4 after Leaf 4 ->
continuing with 20 after Node (Leaf 4, Leaf 5) ->
visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) (fun p -> ...) ->
visit (Node (Leaf 3, Leaf 0)) (fun p -> ...) ->
visit (Leaf 0) (fun p -> ...) ->
and stopping with 0
mult_v4 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- 0
- : int = 0
#

Exercise 19

Since mult_v4 is structurally recursive, express it using binary_tree_fold.

Solution for Exercise 19

Anton: Well, we can’t do that.

Alfrothul: We can’t?

Anton: Well, yes – visit takes two arguments, and binary_tree_fold abstracts recursive functions that only take one argument, look:

let mult_v4 t_given =
  let rec visit t k =
    match t with
    | Leaf n ->
       if n = 0
       then 0
       else k n
    | Node (t1, t2) ->
       visit t2 (fun p2 ->
                  visit t1 (fun p1 ->
                             k (p1 * p2)))
  in visit t_given (fun p -> p);;

let binary_tree_fold leaf_case node_case t_given =
  let rec visit t =
    match t with
    | Leaf v ->
       leaf_case v
    | Node (t1, t2) ->
       node_case (visit t1) (visit t2)
  in visit t_given;;

Alfrothul: Right. But OCaml’s syntactic sugar is hiding the cake.

Halcyon: Good, more culinary metaphors. Can we skip to the part where we get the cake too?

Alfrothul: No, I am actually serious: let foo x1 x2 = ... is syntactic sugar for let foo = fun x1 -> fun x2 -> ....

Anton: OK, middle path, here we come:

let mult_v4b t_given =
  let rec visit t = fun k ->
    match t with
    | Leaf n ->
       if n = 0
       then 0
       else k n
    | Node (t1, t2) ->
       visit t2 (fun p2 ->
                  visit t1 (fun p1 ->
                             k (p1 * p2)))
  in visit t_given (fun p -> p);;

Alfrothul: Right. Now, remember the solution for Exercise 37 in Week 09?

Anton: You mean “S-m-n theorem much, Dana?”, right. Tu quoque, Alfrothul?

Alfrothul: A little bit, maybe, but k and t are distinct, so rather than waiting for fun k -> match t with ... to be applied, we might as well test t right away, and return fun k -> ... in each clause.

Anton: So you want to commute the function abstraction and the conditional expression?

Pablito: Clearly yes, so let’s skip the dilly-dally:

let mult_v4c t_given =
  let rec visit t =
    match t with
    | Leaf n ->
       (fun k ->
         if n = 0
         then 0
         else k n)
    | Node (t1, t2) ->
       (fun k ->
         visit t2 (fun p2 ->
                    visit t1 (fun p1 ->
                               k (p1 * p2))))
  in visit t_given (fun p -> p);;

Alfrothul: See how visit now takes one argument, just as in the definition of binary_tree_fold?

Anton: Yes. Still, the initial call to visit still takes two arguments, not one, so the definition of mult_v4c is still not fold-ready.

Dana: But that could be arranged.

Anton: How so?

Dana: Well, the letrec-expression and the application in its body could also be commuted.

Alfrothul: Ah, you mean that let rec ... in e0 e1 and (let rec ... in e0) e1 are observationally equivalent?

Dana: If no names in e1 are declared in the let-expression, yes.

Pablito: That’s the case here, look:

let mult_v4d t_given =
  (let rec visit t =
     match t with
     | Leaf n ->
        (fun k ->
          if n = 0
          then 0
          else k n)
     | Node (t1, t2) ->
        (fun k ->
          visit t2 (fun p2 ->
                     visit t1 (fun p1 ->
                                k (p1 * p2))))
   in visit t_given) (fun p -> p);;

Anton: OK, we are in business:

let mult_v4e t_given =
  (binary_tree_fold
     (fun n k ->
       if n = 0
       then 0
       else k n)
     (fun ih1 ih2 ->
       fun k ->
        ih2 (fun p2 ->
              ih1 (fun p1 ->
                    k (p1 * p2))))
     t_given) (fun p -> p);;

Pablito: Let me spread some more syntactic sugar in the second argument of binary_tree_fold:

let mult_v4f t_given =
  (binary_tree_fold
     (fun n k ->
       if n = 0
       then 0
       else k n)
     (fun ih1 ih2 k ->
       ih2 (fun p2 ->
             ih1 (fun p1 ->
                   k (p1 * p2))))
     t_given) (fun p -> p);;

Dana: And since application associates to the left...

Alfrothul: Good times.

Dana: ...we don’t need the parentheses around the call to binary_tree_fold:

let mult_v4g t_given =
  binary_tree_fold
    (fun n k ->
      if n = 0
      then 0
      else k n)
    (fun ih1 ih2 k ->
      ih2 (fun p2 ->
            ih1 (fun p1 ->
                  k (p1 * p2))))
    t_given
    (fun p -> p);;

Loki: Yes, Mimer?

Mimer: Nothing, nothing.

Exercise 20

Verify that inlining the call to binary_tree_fold in the definition of mult_v4g, we fall back on our feet.

Solution for Exercise 20

Pablito: Piece of cake!

Halcyon: Sounds good, let’s have it too.

Pablito: No, seriously, look. First, we inline the call to binary_tree_fold:

let mult_v4h t_given =
  (let t_given = t_given
   and node_case = (fun ih1 ih2 k ->
                     ih2 (fun p2 ->
                           ih1 (fun p1 ->
                                 k (p1 * p2))))
   and leaf_case = (fun n k ->
                     if n = 0
                     then 0
                     else k n)
   in let rec visit t =
        match t with
        | Leaf v ->
           leaf_case v
        | Node (t1, t2) ->
           node_case (visit t1) (visit t2)
      in visit t_given) (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4h);;
#

Pablito: Then we unfold the let-expressions, which we can do since none of visit and t occur in the definienses:

let mult_v4i t_given =
  (let rec visit t =
     match t with
     | Leaf v ->
        (fun n k ->
          if n = 0
          then 0
          else k n) v
     | Node (t1, t2) ->
        (fun ih1 ih2 k ->
          ih2 (fun p2 ->
                ih1 (fun p1 ->
                      k (p1 * p2)))) (visit t1) (visit t2)
   in visit t_given) (fun p -> p);;

Halcyon: Huh, what if they did?

Pablito: Then we would need to rename visit or t, whichever occurs in the definienses.

Alfrothul: Well spotted, Pablito!

Pablito: Huh, thanks, but it’s not that complicated, look:

let foo x =
  let y = x + 1
  in let x = 10
     in x + y;;

Anton: Looking.

Alfrothul: And checking:

# foo 100;;
- : int = 111
#

Pablito: It would not be correct to unfold the let-expression that declares y, look:

let foo x =
  let x = 10
  in x + (x + 1);;

Anton: Right.

Alfrothul: To wit:

# foo 100;;
- : int = 21
#

Pablito: So instead, we rename the shadowing local name in the original definition of foo:

let foo x =
  let y = x + 1
  in let z = 10
     in z + y;;

Anton: OK.

Alfrothul: Checking, checking:

# foo 100;;
- : int = 111
#

Pablito: Thanks. And now we can correctly unfold the let-expression that declares y, look:

let foo x =
  let z = 10
  in z + (x + 1);;

Alfrothul: To wit:

# foo 100;;
- : int = 111
#

Halcyon: Right. So. Let’s get back to business:

# let () = assert (test_mult mult_v4i);;
#

Pablito: Thanks, Halcyon. Now we can apply the two function abstractions:

let mult_v4j t_given =
  (let rec visit t =
     match t with
     | Leaf v ->
        let n = v
        in (fun k ->
             if n = 0
             then 0
             else k n)
     | Node (t1, t2) ->
        let ih2 = visit t2
        and ih1 = visit t1
        in (fun k ->
             ih2 (fun p2 ->
                   ih1 (fun p1 ->
                         k (p1 * p2))))
   in visit t_given) (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4j);;
#

Pablito: Then we unfold the let-expressions, which we can do since neither of v, visit, t2, and t1 are shadowed in the body of the two let-expressions:

let mult_v4k t_given =
  (let rec visit t =
     match t with
     | Leaf v ->
        (fun k ->
          if v = 0
          then 0
          else k v)
     | Node (t1, t2) ->
        (fun k ->
          visit t2 (fun p2 ->
                     visit t1 (fun p1 ->
                                k (p1 * p2))))
   in visit t_given) (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4k);;
#

Pablito: And now we can back-commute the function abstraction fun k -> ... and the match-expression:

let mult_v4l t_given =
  (let rec visit t =
     fun k -> match t with
              | Leaf v ->
                 if v = 0
                 then 0
                 else k v
              | Node (t1, t2) ->
                 visit t2 (fun p2 ->
                            visit t1 (fun p1 ->
                                       k (p1 * p2)))
   in visit t_given) (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4l);;
#

Pablito: Now spreading a bit of syntactic sugar:

let mult_v4m t_given =
  (let rec visit t k =
     match t with
     | Leaf v ->
        if v = 0
        then 0
        else k v
     | Node (t1, t2) ->
        visit t2 (fun p2 ->
                   visit t1 (fun p1 ->
                              k (p1 * p2)))
   in visit t_given) (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4m);;
#

Pablito: And now we can back-commute the letrec-expression and the application in its body:

let mult_v4n t_given =
  let rec visit t =
    match t with
    | Leaf v ->
       (fun k ->
         if v = 0
         then 0
         else k v)
    | Node (t1, t2) ->
       (fun k ->
         visit t2 (fun p2 ->
                    visit t1 (fun p1 ->
                               k (p1 * p2))))
  in visit t_given (fun p -> p);;

Halcyon: Checking:

# let () = assert (test_mult mult_v4n);;
#

Pablito: And the result coincides with the definition of mult_v4.

Halcyon: Success!

Application: multiplying all the integers in a binary tree (continued)

Alternatively we can remain in direct style (i.e., not use a continuation) and use an exception instead:

exception Zero;;

let mult_v5 t_given =
  let rec visit t =
    match t with
    | Leaf n ->
       if n = 0
       then raise Zero
       else n
    | Node (t1, t2) ->
       visit t1 * visit t2
  in try visit t_given with
     | Zero ->
        0;;

Here is the logic of visit:

  • For any tree t, the application of visit to t completes (and yields a non-zero integer) if t does not contain 0. If t contains 0, the exception Zero is raised.
  • If the exception Zero is raised, it is caught outside the initial call to visit, and then 0 is returned, the result of multiplying all the integers in the given tree. Otherwise, the initial call to visit completes and yields a non-zero integer, the result of multiplying all integers in the given tree, and this integer is returned.

As per OCaml’s right-to-left evaluation of subexpressions in applications, each right subtree t2 is traversed (using a call to visit) before the corresponding left subtree t1 (which is traversed using another call to visit).

Since this implementation is structurally recursive, it can be expressed using binary_tree_fold:

let mult_v6 t_given =
  try binary_tree_fold (fun n ->
                         if n = 0
                         then raise Zero
                         else n)
                       ( * )
                       t_given
  with
     | Zero ->
        0;;

Both implementations pass the unit tests:

# test_mult mult_v5;;
- : bool = true
# test_mult mult_v6;;
- : bool = true
#

That the recursive traversal is interrupted once 0 is encountered is visualized as follows, using the traced version of mult_v5 found in the accompanying file:

# traced_mult_v5 show_int (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5)));;
mult_v5 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
  visit (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) ->
    visit (Node (Leaf 4, Leaf 5)) ->
      visit (Leaf 5) ->
      visit (Leaf 5) <- 5
      visit (Leaf 4) ->
      visit (Leaf 4) <- 4
    visit (Node (Leaf 4, Leaf 5)) <- 20
    visit (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0))) ->
      visit (Node (Leaf 3, Leaf 0)) ->
        visit (Leaf 0) ->
mult_v5 (Node (Node (Node (Leaf 1, Leaf 2), Node (Leaf 3, Leaf 0)), Node (Leaf 4, Leaf 5))) <- 0
- : int = 0
#

Exercise 21

  1. Implement one more version of the multiplication function over binary trees, mult_v7, using an accumulator.
  2. Express your new version using binary_tree_fold.

Hint about Exercise 21

Check out the solution for Exercise 19.

Application: testing whether a leaf satisfies a given predicate in a binary tree

The goal of this section is to implement a function that, given a predicate (i.e., a Boolean-valued function) and a binary tree, tests whether the payload of one of the leaves satisfies this predicate.

For example, the following unit-test function checks whether a given integer occurs in a given binary tree of integers:

let test_occurs_int candidate =
  let b0 = (candidate (fun i -> i = 0)
                      (Leaf 0)
            = true)
  and b1 = (candidate (fun i -> i = 0)
                      (Leaf 1)
            = false)
  and b2 = (candidate (fun i -> i = 0)
                      (Node (Leaf 1,
                             Leaf 2))
            = false)
  and b3 = (candidate (fun i -> i = 1)
                      (Node (Leaf 1,
                             Leaf 2))
            = true)
  and b4 = (candidate (fun i -> i = 2)
                      (Node (Leaf 1,
                             Leaf 2))
            = true)
  and b5 = (candidate (fun i -> i = 0)
                      (Node (Leaf 1,
                             Node (Leaf 2,
                                   Leaf 3)))
            = false)
  and b6 = (candidate (fun i -> i = 1)
                      (Node (Leaf 1,
                             Node (Leaf 2,
                                   Leaf 3)))
            = true)
  and b7 = (candidate (fun i -> i = 2)
                      (Node (Leaf 1,
                             Node (Leaf 2,
                                   Leaf 3)))
            = true)
  and b8 = (candidate (fun i -> i = 3)
                      (Node (Leaf 1,
                             Node (Leaf 2,
                                   Leaf 3)))
            = true)
  (* etc. *)
  in b0 && b1 && b2 && b3 && b4 && b5 && b6 && b7 && b8;;

The implementation is that of a routine recursive descent in the given binary tree:

let occurs_v0 p t =
  let rec visit t =
    match t with
    | Leaf n ->
       p n
    | Node (t1, t2) ->
       let b2 = visit t2
       and b1 = visit t1
       in b1 || b2
  in visit t;;

let occurs_v1 p t =
  binary_tree_fold p (||) t;;

Both implementations pass the unit tests:

# test_occurs_int occurs_v0;;
- : bool = true
# test_occurs_int occurs_v1;;
- : bool = true
#

The straightforward recursive descent is visualized as follows, using the traced version of occurs_v0 found in the accompanying file:

# traced_occurs_v0 show_int (fun i -> i = 1) (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0)));;
occurs_v0 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
  visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) ->
    visit (Leaf 0) ->
    visit (Leaf 0) <- false
    visit (Node (Leaf 2, Leaf 1)) ->
      visit (Leaf 1) ->
      visit (Leaf 1) <- true
      visit (Leaf 2) ->
      visit (Leaf 2) <- false
    visit (Node (Leaf 2, Leaf 1)) <- true
  visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) <- true
  visit (Node (Node (Leaf 5, Leaf 4), Leaf 3)) ->
    visit (Leaf 3) ->
    visit (Leaf 3) <- false
    visit (Node (Leaf 5, Leaf 4)) ->
      visit (Leaf 4) ->
      visit (Leaf 4) <- false
      visit (Leaf 5) ->
      visit (Leaf 5) <- false
    visit (Node (Leaf 5, Leaf 4)) <- false
  visit (Node (Node (Leaf 5, Leaf 4), Leaf 3)) <- false
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
occurs_v0 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
- : bool = true
#

However, since true is absorbing for disjunction, if visit encounters a leaf that safisfies the predicate—as is the case just above—there is no need to pursue the computation. Is this a job for the option type? Well, actually no, not here, because of OCaml’s short-circuit evaluation for disjunction: all we need is to not name the result of the recursive calls to visit:

let occurs_v2 p t =
  let rec visit t =
    match t with
    | Leaf n ->
       p n
    | Node (t1, t2) ->
       visit t2 || visit t1
  in visit t;;

(The right-to-left depth-first traversal is there to respect the right-to-left evaluation of subexpressions in OCaml.)

This implementation passes the unit tests:

# test_occurs_int occurs_v2;;
- : bool = true
#

More to the point, if a leaf whose payload satisfies the predicate is encountered, the rest of the tree is not traversed, as visualized below, using the traced version of occurs_v2 found in the accompanying file and the same example as above:

# traced_occurs_v2 show_int (fun i -> i = 1) (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0)));;
occurs_v2 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
  visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) ->
    visit (Leaf 0) ->
    visit (Leaf 0) <- false
    visit (Node (Leaf 2, Leaf 1)) ->
      visit (Leaf 1) ->
      visit (Leaf 1) <- true
    visit (Node (Leaf 2, Leaf 1)) <- true
  visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) <- true
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
occurs_v2 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
- : bool = true
#

And indeed once Leaf 1 is reached, the corresponding left subtree and the left subtrees above it are not traversed.

Still the nodes containing these left subtrees are touched upon one after the other as visit returns, and that is because for each right subtree, the call to visit is not a tail call.

How to make all calls to visit tail calls? With a continuation of course:

let occurs_v3 p t =
  let rec visit t k =
    match t with
    | Leaf n ->
       p n || k ()
    | Node (t1, t2) ->
       visit t2 (fun () ->
                  visit t1 k)
  in visit t (fun () -> false);;

Here is the logic of visit:

  • For any tree t and for any continuation k, when visit is applied to t and k, k is only applied if t does not contain a leaf whose payload does not satisfy the given predicate. Otherwise, k is not applied and true is returned, conveying that t contains a payload that satisfies the given predicate.
  • So if the given tree contains a payload that satisfies the given predicate, true is returned. Otherwise, if the given tree contains no payloads that satisfy the given predicate, the initial continuation (i.e., the second argument in the initial call to visit) is applied and returns false.

To conform with OCaml’s right-to-left evaluation of subexpressions, each right subtree t2 is traversed (using a tail call to visit) before the corresponding left subtree t1 (which is traversed using another tail call to visit).

This implementation passes the unit tests:

# test_occurs_int occurs_v3;;
- : bool = true
#

That the traversal is carried out iteratively and interrupted when encountering a fitting leaf is visualized as follows, using the traced version of occurs_v3 found in the accompanying file:

# traced_occurs_v3 show_int (fun i -> i = 1) (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0)));;
occurs_v3 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) ->
visit (Leaf 0) ->
continuing after Leaf 0 ->
visit (Node (Leaf 2, Leaf 1)) ->
visit (Leaf 1) ->
and stopping with true
occurs_v3 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
- : bool = true
#

Alternatively we can remain in direct style and use an exception instead:

exception Found_one;;

let occurs_v4 p t_given =
  let rec visit t =
    match t with
    | Leaf n ->
       if p n
       then raise Found_one
       else false
    | Node (t1, t2) ->
       let b2 = visit t2
       in b2 || visit t1
  in try visit t_given with
     | Found_one ->
        true;;

Here is the logic of visit:

  • For any tree t, applying visit to t completes (and yields false) if t does not contain a leaf whose payload does not satisfy the predicate. If t contains such a leaf, the exception Found_one is raised.
  • If the exception Found_one is raised, it is caught outside the initial call to visit, and then true is returned, conveying that the given tree contains a payload that satisfies the given predicate. Otherwise, the initial call to visit completes and yields false, conveying that the given tree contains no payloads that satisfy the given predicate.

To conform with OCaml’s right-to-left evaluation of subexpressions in applications, each right subtree t2 is traversed (using a non-tail call to visit) before the corresponding left subtree t1 (which is traversed using a tail call to visit). The result of evaluating visit t2 is named (using a cosmetic let-expression) to manifest that this call is not a tail call, and the result of evaluating visit t1 is not named to manifest that this call is a tail call.

This implementation passes the unit tests:

# test_occurs_int occurs_v4;;
- : bool = true
#

That the recursive traversal is interrupted when encountering a fitting leaf is visualized as follows, using the traced version of occurs_v4 found in the accompanying file:

# traced_occurs_v4 show_int (fun i -> i = 1) (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0)));;
occurs_v4 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
visit (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) ->
  visit (Node (Node (Leaf 2, Leaf 1), Leaf 0)) ->
    visit (Leaf 0) ->
    visit (Leaf 0) <- false
    visit (Node (Leaf 2, Leaf 1)) ->
      visit (Leaf 1) ->
occurs_v4 p (Node (Node (Node (Leaf 5, Leaf 4), Leaf 3), Node (Node (Leaf 2, Leaf 1), Leaf 0))) <- true
- : bool = true
#

(The indentation manifests that visit (Leaf 0) was a non-tail call and that visit (Node (Leaf 2, Leaf 1)) is a tail call.)

But now visit only returns false, never true, and therefore it might as well return the unit value:

let occurs_v5 p t =
  let rec visit t =
    match t with
    | Leaf n ->
       if p n
       then raise Found_one
       else ()
    | Node (t1, t2) ->
       let () = visit t2
       in visit t1
  in try let () = visit t in false with
     | Found_one ->
        true;;

This implementation passes the unit tests:

# test_occurs_int occurs_v5;;
- : bool = true
#

It directly corresponds to occurs_v3 in that for both of them,

  • in the Leaf case, the computation only continues if the predicate is not satisfied,
  • in the Node case, the traversal of the current left subtree only takes place if the traversal of the current right subtree completed, and
  • in occurs_v5, applying visit yields () and in occurs_v3, applying visit continues with ().

Moving on, and remembering that e1; e2 is syntactic sugar for let () = e1 in e2, this implementation can be expressed using sequencing:

let occurs_v6 p t =
  let rec visit t =
    match t with
    | Leaf n ->
       if p n
       then raise Found_one
       else ()
    | Node (t1, t2) ->
       visit t2;
       visit t1
  in try visit t; false with
     | Found_one ->
        true;;

This implementation passes the unit tests:

# test_occurs_int occurs_v6;;
- : bool = true
#

It has the hallmarks of imperative programming: sequencing and jumps.

Exercise 22

Revisit Exercise 09 in Week 11 and implement a function that tests whether a given binary tree of integers represents a well-balanced mobile, using an exception instead of an option type or a continuation.

Exercise 23

Revisit Section The logical counterpart of the map function for binary trees and implement andmap_binary_tree in a way that exploits that false is absorbing for conjunction.

Exercise 24

Revisit Section The logical counterpart of the map function for binary trees and implement ormap_binary_tree in a way that exploits that true is absorbing for disjunction.

Exercise 25

As a revisitation of Exercise 13, implement a predicate that, given a string, tests whether this string is a palindrome using a while-loop and an exception.

Resources

Version

Created [03 Nov 2022]