Mini-project: The underlying determinism of OCaml

Warning

OCaml is a statically typed programming language, and so any OCaml expression has a type and is type-correct.

OCaml is a deterministic programming language in that it has a fixed processing order, and the goal of this mini-project is to become aware of this processing order.

To this end, here is a family of tracing identity functions:

let an_int n =
 (* an_int : int -> int *)
  let () = Printf.printf "processing %d...\n" n in
  n;;

let a_bool b =
 (* a_bool : bool -> bool *)
  let () = Printf.printf "processing %B...\n" b in
  b;;

let a_char c =
 (* a_char : char -> char *)
  let () = Printf.printf "processing '%c'...\n" c in
  c;;

let a_string s =
 (* a_string : string -> string *)
  let () = Printf.printf "processing \"%s\"...\n" s in
  s;;

let a_unit () =
 (* a_unit : unit -> unit *)
  let () = Printf.printf "processing the unit value...\n" in
  ();;

let a_function f =
 (* a_function : ('a -> 'b) -> 'a -> 'b *)
  let () = Printf.printf "processing a function...\n" in
  fun x -> f x;;

let a_function' name f =
 (* a_function' : string -> ('a -> 'b) -> 'a -> 'b *)
  let () = Printf.printf "processing the %s function...\n" name in
  fun x -> f x;;

Resources

Question 01

In an addition (+ in infix notation), are the operands (i.e., the arguments of +) evaluated from left to right or from right to left?

Answer to Question 01

Let’s try:

# an_int 1 + an_int 10;;
processing 10...
processing 1...
- : int = 11
#

Since the processing order of OCaml is fixed, this example is enough to conclude that the operands of an addition are evaluated from right to left. Not just for addition, but also for the other arithmetic operations:

# an_int 10 - an_int 1;;
processing 1...
processing 10...
- : int = 9
# an_int 5 * an_int 4;;
processing 4...
processing 5...
- : int = 20
# an_int 5 / an_int 0;;
processing 0...
processing 5...
Exception: Division_by_zero.
#

Interlude

Alfrothul: Does the answer scale to nested operations?

Anton: Nested operations.

Alfrothul: Well, yes, you know, (1 + 10) + 100, for example.

Anton: Ah, right. If so, 100 should be evaluated first.

Alfrothul: Right. And then 10, and then 1.

Anton: Well, let’s try:

# (an_int 1 + an_int 10) + an_int 100;;
processing 100...
processing 10...
processing 1...
- : int = 111
#

Alfrothul: Wait – we should also trace the result of the inner addition:

# an_int (an_int 1 + an_int 10) + an_int 100;;
processing 100...
processing 10...
processing 1...
processing 11...
- : int = 111
#

Anton: Ah, right, then both operands of the outer addition are traced.

Alfrothul: Just to make sure, let me reassociate the additions.

Anton: Reassociate the additions.

Alfrothul: Well, yes, you know, 1 + (10 + 100).

Anton: Ah, right. If so, 100 should be evaluated first, and then 10.

Alfrothul: And then the right operand would be evaluated, and a trace could be emitted containing its result.

Anton: Indeed. And then 1 would be evaluated, and the result would be 111. Let me try:

# an_int 1 + an_int (an_int 10 + an_int 100);;
processing 100...
processing 10...
processing 110...
processing 1...
- : int = 111
#

Alfrothul: Yup. Scientific method, here we come.

Pablito: So evaluating an_int (an_int 1 + an_int 10) + an_int (an_int 100 + an_int 1000) should first emit processing 1000..., and then processing 100..., and then processing 1100..., and then processing 10..., and then processing 1..., and then processing 11..., and then the result would be 1111, right?

Alfrothul: Why don’t you give it a try.

Pablito: I also want to try an_int (an_int 1 / an_int 0) + an_int 10.

Halcyon: What about an_int (an_int (an_int 1 / an_int 0) + an_int 10)?

Question 02

In a function application, are the two sub-expressions (namely the one in position of a function, on the left, and the one that is the actual parameter of the function, i.e., the argument, on the right) evaluated from left to right or from right to left?

Subsidiary question:

  • Is this order compatible with the answer to Question 01?

Question 03

When a tuple is constructed, in which order were its components evaluated?

Question 04

Given three expressions, e0, e1, and e2, are the two following expressions observationally equivalent (i.e., does evaluating them carry out the same computation?):

(fun x1 -> fun x2 -> e0) e1 e2

(fun (x1, x2) -> e0) (e1, e2)

Subsidiary questions:

  • Is your answer compatible with your answers to Question 02 and Question 03?
  • Assuming that e0, x1, and x2 all have type int, what are the types of fun x1 -> fun x2 -> e0 and fun (x1, x2) -> e0? Have we seen these types before?
Anton: I have seen rhetorical questions before...
Alfrothul: ...and the very last question is definitely one.

Question 05

Given two expressions, e0 and e1, are the two following expressions observationally equivalent:

(fun x1 -> e0) e1

let x1 = e1 in e0

Question 06

In a (local or global) let-expression declaring several bindings at once, i.e., with and, are the definienses evaluated from left to right or from right to left?

Question 07

Given three expressions, e0, e1, and e2, are the two following expressions observationally equivalent:

let x1 = e1 and x2 = e2 in e0

let (x1, x2) = (e1, e2) in e0

Question 08

In a boolean conjunction (&& in infix notation), are the conjuncts (i.e., the arguments of &&, of type bool) evaluated from left to right or from right to left?

To answer this question completely, make sure to consider the 4 possible cases.

Subsidiary questions:

  • Why does this design make sense?
  • Where else could this design make sense?

Pure and impure expressions

A bit of terminology first:

  • An OCaml expression is said to be “impure” if evaluating it
    • incurs a side effect (e.g., a trace),
    • raises an error (e.g., due to a failed assertion, to a division by 0, etc.), or
    • diverges, i.e., seemingly yields no value.
  • If evaluating an OCaml expression incurs no side effects, raises no error, and yields a value, it is said to be “pure”.

So we can sketch a grammar of pure expressions as a sub-grammar of expressions:

n ::= ...a number...

b ::= ...a boolean...

c ::= ...a character...

s ::= ...a string...

x ::= ...a name...

e ::= ...an expression...

pe ::= n | b | c | s |
     | fun formal -> e
     | x
     | (pe, ..., pe)
     | if pe then pe else pe
     | let formal = pe and ... and formal = pe in pe

The non-terminal e is for OCaml expressions in general, and the non-terminal pe is for pure expressions. So, to a reasonable approximation, a pure expression can be

  • a literal (number, Boolean, character, or string),
  • a function abstraction,
  • a variable (since variables denote values),
  • a tuple of pure expressions, including the empty tuple,
  • a conditional expression whose test, consequent, and alternative are pure expressions, and
  • a let-expression whose definienses and body are pure expressions.

We can also classify [operations on ground values that do not raise errors] as pure. So for example, the addition of two pure expressions is a pure expression, but not the division of two pure expressions because of its potential for dividing by 0.

This classification is both under-approximative and over-approximative because some pure expressions do not fit this grammar (e.g., evaluating the application of the identity function to a pure expression incurs no side effects, raises no error, and yields a value) and because an expression that fits this grammar of pure expressions can be so big that processing it exhausts the computational resources of the implementation of OCaml.

That said, the following questions assume that we can distinguish between expressions that are pure for sure (written pe), and expressions that might be impure (written e).

Loki (ever the helpful one): The technical term is “without loss of generality.”

Question 09

  1. For any pure expression pe of type int, would it be valid to simplify pe * 0 into 0?
  2. For any potentially impure expression e of type int, would it be valid to simplify e * 0 into 0?
  3. For any pure expression pe of type int, would it be valid to simplify pe * 1 into pe?
  4. For any potentially impure expression e of type int, would it be valid to simplify e * 1 into e?

Briefly justify your answers and feel free to peruse the food for thought about lexical scope below for examples of how to describe the evaluation of a composite OCaml expression.

Question 10

Let e1 and e2 be potentially impure expressions.

  1. Are the two following expressions observationally equivalent:

    let x1 = e1 and x2 = e2 in (x1, x2);;
    
    let x2 = e2 and x1 = e1 in (x1, x2);;
    
  2. Are the two following expressions observationally equivalent:

    let x1 = e1 in let x2 = e2 in (x1, x2);;
    
    let x2 = e2 in let x1 = e1 in (x1, x2);;
    

Briefly justify your answers and feel free to peruse the food for thought about lexical scope below for examples of how to describe the evaluation of a composite OCaml expression.

Question 11

Let pe1 and pe2 be pure expressions.

  1. Are the two following expressions observationally equivalent:

    let x1 = pe1 and x2 = pe2 in (x1, x2);;
    
    let x2 = pe2 and x1 = pe1 in (x1, x2);;
    
  2. Are the two following expressions observationally equivalent:

    let x1 = pe1 in let x2 = pe2 in (x1, x2);;
    
    let x2 = pe2 in let x1 = pe1 in (x1, x2);;
    

Briefly justify your answers and feel free to peruse the food for thought about lexical scope below for examples of how to describe the evaluation of a composite OCaml expression.

Question 12

In mathematics, a function is said to be strict if it uses its argument, and non-strict if it does not. For example, in the syntax of OCaml, fun x -> x is such a strict function and fun _ -> 42 is such a non-strict function. So in mathematics, applying this non-strict function to any argument yields 42, always. Is that the case in OCaml too?

Feel free to revisit Question 02 and Question 05.

Food for thought about lexical scope

The tracing identity functions make it possible to visualize lexical scope at work. Revisiting Exercise 09 in Week 04, what are the observable effects and what is the result of evaluating each of the following expressions? Analyze and justify these traces and these results:

let a_name s x =
 (* a_name : string -> 'a -> 'a *)
 let () = Printf.printf "looking up %s...\n" s in
 x;;

let x = an_int 1
in let foo = a_function (fun () ->
                           let () = Printf.printf "evaluating the body of the function denoted by foo\n" in
                           a_name "x" x)
   in let x = a_string "hello"
      in ((a_name "foo" foo) (a_unit ()), a_name "x" x);;

let x = an_int 1
in let bar = a_function (fun y ->
                           let () = Printf.printf "evaluating the body of the function denoted by bar\n" in
                           a_name "x" x + a_name "y" y)
   in let x = a_string "hello"
      in ((a_name "bar" bar) (an_int 10), a_name "x" x);;

N.B.: The point here is that the functions denoted by foo and bar are declared in an environment where x denotes 1, and that they are applied in an environment where x denotes hello.

Solution for the first expression

Let us first consider the untraced version of the first expression:

let x = 1
in let foo = fun () ->
               x
   in let x = "hello"
      in (foo (), x);;

To evaluate let x = 1 in ... in a given environment E0, we first evaluate its definiens, i.e., 1, in E0.

Evaluating 1 in E0 yields the integer 1.

Evaluating this let-expression in E0 then reduces to evaluating its body, i.e., let foo = fun () -> x in ..., in an environment, ah, let’s name it E1, that extends E0 with the binding of x to 1: E1 = (x = 1), E0.

To evaluate let foo = fun () -> x in ... in E1, we first evaluate its definiens, i.e., fun () -> x, in E1.

Evaluating fun () -> x in E1 yields a function that, when applied to the unit value, will evaluate x in E1.

Evaluating this let-expression in E1 then reduces to evaluating its body, i.e., let x = "hello" in ..., in an environment, ah, let’s name it E2, that extends E1 with the binding of foo to the function that, when applied to the unit value, will evaluate x in E1: E2 = (foo = (the function that, when applied to the unit value, will evaluate x in E1)), E1.

To evaluate let x = "hello" in ... in E2, we first evaluate its definiens, i.e., "hello", in E2.

Evaluating "hello" in E2 yields the string “hello”.

Evaluating this let-expression in E2 then reduces to evaluating its body, i.e., (foo (), x), in an environment, ah, let’s name it E3, that extends E2 with the binding of x to “hello”: E3 = (x = “hello”), E2.

To evaluate (foo (), x) in E3, as per the answer to Question 03, we first evaluate x in E3 and then foo () in E3.

Evaluating x in E3 yields the denotation of x in E3, i.e., “hello”.

To evaluate foo () in E3, as per the answer to Question 02, we first evaluate () in E3 and then foo in E3.

Evaluating () in E3 yields the unit value ().

Evaluating foo in E3 yields the denotation of foo in E3, i.e., the function that, when applied to the unit value, will evaluate x in E1.

Evaluating this application in E3 then reduces to applying the result of evaluating foo in E3 – which is a function since OCaml is statically typed – to the result of evaluating () in E3.

Applying this function to () reduces to evaluating x in E1.

Evaluating x in E1 yields the denotation of x in E1, i.e., 1.

The result of evaluating (foo (), x) in E3 is therefore the pair (1, “hello”).

This series of events is confirmed by the trace emitted by OCaml when the first expression is evaluated:

processing 1...
processing a function...
processing "hello"...
looking up x...
processing the unit value...
looking up foo...
evaluating the body of the function denoted by foo
looking up x...
- : int * string = (1, "hello")
#

Solution for the second expression

Let us first consider the untraced version of the second expression:

let x = 1
in let bar = fun y ->
               x + y
   in let x = "hello"
      in (bar 10, x);;

To evaluate let x = 1 in ... in a given environment E0, we first evaluate its definiens, i.e., 1, in E0.

Evaluating 1 in E0 yields the integer 1.

Evaluating this let-expression in E0 then reduces to evaluating its body, i.e., let bar = fun y -> x + y in ..., in an environment, ah, let’s name it E1, that extends E0 with the binding of x to 1: E1 = (x = 1), E0.

To evaluate let bar = fun y -> x + y in ... in E1, we first evaluate its definiens, i.e., fun y -> x + y, in E1.

Evaluating fun y -> x + y, in E1 yields a function that, when applied to an integer, will evaluate x + y in an environment, ah, let’s name it E1’, that extends E1 with the binding of y to this integer: E1’ = (y = this integer), E1.

Evaluating this let-expression in E1 then reduces to evaluating its body, i.e., let x = "hello" in ..., in an environment, ah, let’s name it E2, that extends E1 with the binding of bar to the function that, when applied to an integer, will evaluate x + y in E1’: E2 = (bar = (the function that, when applied to an integer, will evaluate x + y in E1’)), E1.

To evaluate let x = "hello" in ... in E2, we first evaluate its definiens, i.e., "hello", in E2.

Evaluating "hello" in E2 yields the string “hello”.

Evaluating this let-expression in E2 then reduces to evaluating its body, i.e., (bar 10, x), in an environment, ah, let’s name it E3, that extends E2 with the binding of x to “hello”: E3 = (x = “hello”), E2.

To evaluate (bar 10, x) in E3, as per the answer to Question 03, we first evaluate x in E3 and then bar 10 in E3.

Evaluating x in E3 yields the denotation of x in E3, i.e., “hello”.

To evaluate bar 10 in E3, as per the answer to Question 02, we first evaluate 10 in E3 and then bar in E3.

Evaluating 10 in E3 yields the integer 10.

Evaluating bar in E3 yields the denotation of bar in E3, i.e., the function that, when applied to an integer, will evaluate x + y in E1’.

Evaluating this application in E3 then reduces to applying the result of evaluating bar in E3 – which is a function since OCaml is statically typed – to the result of evaluating 10 in E3.

Applying this function to 10 reduces to evaluating x + y in E1’ = (y = 10), E1.

To evaluate x + y in E1’, as per the answer to Question 01, we first evaluate y in E1’ and then x in E1’.

Evaluating y in E1’ yields the denotation of y in E1’, i.e., 10.

Evaluating x in E1’ yields the denotation of x in E1’, i.e., 1.

Evaluating x + y in E1’ then reduces to adding 1 and 10, which yields 11.

The result of evaluating (bar 10, x) in E3 is therefore the pair (11, “hello”).

This series of events is confirmed by the trace emitted by OCaml when the second expression is evaluated:

processing 1...
processing a function...
processing "hello"...
looking up x...
processing 10...
looking up bar...
evaluating the body of the function denoted by bar
looking up y...
looking up x...
- : int * string = (11, "hello")
#

Resources

Version

Changed the non-terminal for pure expressions to be pe [04 Mar 2023]

Streamlined Question 07 [01 Feb 2023]

Created [10 Jan 2023]