Demand-driven computation, revisited

The goal of this lecture note is to revisit demand-driven computation in OCaml without using the Lazy library.

Resources

Desugaring a conditional expression into a function call

Given any expressions test, consequent, and alternative, the expression if test then consequent else alternative cannot be desugared into the function call cond test consequent alternative, where cond is defined as follows:

let cond_v0 t c a =
 (* cond_v0 : bool -> 'a -> 'a *)
  if t
  then c
  else a;;

This desugaring is unsound because OCaml follows call by value and so the three actual parameters of cond_v0 (namely test, consequent, and alternative) are evaluated before cond_v0 is applied. However, only one among consequent and alternative should be evaluated, not the other. For example:

# cond_v0 true (0/1) (1/0);;
Exception: Division_by_zero.
#

Can we somehow manage to not evaluate consequent and alternative at the point of call, and only at the point of use in the definition of cond?

Actually we can, using OCaml’s Lazy library, if we desugar if test then consequent else alternative as cond test (lazy consequent) (lazy alternative) and if we define cond as follows:

let cond_v1 t delayed_c delayed_a =
 (* cond_v1 : bool -> 'a Lazy.t -> 'a Lazy.t -> 'a *)
 if t
 then Lazy.force delayed_c
 else Lazy.force delayed_a;;

To wit:

# cond_v1 true (lazy (0/1)) (lazy (1/0));;
- : int = 0
#

Indeed the evaluation of the consequent and the evaluation of the alternative are delayed when cond_v1 is applied, and only one of the resulting suspended computations is made to take place once cond_v1 has been applied and its body is evaluated.

Delaying an evaluation without using the Lazy library

In a very precise sense, the body of a function is not evaluated until this function is applied, which suggests a very simple way to desugar lazy e as fun () -> e, i.e., as a parameterless function whose body is e. As for provoking the evaluation of this body, all we need is to apply this parameterless function to the unit value:

let cond_v2 t delayed_c delayed_a =
 if t
 then delayed_c ()
 else delayed_a ();;

To wit:

# cond_v2 true (fun () -> 0/1) (fun () -> 1/0);;
- : int = 0
#

Indeed the evaluation of the consequent and the evaluation of the alternative are delayed when cond_v2 is applied, and only one of the resulting suspended computations is made to take place once cond_v2 has been applied and its body is evaluated.

So all in all,

  • to delay the evaluation of an expression e, all we need to do is to write it as fun () -> e, and
  • to force a delayed evaluation, all we need to do is to apply it to ().

In other words, we can define our own force function:

let force h =
  h ();;

Interlude

Mimer: Déjà vu, anyone?

Alfrothul: Oh no, not again.

Harald (politely): Pray elaborate?

Mimer: What is the type of force?

Alfrothul: You want a proof tree or can we ask OCaml?

Mimer: Either way works.

Harald (clickety clickety clickety click):

# let force h = h ();;
val force : (unit -> 'a) -> 'a = <fun>
#

Brynja: We’ve seen this type before.

Vigfus: Indeed we did, it was Exercise 2.b in Week 05.

Harald: Harumph.

Thunks

A thunk is a parameterless function: an expression parameterized by the unit value and that is evaluated when the thunk is applied to the unit value. Thunks provide a simple means to delay the evaluation of an expression and to make this evaluation take place on demand.

Consider, for example, the two following thunks that respectively emit "I am a lumberjack" and "I am OK" before returning 10 and 1, respectively:

let lumberjack_thunk =
  fun () -> let () = Printf.printf "I am a lumberjack\n"
            in 10;;

let ok_thunk =
  fun () -> let () = Printf.printf "I am OK\n"
            in 1;;

Each time these thunks are forced, their message is emitted:

# let a = force lumberjack_thunk
  and b = force lumberjack_thunk
  and c = force lumberjack_thunk
  and d = force lumberjack_thunk
  and e = force ok_thunk
  and f = force ok_thunk
  in a + b + c + d + e + f;;
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am OK
I am OK
- : int = 42
#

Their message, however, are not emitted unless they are forced: this computation occurs on demand.

Thunks are a practical tool for studying demand-drivenness. The following three sections are dedicated to developing a call-by-value behaviour, a call-by-name behaviour, and a call-by-need behaviour for thunks, in reference to the previous lecture note about Evaluation order.

Thunks by value

To map a thunk into a thunk that behaves as in call by value, we force it right away and return a constant thunk that, when forced, returns the value that was computed right away:

let delay_value thunk =
 (* delay_value : (unit -> 'a) -> unit -> 'a *)
  let v = force thunk
  in fun () -> v;;

For example, let us define the value counterparts of the lumberjack thunk and of the ok thunk:

let lumberjack_thunk_value =
  delay_value lumberjack_thunk;;

let ok_thunk_value =
  delay_value ok_thunk;;

In doing so, their message is emitted once and for all, and therefore replaying the scenario above silently yields the answer 42, as it should with call by value:

# let a = force lumberjack_thunk_value
  and b = force lumberjack_thunk_value
  and c = force lumberjack_thunk_value
  and d = force lumberjack_thunk_value
  and e = force ok_thunk_value
  and f = force ok_thunk_value
  in a + b + c + d + e + f;;
- : int = 42
#

Exercise 25

As an alternative delaying strategy, Loki suggested we swap the let-expression and the function abstraction in the definition of delay_value:

let delay_value_swapped thunk =
 (* delay_value_swapped : (unit -> 'a) -> unit -> 'a *)
  fun () -> let v = force thunk
            in v;;

Does this swapped definition still work? Why?

Thunks by name

To map a thunk into a thunk that behaves as in call by name, we need do nothing because a thunk already behaves as in call by name:

let delay_name thunk =
 (* delay_name : (unit -> 'a) -> unit -> 'a *)
  thunk;;

For example, let us define the name counterparts of the lumberjack thunk and of the ok thunk:

let lumberjack_thunk_name =
  delay_name lumberjack_thunk;;

let ok_thunk_value =
  delay_name ok_thunk;;

Replaying the scenario above gives the same result, since the thunks are unchanged:

# let a = force lumberjack_thunk_name
  and b = force lumberjack_thunk_name
  and c = force lumberjack_thunk_name
  and d = force lumberjack_thunk_name
  and e = force ok_thunk_name
  and f = force ok_thunk_name
  in a + b + c + d + e + f;;
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am OK
I am OK
- : int = 42
#

Thunks by need

To map a thunk into a thunk that behaves as in call by need, we need to memoize its result the first time it is forced, and that is where references (i.e., mutable values, see the lecture note about Mutable data in OCaml) come handy.

The key idea of delay_need is to map a given thunk to a new thunk that is defined in the lexical scope of a reference initialized to None. The new thunk consults this reference to determine whether the given thunk has been forced yet, and then takes action:

  • if the given thunk has been forced already, a memoized result is in store, and this result is returned; and
  • if the given thunk has not been forced already, the new thunk forces it and names the result memoized_value; it then mutates the reference to Some memoized_value (Some to signify that the thunk has already been forced, and memoized_value to record the result of the first forcing); and finally it returns the result of forcing the thunk, i.e., the memoized value.

In OCaml:

let delay_need thunk =
 (* delay_need : (unit -> 'a) -> unit -> 'a *)
  let memory_cell = ref None
  in fun () -> match !memory_cell with
               | Some memoized_value ->
                  memoized_value
               | None ->
                  let memoized_value = force thunk
                  in (memory_cell := Some memoized_value;
                      memoized_value);;

For example, let us define the need counterparts of the lumberjack thunk and of the ok thunk:

let lumberjack_thunk_need =
  delay_need lumberjack_thunk;;

let ok_thunk_need =
  delay_need ok_thunk;;

Replaying the scenario above still yields the answer 42, but it does so by emitting the messages I am a lumberjack and I am OK once, as it should with call by need:

# let a = force lumberjack_thunk_need
  and b = force lumberjack_thunk_need
  and c = force lumberjack_thunk_need
  and d = force lumberjack_thunk_need
  and e = force ok_thunk_need
  and f = force ok_thunk_need
  in a + b + c + d + e + f;;
I am a lumberjack
I am OK
- : int = 42
#

Thunks that behave by need, i.e., that memoize the result of the computation they delay, are often referred to as memo-thunks.

Exercise 26

As an alternative delaying strategy, Loki suggested we swap the let-expression and the function abstraction in the definition of delay_need:

let delay_need_swapped thunk =
 (* delay_need_swapped : (unit -> 'a) -> unit -> 'a *)
  fun () -> let memory_cell = ref None
            in match !memory_cell with
               | Some memoized_value ->
                  memoized_value
               | None ->
                  let memoized_value = force thunk
                  in (memory_cell := Some memoized_value;
                      memoized_value);;

Does this swapped definition still work? Why?

Demand-driven computation in OCaml, revisited

Let use revisit the initial section about Demand-driven computation in OCaml, to illustrate our encoding of demand-driven computation in the form of:

  • a polymorphic type, unit -> 'a, which is the type of suspended computations,
  • a delaying syntactic form, delay_need (fun () -> e), for any expression e, which yields a suspended computation, and
  • a forcing function, force : (unit -> 'a) -> 'a, which, given a suspended computation, triggers this computation.

Let us illustrate this point with the tracing identify function for integers:

let force thunk =
  thunk ();;

let an_int n =
  let () = Printf.printf "processing %d...\n" n
  in n;;

If we delay the application of an_int to an integer, its trace is not emitted until the first time the corresponding suspended computation is triggered. Henceforth this suspended computation is completed, no matter how many times we attempt to trigger it:

# let thirty_three = delay_need (fun () -> an_int 33);;
val thirty_three : unit -> int = <fun>
# thirty_three;;
- : unit -> int = <fun>
# force thirty_three;;
processing 33...
- : int = 33
# force thirty_three;;
- : int = 33
# thirty_three;;
- : unit -> int = <fun>
#

Conversely, if a computation is suspended and not triggered, it does not take place:

# let _ = delay_need (fun () -> an_int 100) in 10;;
- : int = 10
# (fun _ -> 10) (delay_need (fun () -> an_int 100));;
- : int = 10
#

Furthermore, demands can be nested.

A simple example of nested demands, revisited

As in Section A simple example of nested demands, in the following example, z denotes is a suspended computation that, when triggered, triggers the suspended denotation denoted by y, which in turn triggers the suspended denotation denoted by x:

let add n1 n2 =
  let () = Printf.printf "adding %d to %d...\n" n1 n2
  in n1 + n2;;

let x = delay_need (fun () -> an_int 1);;

let y = delay_need (fun () -> add (force x) (an_int 10));;

let z = delay_need (fun () -> add (force y) (an_int 100));;

The scenario from Section A simple example of nested demands (first force z twice in a row, and then y, and then x) gives rise to the same effects for the same reasons:

# force z;;
processing 100...
processing 10...
processing 1...
adding 1 to 10...
adding 11 to 100...
- : int = 111
# force z;;
- : int = 111
# force y;;
- : int = 11
# force x;;
- : int = 1
#

The same simple example of nested demands, just different, revisited

As in Section The same simple example of nested demands, just different, in the following analogous example, c denotes is a suspended computation that, when triggered, triggers the suspended denotation denoted by b, which in turn triggers the suspended denotation denoted by a:

let a = delay_need (fun () -> an_int 1);;

let b = delay_need (fun () -> add (force a) (an_int 10));;

let c = delay_need (fun () -> add (force b) (an_int 100));;

(The original example had x instead of a, y instead of b, and z instead of c, and we forced z.)

The scenario from The same simple example of nested demands, just different (first force b twice in a row, and then a, and then c twice in a row) gives rise to the same effects for the same reasons:

# force b;;
processing 10...
processing 1...
adding 1 to 10...
- : int = 11
# force b;;
- : int = 11
# force a;;
- : int = 1
# force c;;
processing 100...
adding 11 to 100...
- : int = 111
# force c;;
- : int = 111
#

Summa summorum: demand-driven computation

To sum up, we have provided the same expressive power as in the Lazy library:

  • computations are suspended using delay_need (fun () -> ...) and triggered on demand using force;
  • the first time a suspended computation is triggered, its result is memoized and subsequently reused if it is needed; and
  • triggering a suspended computation may demand a cascade of other suspended computations to be triggered.

Resources

Version

Recast [24 mar 2020]

Adjusted the summary [17 Apr 2019]

Revisited the sections from Chapter Demand-driven computation using delay_need and force [12 Apr 2019]

Created [08 Apr 2019]