The goal of this lecture note is to revisit demand-driven computation in OCaml without using the Lazy library.
Given any expressions test, consequent, and alternative, the expression if test then consequent else alternative cannot be desugared into the function call cond test consequent alternative, where cond is defined as follows:
let cond_v0 t c a =
(* cond_v0 : bool -> 'a -> 'a *)
if t
then c
else a;;
This desugaring is unsound because OCaml follows call by value and so the three actual parameters of cond_v0 (namely test, consequent, and alternative) are evaluated before cond_v0 is applied. However, only one among consequent and alternative should be evaluated, not the other. For example:
# cond_v0 true (0/1) (1/0);;
Exception: Division_by_zero.
#
Can we somehow manage to not evaluate consequent and alternative at the point of call, and only at the point of use in the definition of cond?
Actually we can, using OCaml’s Lazy library, if we desugar if test then consequent else alternative as cond test (lazy consequent) (lazy alternative) and if we define cond as follows:
let cond_v1 t delayed_c delayed_a =
(* cond_v1 : bool -> 'a Lazy.t -> 'a Lazy.t -> 'a *)
if t
then Lazy.force delayed_c
else Lazy.force delayed_a;;
To wit:
# cond_v1 true (lazy (0/1)) (lazy (1/0));;
- : int = 0
#
Indeed the evaluation of the consequent and the evaluation of the alternative are delayed when cond_v1 is applied, and only one of the resulting suspended computations is made to take place once cond_v1 has been applied and its body is evaluated.
In a very precise sense, the body of a function is not evaluated until this function is applied, which suggests a very simple way to desugar lazy e as fun () -> e, i.e., as a parameterless function whose body is e. As for provoking the evaluation of this body, all we need is to apply this parameterless function to the unit value:
let cond_v2 t delayed_c delayed_a =
if t
then delayed_c ()
else delayed_a ();;
To wit:
# cond_v2 true (fun () -> 0/1) (fun () -> 1/0);;
- : int = 0
#
Indeed the evaluation of the consequent and the evaluation of the alternative are delayed when cond_v2 is applied, and only one of the resulting suspended computations is made to take place once cond_v2 has been applied and its body is evaluated.
So all in all,
In other words, we can define our own force function:
let force h =
h ();;
Mimer: Déjà vu, anyone?
Alfrothul: Oh no, not again.
Harald (politely): Pray elaborate?
Mimer: What is the type of force?
Alfrothul: You want a proof tree or can we ask OCaml?
Mimer: Either way works.
Harald (clickety clickety clickety click):
# let force h = h ();;
val force : (unit -> 'a) -> 'a = <fun>
#
Brynja: We’ve seen this type before.
Vigfus: Indeed we did, it was Exercise 2.b in Week 05.
Harald: Harumph.
A thunk is a parameterless function: an expression parameterized by the unit value and that is evaluated when the thunk is applied to the unit value. Thunks provide a simple means to delay the evaluation of an expression and to make this evaluation take place on demand.
Consider, for example, the two following thunks that respectively emit "I am a lumberjack" and "I am OK" before returning 10 and 1, respectively:
let lumberjack_thunk =
fun () -> let () = Printf.printf "I am a lumberjack\n"
in 10;;
let ok_thunk =
fun () -> let () = Printf.printf "I am OK\n"
in 1;;
Each time these thunks are forced, their message is emitted:
# let a = force lumberjack_thunk
and b = force lumberjack_thunk
and c = force lumberjack_thunk
and d = force lumberjack_thunk
and e = force ok_thunk
and f = force ok_thunk
in a + b + c + d + e + f;;
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am OK
I am OK
- : int = 42
#
Their message, however, are not emitted unless they are forced: this computation occurs on demand.
Thunks are a practical tool for studying demand-drivenness. The following three sections are dedicated to developing a call-by-value behaviour, a call-by-name behaviour, and a call-by-need behaviour for thunks, in reference to the previous lecture note about Evaluation order.
To map a thunk into a thunk that behaves as in call by value, we force it right away and return a constant thunk that, when forced, returns the value that was computed right away:
let delay_value thunk =
(* delay_value : (unit -> 'a) -> unit -> 'a *)
let v = force thunk
in fun () -> v;;
For example, let us define the value counterparts of the lumberjack thunk and of the ok thunk:
let lumberjack_thunk_value =
delay_value lumberjack_thunk;;
let ok_thunk_value =
delay_value ok_thunk;;
In doing so, their message is emitted once and for all, and therefore replaying the scenario above silently yields the answer 42, as it should with call by value:
# let a = force lumberjack_thunk_value
and b = force lumberjack_thunk_value
and c = force lumberjack_thunk_value
and d = force lumberjack_thunk_value
and e = force ok_thunk_value
and f = force ok_thunk_value
in a + b + c + d + e + f;;
- : int = 42
#
As an alternative delaying strategy, Loki suggested we swap the let-expression and the function abstraction in the definition of delay_value:
let delay_value_swapped thunk =
(* delay_value_swapped : (unit -> 'a) -> unit -> 'a *)
fun () -> let v = force thunk
in v;;
Does this swapped definition still work? Why?
To map a thunk into a thunk that behaves as in call by name, we need do nothing because a thunk already behaves as in call by name:
let delay_name thunk =
(* delay_name : (unit -> 'a) -> unit -> 'a *)
thunk;;
For example, let us define the name counterparts of the lumberjack thunk and of the ok thunk:
let lumberjack_thunk_name =
delay_name lumberjack_thunk;;
let ok_thunk_value =
delay_name ok_thunk;;
Replaying the scenario above gives the same result, since the thunks are unchanged:
# let a = force lumberjack_thunk_name
and b = force lumberjack_thunk_name
and c = force lumberjack_thunk_name
and d = force lumberjack_thunk_name
and e = force ok_thunk_name
and f = force ok_thunk_name
in a + b + c + d + e + f;;
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am a lumberjack
I am OK
I am OK
- : int = 42
#
To map a thunk into a thunk that behaves as in call by need, we need to memoize its result the first time it is forced, and that is where references (i.e., mutable values, see the lecture note about Mutable data in OCaml) come handy.
The key idea of delay_need is to map a given thunk to a new thunk that is defined in the lexical scope of a reference initialized to None. The new thunk consults this reference to determine whether the given thunk has been forced yet, and then takes action:
In OCaml:
let delay_need thunk =
(* delay_need : (unit -> 'a) -> unit -> 'a *)
let memory_cell = ref None
in fun () -> match !memory_cell with
| Some memoized_value ->
memoized_value
| None ->
let memoized_value = force thunk
in (memory_cell := Some memoized_value;
memoized_value);;
For example, let us define the need counterparts of the lumberjack thunk and of the ok thunk:
let lumberjack_thunk_need =
delay_need lumberjack_thunk;;
let ok_thunk_need =
delay_need ok_thunk;;
Replaying the scenario above still yields the answer 42, but it does so by emitting the messages I am a lumberjack and I am OK once, as it should with call by need:
# let a = force lumberjack_thunk_need
and b = force lumberjack_thunk_need
and c = force lumberjack_thunk_need
and d = force lumberjack_thunk_need
and e = force ok_thunk_need
and f = force ok_thunk_need
in a + b + c + d + e + f;;
I am a lumberjack
I am OK
- : int = 42
#
Thunks that behave by need, i.e., that memoize the result of the computation they delay, are often referred to as memo-thunks.
As an alternative delaying strategy, Loki suggested we swap the let-expression and the function abstraction in the definition of delay_need:
let delay_need_swapped thunk =
(* delay_need_swapped : (unit -> 'a) -> unit -> 'a *)
fun () -> let memory_cell = ref None
in match !memory_cell with
| Some memoized_value ->
memoized_value
| None ->
let memoized_value = force thunk
in (memory_cell := Some memoized_value;
memoized_value);;
Does this swapped definition still work? Why?
Let use revisit the initial section about Demand-driven computation in OCaml, to illustrate our encoding of demand-driven computation in the form of:
Let us illustrate this point with the tracing identify function for integers:
let force thunk =
thunk ();;
let an_int n =
let () = Printf.printf "processing %d...\n" n
in n;;
If we delay the application of an_int to an integer, its trace is not emitted until the first time the corresponding suspended computation is triggered. Henceforth this suspended computation is completed, no matter how many times we attempt to trigger it:
# let thirty_three = delay_need (fun () -> an_int 33);;
val thirty_three : unit -> int = <fun>
# thirty_three;;
- : unit -> int = <fun>
# force thirty_three;;
processing 33...
- : int = 33
# force thirty_three;;
- : int = 33
# thirty_three;;
- : unit -> int = <fun>
#
Conversely, if a computation is suspended and not triggered, it does not take place:
# let _ = delay_need (fun () -> an_int 100) in 10;;
- : int = 10
# (fun _ -> 10) (delay_need (fun () -> an_int 100));;
- : int = 10
#
Furthermore, demands can be nested.
As in Section A simple example of nested demands, in the following example, z denotes is a suspended computation that, when triggered, triggers the suspended denotation denoted by y, which in turn triggers the suspended denotation denoted by x:
let add n1 n2 =
let () = Printf.printf "adding %d to %d...\n" n1 n2
in n1 + n2;;
let x = delay_need (fun () -> an_int 1);;
let y = delay_need (fun () -> add (force x) (an_int 10));;
let z = delay_need (fun () -> add (force y) (an_int 100));;
The scenario from Section A simple example of nested demands (first force z twice in a row, and then y, and then x) gives rise to the same effects for the same reasons:
# force z;;
processing 100...
processing 10...
processing 1...
adding 1 to 10...
adding 11 to 100...
- : int = 111
# force z;;
- : int = 111
# force y;;
- : int = 11
# force x;;
- : int = 1
#
As in Section The same simple example of nested demands, just different, in the following analogous example, c denotes is a suspended computation that, when triggered, triggers the suspended denotation denoted by b, which in turn triggers the suspended denotation denoted by a:
let a = delay_need (fun () -> an_int 1);;
let b = delay_need (fun () -> add (force a) (an_int 10));;
let c = delay_need (fun () -> add (force b) (an_int 100));;
(The original example had x instead of a, y instead of b, and z instead of c, and we forced z.)
The scenario from The same simple example of nested demands, just different (first force b twice in a row, and then a, and then c twice in a row) gives rise to the same effects for the same reasons:
# force b;;
processing 10...
processing 1...
adding 1 to 10...
- : int = 11
# force b;;
- : int = 11
# force a;;
- : int = 1
# force c;;
processing 100...
adding 11 to 100...
- : int = 111
# force c;;
- : int = 111
#
To sum up, we have provided the same expressive power as in the Lazy library:
Recast [24 mar 2020]
Adjusted the summary [17 Apr 2019]
Revisited the sections from Chapter Demand-driven computation using delay_need and force [12 Apr 2019]
Created [08 Apr 2019]