The goal of this lecture note is to introduce mutable data in OCaml:
Fun fact: as in mathematics and unlike in imperative languages, variables do not vary in OCaml. There are, however, cases where we need a degree of mutability in the course of a computation, and that is why OCaml also features a type of mutable values: references. A reference is
Like that of options and lists, the type of a reference is polymorphic. Its name is the same as its constructor, namely ref:
REF | G |- e : t | |
G |- ref e : t ref |
DEREF | G |- e : t ref | |
G |- !e : t |
MUTATE | G |- e1 : t ref | G |- e2 : t | |
G |- e1 := e2 : unit |
In words:
First of all, let us look at the ref constructor:
# ref;;
- : 'a -> 'a ref = <fun>
#
We can see that it is a unary function that, given a value, returns a reference to this value:
Let us interactively construct the reference to an integer:
# ref 33;;
- : int ref = {contents = 33}
#
Let us interactively construct the reference to a Boolean:
# ref true;;
- : bool ref = {contents = true}
#
Let us interactively construct the reference to a string:
# ref "hello";;
- : string ref = {contents = "hello"}
#
We can construct references to any value:
for example, pairs of integers:
# ref (1, 10);;
- : (int * int) ref = {contents = (1, 10)}
#
for example, pairs of other ground values than integers:
# ref (33, true);;
- : (int * bool) ref = {contents = (33, true)}
#
for example, functions:
# ref (fun i -> i + 1);;
- : (int -> int) ref = {contents = <fun>}
#
for example, references:
# ref (ref "hello");;
- : string ref ref = {contents = {contents = "hello"}}
#
Since references are values, they can occur in other values:
for example, here is a pair of references:
# (ref false, ref 100);;
- : bool ref * int ref = ({contents = false}, {contents = 100})
#
for example, here is a reference to a pair of references:
# ref (ref false, ref 100);;
- : (bool ref * int ref) ref =
{contents = ({contents = false}, {contents = 100})}
#
References can occur in type declarations:
for example, here is the type of binary trees with mutable leaves:
type 'a binary_tree_with_mutable_leaves =
| Leaf of 'a ref
| Node of 'a binary_tree_with_mutable_leaves * 'a binary_tree_with_mutable_leaves;;
for example, here is the type of binary trees with mutable nodes:
type 'a binary_tree_with_mutable_nodes =
| Leaf of 'a
| Node of 'a binary_tree_with_mutable_nodes ref * 'a binary_tree_with_mutable_nodes ref;;
Given a reference, the ! unary operator yields the value that is referred to.
For example, given the reference to an integer, we can obtain this integer by applying ! to it:
# let r = ref 33 in !r;;
- : int = 33
#
For example, given the reference to the reference to an integer, we can obtain this integer by applying ! to it twice in a row:
# let r = ref (ref 33) in !(!r);;
- : int = 33
#
For example, here is a function that given a pair of references, returns the pair of the values that are referred to in the given pair:
# fun (r1, r2) -> (!r1, !r2);;
- : 'a ref * 'b ref -> 'a * 'b = <fun>
#
The infix operator :=, given a reference to a value of type t and a new value of type t, modifies the reference so that it refers to the new value. The result is of type unit.
For example, in the following expression, a reference to an integer, 1, is locally created and named r:
let r = ref 1 (* <-- creation of a reference *)
in let x = !r (* <-- dereference *)
in let () = r := 10 in (* <-- mutation of the reference *)
let y = !r (* <-- dereference *)
in x + y;;
In the scope of the declaration of r,
To wit:
# let r = ref 1
in let x = !r
in let () = r := 10 in
let y = !r
in x + y;;
- : int = 11
#
More atomically, we can
declare a reference:
# let r = ref 1;;
val r : int ref = {contents = 1}
#
mutate this reference:
# r := 10;;
- : unit = ()
#
dereference this reference:
# !r;;
- : int = 10
#
Back in Week 05 (the General interlude (continued) where an updated version of the String module was introduced), this updated module featured an imperative simulation of String.mapi in terms of String.map:
let string_mapi f s =
let i = ref ~-1
in String.map (fun c ->
let () = i := succ !i in
f !i c)
s;;
Its rationale was that since String.map applies its first argument to each character of its second argument from left to right, we can locally define a reference to an index, and imperatively increment this index each time the first argument is applied. The first argument of string_mapi is then successively applied
...
Adjust the imperative simulation just above so that given a string of length n (represented as n), the first argument of the adjusted function is successively applied
...
Pablito: We can compute the length of a string using String.map and a reference.
Halcyon: We can?
Pablito: Well yes – the mapped function is applied to each character of the given string, so if it increments a counter that was initialized with 0, the resulting counter will end with the number of characters in the string, i.e., its length. Look:
let string_length s =
let n = ref 0
in let _ = String.map (fun c -> let () = n := succ !n in c) s
in !n;;
Halcyon: Just a sec. – yes, this definition passes the unit tests.
Pablito: Thank you.
Let us revisit the section about Thunks by need in Week 11, using references rather than the Lazy library.
To map a thunk into a thunk that behaves as in call by need, we need to memoize its result the first time it is forced, and that is where references come handy.
The key idea of delay_need is to map a given thunk to a new thunk that is defined in the lexical scope of a reference initialized to None. The new thunk consults this reference to determine whether the given thunk has been forced yet, and then takes action:
In OCaml:
let delay_need thunk =
(* delay_need : (unit -> 'a) -> unit -> 'a *)
let memory_cell = ref None
in fun () -> match !memory_cell with
| Some memoized_value ->
memoized_value
| None ->
let memoized_value = force thunk
in (memory_cell := Some memoized_value;
memoized_value);;
For example, let us define the need counterparts of the lumberjack thunk and of the ok thunk that emit "I am a lumberjack" and "I am OK" before returning 10 and 1, respectively:
let lumberjack_thunk =
fun () -> let () = Printf.printf "I am a lumberjack\n"
in 10;;
let ok_thunk =
fun () -> let () = Printf.printf "I am OK\n"
in 1;;
let lumberjack_thunk_need =
delay_need lumberjack_thunk;;
let ok_thunk_need =
delay_need ok_thunk;;
Replaying the scenario from Week 11 still yields the answer 42, but it does so by emitting the messages I am a lumberjack and I am OK once, as it should with call by need:
# let a = force lumberjack_thunk_need
and b = force lumberjack_thunk_need
and c = force lumberjack_thunk_need
and d = force lumberjack_thunk_need
and e = force ok_thunk_need
and f = force ok_thunk_need
in a + b + c + d + e + f;;
I am a lumberjack
I am OK
- : int = 42
#
Created [03 Nov 2022]