Mutable data in OCaml

The goal of this lecture note is to introduce mutable data in OCaml:

<type> ::= ... | <type> ref
<expression> ::= ... | ref <expression> | ! <expression> | <expression> := <expression>

References in OCaml

Fun fact: as in mathematics and unlike in imperative languages, variables do not vary in OCaml. There are, however, cases where we need a degree of mutability in the course of a computation, and that is why OCaml also features a type of mutable values: references. A reference is

  • constructed with ref,
  • dereferenced with !, and
  • mutated using the infix operator :=.

Like that of options and lists, the type of a reference is polymorphic. Its name is the same as its constructor, namely ref:

REFG |- e : t
G |- ref e : t ref
DEREFG |- e : t ref
G |- !e : t
MUTATEG |- e1 : t refG |- e2 : t
G |- e1 := e2 : unit

In words:

  • in any type environment G, the expression ref e has type t ref whenever in that environment, e has type t;
  • in any type environment G, the expression !e has type t whenever in that environment, e has type t ref; and
  • in any type environment G, the expression e1 := e2 has type unit whenever in that environment, e1 has type t ref and e2 has type t.

Constructing mutable values

First of all, let us look at the ref constructor:

# ref;;
- : 'a -> 'a ref = <fun>
#

We can see that it is a unary function that, given a value, returns a reference to this value:

  • Let us interactively construct the reference to an integer:

    # ref 33;;
    - : int ref = {contents = 33}
    #
    
  • Let us interactively construct the reference to a Boolean:

    # ref true;;
    - : bool ref = {contents = true}
    #
    
  • Let us interactively construct the reference to a string:

    # ref "hello";;
    - : string ref = {contents = "hello"}
    #
    

We can construct references to any value:

  • for example, pairs of integers:

    # ref (1, 10);;
    - : (int * int) ref = {contents = (1, 10)}
    #
    
  • for example, pairs of other ground values than integers:

    # ref (33, true);;
    - : (int * bool) ref = {contents = (33, true)}
    #
    
  • for example, functions:

    # ref (fun i -> i + 1);;
    - : (int -> int) ref = {contents = <fun>}
    #
    
  • for example, references:

    # ref (ref "hello");;
    - : string ref ref = {contents = {contents = "hello"}}
    #
    

Since references are values, they can occur in other values:

  • for example, here is a pair of references:

    # (ref false, ref 100);;
    - : bool ref * int ref = ({contents = false}, {contents = 100})
    #
    
  • for example, here is a reference to a pair of references:

    # ref (ref false, ref 100);;
    - : (bool ref * int ref) ref =
    {contents = ({contents = false}, {contents = 100})}
    #
    

References can occur in type declarations:

  • for example, here is the type of binary trees with mutable leaves:

    type 'a binary_tree_with_mutable_leaves =
      | Leaf of 'a ref
      | Node of 'a binary_tree_with_mutable_leaves * 'a binary_tree_with_mutable_leaves;;
    
  • for example, here is the type of binary trees with mutable nodes:

    type 'a binary_tree_with_mutable_nodes =
      | Leaf of 'a
      | Node of 'a binary_tree_with_mutable_nodes ref * 'a binary_tree_with_mutable_nodes ref;;
    

Dereferencing mutable values

Given a reference, the ! unary operator yields the value that is referred to.

  • For example, given the reference to an integer, we can obtain this integer by applying ! to it:

    # let r = ref 33 in !r;;
    - : int = 33
    #
    
  • For example, given the reference to the reference to an integer, we can obtain this integer by applying ! to it twice in a row:

    # let r = ref (ref 33) in !(!r);;
    - : int = 33
    #
    
  • For example, here is a function that given a pair of references, returns the pair of the values that are referred to in the given pair:

    # fun (r1, r2) -> (!r1, !r2);;
    - : 'a ref * 'b ref -> 'a * 'b = <fun>
    #
    

Mutating references

The infix operator :=, given a reference to a value of type t and a new value of type t, modifies the reference so that it refers to the new value. The result is of type unit.

For example, in the following expression, a reference to an integer, 1, is locally created and named r:

let r = ref 1              (* <-- creation of a reference *)
in let x = !r              (* <-- dereference *)
   in let () = r := 10 in  (* <-- mutation of the reference *)
      let y = !r           (* <-- dereference *)
      in x + y;;

In the scope of the declaration of r,

  1. the variable x is locally declared to denote the value that is currently referred to by the denotation of r (i.e., 1);
  2. the reference is mutated with another integer (namely 10);
  3. the variable y is locally declared to denote the value that is currently referred to by the denotation of r (i.e., 10); and
  4. the result of adding the denotations of x (i.e., 1) and y (i.e., 10) is returned.

To wit:

# let r = ref 1
  in let x = !r
     in let () = r := 10 in
        let y = !r
        in x + y;;
- : int = 11
#

More atomically, we can

  1. declare a reference:

    # let r = ref 1;;
    val r : int ref = {contents = 1}
    #
    
  2. mutate this reference:

    # r := 10;;
    - : unit = ()
    #
    
  3. dereference this reference:

    # !r;;
    - : int = 10
    #
    

The imperative simulation of String.mapi in terms of String.map

Back in Week 05 (the General interlude (continued) where the Stringy module was introduced), this Stringy module featured an imperative simulation of String.mapi in terms of String.map:

let string_mapi f s =
  let i = ref ~-1
  in String.map (fun c ->
                  let () = i := succ !i in
                  f !i c)
                s

Its rationale was that since String.map applies its first argument to each character of its second argument from left to right, we can locally define a reference to an index, and imperatively increment this index each time the first argument is applied. The first argument of string_mapi is then successively applied

  1. to 0 and the first character of the given string,
  2. to 1 and the second character of the given string,
  3. to 2 and the third character of the given string,
  4. etc.

Version

Fine-tuned the narrative of the section about string_mapi [05 Apr 2020]

Added the section about string_mapi [04 Apr 2020]

Created [24 Mar 2020]