Modules in OCaml

So far, all our declarations have been either

  • global, with let ... = ..., type ..., and exception ..., or
  • local, with let ... = ... in ... and with fun ... -> ....

The visibility of these declarations is then regulated by lexical scope.

OCaml also provides a way

  1. to group declarations together in a module and
  2. to control the visibility of these declarations outside this module by declaring a type for each module.

The type of a module is called its signature. It provides an interface between what the module offers (from the point of view of its user) and how it achieves this offer (from the point of view of its implementer). Modules provide language support for writing libraries as well as for implementing abstract data types.

Syntax

So far, we have been operating with the following BNF of OCaml, essentially:

       <type> ::= int
                | bool
                | ...
                | <type> -> <type>
                | <type> * <type> {* <type>}*
                | ...

<declaration> ::= let <name> {: <type>}? = <expression> {and <name> = <expression>}*
                | type <name> = ...

 <expression> ::= <literal>
                | <name>
                | if <expression> then <expression> else <expression>
                | let <name> {: <type>}? = <expression> {and <name> = <expression>}* in <expression>
                | ...

where

  • the notation {...}? means that ... may occur zero times or once.
  • the notation {...}* means that ... may occur zero times or more.

Taking modules into account, this BNF is:

  • extended with one new production for declarations (for modules),
  • extended with one new non-terminal (for structures), and
  • modified with a dot notation for names.

The extension and modification of this BNF reads:

       <type> ::= ...

<declaration> ::= ...
                | module <name> = <structure>

  <structure> ::= struct {<declaration>}* end

 <expression> ::= ...
                | <name>{.<name>}*
                | ...

Analysis:

  • a structure is garnished with declarations placed between the keywords struct and end;
  • a module is declared with the keyword module and given a name that denotes a structure; and
  • a variable declared in a module is accessed by writing the name of the module, then a dot, and then the name of the variable. So for example, if the variable is named x and the module is named M, then the variable is accessed outside the module by writing M.x.

To summarize, modules are named structures that group declarations.

Interlude

Harald: We have already encountered this dot notation, haven’t we?

Alfrothul: Yes we have. Take the list accessors List.hd and List.tl, for example.

Vigfus: And List.rev too.

Mimer: That is because OCaml is implemented in OCaml and lists are implemented in a module named List. Each of the functions hd, tl, and rev is declared in that module.

Alfrothul: And the capitalization in the name “List”?

Mimer: OCaml expects the name of modules to start with a capital letter.

Harald: Just like the name of data-type constructors?

Mimer: Just like the name of data-type constructors.
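
As a small illustration of the dot notation mentioned in this interlude, the following sketch applies List.hd, List.tl, and List.rev to a sample list. The names xs, first, rest, and reversed are only illustrative and do not occur elsewhere in this chapter:

```ocaml
(* A list to experiment with; the name xs is only illustrative. *)
let xs = [1; 2; 3];;

(* Each function is accessed through the module name List and a dot. *)
let first = List.hd xs;;       (* the head of the list *)
let rest = List.tl xs;;        (* the tail of the list *)
let reversed = List.rev xs;;   (* the list in reverse order *)
```

Here first should denote 1, rest should denote [2; 3], and reversed should denote [3; 2; 1].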

A first module

Our first module is named First_v0, and it consists of a structure that groups the declaration of two variables, b and n:

module First_v0 =
  struct
    let b = true
    let n = 33
  end;;

Outside this module, we can refer to, e.g., n by writing First_v0.n. For example, the expression First_v0.n + 11 evaluates to 44.

The type of the structure denoted by First_v0 – its signature – reads as follows:

module First_v0 :
  sig
    val b : bool
    val n : int
  end

This signature conveys that the structure denoted by First_v0 contains two declarations, and it specifies the names of the variables that are declared, together with their type, in the order they were declared.

Exercise 8

Define a module named Zeroth_v0 and consisting of an empty structure, i.e., a structure that groups the declaration of zero variables. What does its signature look like?

Solution for Exercise 8

Let’s ask OCaml:

# module Zeroth_v0 =
    struct
    end;;
module Zeroth_v0 : sig  end
#

The structure and the signature are both empty.

A second module

In a structure, each declaration is processed sequentially, using the same lexical-scope discipline as at the OCaml toplevel. For example, let us extend our first module with a third declaration that refers to an earlier declaration in the structure:

module Second_v0 =
  struct
    let b = true
    let n = 33
    let i = n + 11
  end;;

Once this module is declared, Second_v0.i denotes 44.

The type of the structure denoted by Second_v0 – its signature – reads as follows:

module Second_v0 :
  sig
    val b : bool
    val n : int
    val i : int
  end

This signature conveys that the structure denoted by Second_v0 contains three declarations.
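
The sequential processing of declarations applies to functions as well as to simple values. As a minimal sketch (the module name Scope_demo and its contents are hypothetical, chosen only for illustration), a later declaration can use both a value and a function declared earlier in the same structure:

```ocaml
module Scope_demo =
  struct
    let n = 33
    let double x = 2 * x
    (* Both n and double are in scope here, since they were declared earlier. *)
    let m = double n
  end;;
```

Under these assumptions, Scope_demo.m should denote 66, and the signature of Scope_demo should list n, double, and m in that order.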

A third module

Lexical scope oblige, in a structure, we can declare variables with the same name. These names then shadow the earlier names:

module Third_v0 =
  struct
    let b = true
    let n = 33
    let i = n + 11
    let i = n
    and n = i
    and b' = i < n
  end;;

The signature of this module, i.e., the type of the structure denoted by Third_v0, reads as follows:

module Third_v0 :
  sig
    val b : bool
    val i : int
    val n : int
    val b' : bool
  end

This signature contains four declarations because in the six declarations in Third_v0, the initial declarations of n and i are shadowed by the subsequent ones. The order of the declarations, in this signature, reflects the order in which each (non-shadowed) variable has been declared. Reflecting the parallelism of the last let-declaration in the module (its three definienda are evaluated with the earlier bindings, where n denotes 33 and i denotes 44), Third_v0.i denotes 33, Third_v0.n denotes 44, and Third_v0.b' denotes false.

To wit:

# module Third_v0 =
    struct
      let b = true
      let n = 33
      let i = n + 11
      let i = n
      and n = i
      and b' = i < n
    end;;

module Third_v0 : sig val b : bool val i : int val n : int val b' : bool end
# Third_v0.i;;
- : int = 33
# Third_v0.n;;
- : int = 44
# Third_v0.b;;
- : bool = true
#
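
To isolate the effect of the parallel declaration, here is a sketch that contrasts let ... and ... with two successive let declarations. The module names Parallel_demo and Sequential_demo are hypothetical and only serve this illustration:

```ocaml
module Parallel_demo =
  struct
    let x = 1
    let y = 2
    (* Parallel: both right-hand sides are evaluated with the earlier x and y. *)
    let x = y
    and y = x
  end;;

module Sequential_demo =
  struct
    let x = 1
    let y = 2
    (* Sequential: the second declaration sees the new x. *)
    let x = y
    let y = x
  end;;
```

In Parallel_demo, the two values are swapped: x should denote 2 and y should denote 1. In Sequential_demo, both x and y should denote 2.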

A fourth module, with nesting

Since the declaration of a module is, well, a declaration, we can declare a module in a module:

module Fourth_v0 =
  struct
    let x = 10
    module July =
      struct
        let y = 20
      end
    let z = 30
  end;;

Once this module is declared,

  • Fourth_v0.x denotes 10,
  • Fourth_v0.July.y denotes 20 (note how the nested use of the dot notation reflects the nesting of July in Fourth_v0), and
  • Fourth_v0.z denotes 30.

Its signature reads as follows:

module Fourth_v0 :
  sig
    val x : int
    module July :
      sig
        val y : int
      end
    val z : int
  end

To wit:

# module Fourth_v0 =
    struct
      let x = 10
      module July =
        struct
          let y = 20
        end
      let z = 30
    end;;
module Fourth_v0 : sig val x : int module July : sig val y : int end val z : int end
# Fourth_v0.x;;
- : int = 10
# Fourth_v0.July.y;;
- : int = 20
# Fourth_v0.z;;
- : int = 30
#
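
Lexical scope also holds across nesting. As a sketch (the names Outer_demo and Inner are hypothetical), a nested structure can refer to earlier declarations of the enclosing structure, and the enclosing structure can refer to the nested declarations with the dot notation:

```ocaml
module Outer_demo =
  struct
    let x = 10
    module Inner =
      struct
        (* x from the enclosing structure is in scope here. *)
        let y = x + 10
      end
    (* Inner.y is accessed with the dot notation. *)
    let z = x + Inner.y
  end;;
```

Under these assumptions, Outer_demo.Inner.y should denote 20 and Outer_demo.z should denote 30.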

Exercise 9

Pick a name for a non-existent month (e.g., Juillet) and declare an alternative version of Module Fourth_v0 (e.g., Quatorze) where Module July is replaced by an empty module (like in Exercise 8) named after this non-existent month. What does the signature of your alternative version of Module Fourth_v0 look like?

Solution for Exercise 9

The alternative version reads as follows:

module Quatorze =
  struct
    let x = 10
    module Juillet =
      struct
      end
    let z = 30
  end;;

Its signature reads as follows (courtesy of OCaml):

# module Quatorze =
  struct
    let x = 10
    module Juillet =
      struct
      end
    let z = 30
  end;;
module Quatorze : sig val x : int module Juillet : sig  end val z : int end
#

The signature of Juillet, in the signature of Quatorze, is empty.

Declaring the signature of a module

OCaml also provides a way to name the signature of modules. Taking this feature into account, its BNF reads:

            <type> ::= ...

<type-declaration> ::= val <name> : <type>
                     | module <name> : <signature>

       <signature> ::= <name>
                     | sig {<type-declaration>}* end

       <structure> ::= <name>
                     | struct {<declaration>}* end

     <declaration> ::= ...
                     | module <name> {: <signature>}? = <structure>
                     | module type <name> = <signature>

      <expression> ::= ...
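
As a sketch of how these productions fit together, the signature of a module with a nested module can be named and then used in a module declaration. The names FOURTH and Fourth_v1 are assumptions made for this illustration; they mirror the fourth module above but do not occur elsewhere in this chapter:

```ocaml
(* A named signature that contains a nested module specification. *)
module type FOURTH =
  sig
    val x : int
    module July :
      sig
        val y : int
      end
    val z : int
  end;;

(* A module explicitly declared with this signature. *)
module Fourth_v1 : FOURTH =
  struct
    let x = 10
    module July =
      struct
        let y = 20
      end
    let z = 30
  end;;
```

The nested specification module July : sig ... end is an instance of the production module <name> : <signature>.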

The first module, revisited

Let us name the signature of the first module:

module type FIRST =
  sig
    val b : bool
    val n : int
  end;;

We are now in position to revisit the first module and explicitly declare it with this signature:

module First_v1 : FIRST =
  struct
    let b = true
    let n = 33
  end;;

To wit:

# module First_v1 : FIRST =
  struct
    let b = true
    let n = 33
  end;;
module First_v1 : FIRST
#

Exercise 10

  1. Using the name ZEROTH, name the signature of your empty module in Exercise 8.
  2. Explicitly declare an empty module called Zeroth_v1 with this signature.

Solution for Exercise 10

  1. Here is the named signature:

    module type ZEROTH =
      sig
      end;;
    
  2. Here is the empty module:

    module Zeroth_v1 : ZEROTH =
      struct
      end;;
    

To wit:

# module type ZEROTH = sig end;;
module type ZEROTH = sig  end
# module Zeroth_v1 : ZEROTH = struct end;;
module Zeroth_v1 : ZEROTH
#

The second module, revisited

Let us name the signature of the second module:

module type SECOND =
  sig
    val b : bool
    val n : int
    val i : int
  end;;

We can also revisit the second module and explicitly declare it with this signature... or with the signature FIRST, which is more restrictive since it only mentions b and n:

module Second_v1 : SECOND =
  struct
    let b = true
    let n = 33
    let i = n + 11
  end;;

module Second_v2 : FIRST =
  struct
    let b = true
    let n = 33
    let i = n + 11
  end;;

Henceforth the interface to Second_v2 is restricted to b and n: Second_v2.b and Second_v2.n are defined but Second_v2.i is not, since i has become private to the implementer of Second_v2.

To wit:

# Second_v1.b;;
- : bool = true
# Second_v2.b;;
- : bool = true
# Second_v1.n;;
- : int = 33
# Second_v2.n;;
- : int = 33
# Second_v1.i;;
- : int = 44
# Second_v2.i;;
Characters 0-10:
  Second_v2.i;;
  ^^^^^^^^^^
Error: Unbound value Second_v2.i
#
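
Restricting a signature is a way to keep auxiliary declarations private while still using them inside the structure. Here is a minimal sketch, where the names GREETING, Greeting, prefix, and message are hypothetical:

```ocaml
module type GREETING =
  sig
    val message : string
  end;;

module Greeting : GREETING =
  struct
    (* prefix is not listed in GREETING, so it stays private. *)
    let prefix = "Hello, "
    (* message is listed in GREETING and may use the private prefix. *)
    let message = prefix ^ "world"
  end;;
```

Under these assumptions, Greeting.message should denote "Hello, world", whereas Greeting.prefix should be rejected as an unbound value.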

Exercise 11

What would happen if Second_v2 were to be explicitly declared with the signature ZEROTH?

Solution for Exercise 11

Here is what OCaml says:

# module Second_v3 : ZEROTH = struct let b = true let n = 33 let i = n + 11 end;;
module Second_v3 : ZEROTH
#

Because of its restricted signature, we cannot access any of the names declared in Second_v3:

# Second_v1.b;;
- : bool = true
# Second_v3.b;;
Characters 0-14:
  Second_v3.b;;
  ^^^^^^^^^^^^^^
Error: Unbound value Second_v3.b
#

These names have become private to the implementer of Second_v3.

The third module, revisited

We can finally revisit the third module and declare it with an explicit (unnamed) signature, because OCaml also allows that:

module Third_v1 : sig val b : bool val b' : bool end =
  struct
    let b = true
    let n = 33
    let i = n + 11
    let i = n
    and n = i
    and b' = i < n
  end;;

Henceforth the interface to Third_v1 is restricted to b and b': i and n have become private to its implementer.

Exercise 12

What would happen if Third_v1 were to be explicitly declared with the empty signature, i.e., with sig end?

Solution for Exercise 12

Let’s ask OCaml:

# module Third_v2 : sig end =
    struct
      let b = true
      let n = 33
      let i = n + 11
      let i = n
      and n = i
      and b' = i < n
    end;;
module Third_v2 : sig  end
#

OCaml accepts this declaration. To us users, Third_v2 behaves as the empty module.

Functors

Just as OCaml features functions, i.e., expressions where a variable has been abstracted to denote [the value of] another expression, it also features functors, i.e., modules where a variable has been abstracted to denote another structure. Taking this feature into account, its BNF reads:

            <type> ::= ...

<type-declaration> ::= val <name> : <type>
                     | module <name> : <signature>

       <signature> ::= <name>
                     | sig {<type-declaration>}* end
                     | functor (<name> : <signature>) -> <signature>

       <structure> ::= <name>
                     | struct {<declaration>}* end
                     | functor (<name> : <signature>) -> <structure>
                     | <structure> (<structure>)

     <declaration> ::= let <name> = <expression> {and <name> = <expression>}*
                     | module <name> {: <signature>}? = <structure>
                     | module type <name> = <signature>

      <expression> ::= ...
                     | <name>{.<name>}*
                     | ...

Applying a functor to an argument of a suitable type then yields a module where the variable is instantiated with this argument.

For example, the following functor is parameterized with a module of type FIRST:

module Second_maker_v0 =
  functor (M : FIRST) ->
  struct
    let b = M.b
    let n = M.n
    let i = n + 11
  end;;

Applying Second_maker_v0 to a module of type FIRST yields a module that has the same signature as that of Module Second_v0. In this module, b and n respectively denote the values of b and n in the argument of type FIRST.
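
To see the parameterization at work, here is a self-contained sketch that applies this functor to two different anonymous structures. The signature FIRST and the functor Second_maker_v0 are repeated from above; the module names From_zero and From_hundred are hypothetical:

```ocaml
module type FIRST =
  sig
    val b : bool
    val n : int
  end;;

module Second_maker_v0 =
  functor (M : FIRST) ->
  struct
    let b = M.b
    let n = M.n
    let i = n + 11
  end;;

(* Two applications of the same functor to different arguments. *)
module From_zero = Second_maker_v0 (struct let b = false let n = 0 end);;

module From_hundred = Second_maker_v0 (struct let b = true let n = 100 end);;
```

Under these assumptions, From_zero.i should denote 11 and From_hundred.i should denote 111: each resulting module reflects the argument the functor was applied to.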

Interlude

Harald: You remember the syntactic sugar for function declarations in OCaml?

Alfrothul: Beg pardon?

Harald: You know, for example the identity function or the successor function:

let identity = fun x -> x;;

let successor = fun i -> i + 1;;

Alfrothul: You mean syntactic sugar like this, so that we don’t write the keyword fun:

let identity x =
  x;;

let successor i
  = i + 1;;

Harald: Yes. Well, there is a similar syntactic sugar for functors. For example, Second_maker_v0 can also be declared without writing the keyword functor:

module Second_maker_v1 (M : FIRST) =
  struct
    let b = M.b
    let n = M.n
    let i = n + 11
  end;;

Alfrothul: Ah, OK.

The second module, re-revisited

Let us apply Functor Second_maker_v0 to Module First_v1, which is suitably typed:

# module Second_v4 = Second_maker_v0 (First_v1);;
module Second_v4 : sig val b : bool val n : int val i : int end
#

As one can see, the resulting module has the same signature as that of Second_v0. Therefore we can declare it as having this explicit signature:

# module Second_v5 : SECOND = Second_maker_v0 (First_v1);;
module Second_v5 : SECOND
#

We can also declare it with the same restriction as before, should we so wish:

# module Second_v6 : FIRST = Second_maker_v0 (First_v1);;
module Second_v6 : FIRST
#

Exercise 13

What would happen if the result of applying Second_maker_v0 to First_v1 were to be explicitly declared

  1. with Signature ZEROTH?
  2. with the empty signature?

Solution for Exercise 13

Let’s ask OCaml:

# module Second_v7 : ZEROTH = Second_maker_v0 (First_v1);;
module Second_v7 : ZEROTH
# module Second_v8 : sig end = Second_maker_v0 (First_v1);;
module Second_v8 : sig  end
#

OCaml accepts these two declarations. To us users, these modules behave as the empty module.

Functors that return other functors

A functor can also return another functor:

module Compound_maker_maker =
  functor (X : sig val x : int end) ->
  functor (Y : sig val y : int end) ->
  struct
    let x = X.x
    let y = Y.y
  end;;

We can then instantiate it with one structure:

module Compound_maker =
  Compound_maker_maker (struct let x = 10 end);;

The resulting functor can then be instantiated with another structure:

module Compound =
  Compound_maker (struct let y = 20 end);;

Alternatively, the initial functor can be instantiated with both structures at once:

module Compound_alt =
  Compound_maker_maker (struct let x = 10 end) (struct let y = 20 end);;
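
One use of instantiating such a functor in two steps is to reuse the intermediate functor. The following self-contained sketch repeats Compound_maker_maker from above; the module names With_x_10, A, and B are hypothetical:

```ocaml
module Compound_maker_maker =
  functor (X : sig val x : int end) ->
  functor (Y : sig val y : int end) ->
  struct
    let x = X.x
    let y = Y.y
  end;;

(* The first argument is fixed once... *)
module With_x_10 =
  Compound_maker_maker (struct let x = 10 end);;

(* ...and the resulting functor is applied to two different second arguments. *)
module A = With_x_10 (struct let y = 20 end);;

module B = With_x_10 (struct let y = 30 end);;
```

Under these assumptions, A.x and B.x should both denote 10, while A.y should denote 20 and B.y should denote 30.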

Interlude

Alfrothul (soberly): The syntactic sugar for functor declarations scales.

Harald: Beg pardon?

Alfrothul: For example, Compound_maker_maker can be declared without writing the keyword functor:

module Compound_maker_maker' (X : sig val x : int end) (Y : sig val y : int end) =
  struct
    let x = X.x
    let y = Y.y
  end;;

Harald: Ah, OK.

Functors that are applied to other functors

Conversely to having functors return other functors, we can also apply functors to other functors:

module Full_of_emptiness (F : functor (M : sig end) -> sig end) =
  F (struct end);;
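
Since the result of Full_of_emptiness has an empty signature, nothing can be observed about it. As a sketch of the same idea with observable results, here is a functor that is parameterized by another functor and that applies it to a fixed structure. The names Apply_to_ten, Double, and Twenty are hypothetical:

```ocaml
(* A functor whose parameter F is itself a functor. *)
module Apply_to_ten (F : functor (X : sig val x : int end) -> sig val y : int end) =
  F (struct let x = 10 end);;

(* A functor that has the type expected for F. *)
module Double (X : sig val x : int end) =
  struct
    let y = 2 * X.x
  end;;

(* Applying a functor to another functor. *)
module Twenty = Apply_to_ten (Double);;
```

Under these assumptions, Twenty.y should denote 20.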

Postlude

Harald: This module language is like a functional language, isn’t it?

Alfrothul: Yes, it looks like it: structures are named, just like [the values of] expressions are named. Also, they can be abstracted into a functor through a variable that stands for another structure.

Harald: Right: functors are to structures what functions are to expressions, since an expression can be abstracted into a function through a variable that stands for [the value of] another expression.

Alfrothul: And functors can both return and be applied to other functors.

Harald: So, a functional language of modules?

Alfrothul: A functional language of modules.

Harald: Do you think a functor could be recursive?

Alfrothul: Time out, Harald! Time out.

Version

Created [02 Apr 2019]