Polymorphic types

The goal of this lecture note is to introduce polymorphic types in OCaml.

Reminder

OCaml is a statically typed language with (so far) the following grammar of types and expressions:

<type> ::= int | bool | char | string | <type> * ... * <type> | unit | <type> -> <type>
<type_environment> ::= . | (<name> : <type>), <type_environment>
<expression> ::= <integer> | <boolean> | <character> | <string> | if <expression> then <expression> else <expression> | (<expression>, ..., <expression>) | <name> | fun <formal> -> <expression> | <expression> <expression>
<integer> ::= 0 | 1 | -1 | 2 | -2 | ...
<boolean> ::= true | false
<character> ::= one character delimited by two single quotes
<string> ::= a series of characters delimited by two double quotes
<name> ::= a concatenation of characters starting with a lowercase character

The corresponding typing rules for lambda-abstraction and applications read as follows:

FUNG, (x1 : t1) |- e2 : t2
G |- fun x1 -> e2 : t1 -> t2
APPG |- e0 : t1 -> t2G |- e1 : t1
G |- e0 e1 : t2

In words:

  • in any type environment G, the expression fun x1 -> e2 has type t1 -> t2 whenever in that environment extended with the binding of x1 to t1, e2 has type t2; and
  • in any type environment G where the expression e0 has type t1 -> t2 and where e1 has type t1, the expression e0 e1 has type t2: applying a function of type t1 -> t2 to an argument of type t1 yields a result of type t2.

Monomorphic types

Assuming typing rules such as

NOTG |- e : bool
G |- not e : bool
ADDG |- e1 : intG |- e2 : int
G |- e1 + e2 : int

terms such as fun x -> not x and fun x -> fun y -> x + y are monomorphically typed, i.e., their type is completely determined and their context of use is completely fixed. The former is a function expecting a Boolean and returning a Boolean, and the second is a function that expects an integer and returns a function that expects an integer and returns an integer. To wit:

# fun x -> not x;;
- : bool -> bool = <fun>
# fun x -> fun y -> x + y;;
- : int -> int -> int = <fun>
#

These types are composite (they use the type constructor ->) but their ground instances of <type> are one of the atomic types, i.e., int, bool, char, string, or unit:

<type> -->
<type> -> <type> -->    (* the left instance of <type> is the ground type bool *)
bool -> <type> -->      (* the remaining instance of <type> is the ground type bool *)
bool -> bool

<type> -->
<type> -> <type> -->           (* the left instance of <type> is the ground type int *)
int -> <type> -->
int -> <type> -> <type> -->    (* the left instance of <type> is the ground type int *)
int -> int -> <type> -->       (* the remaining instance of <type> is the ground type int *)
int -> int -> int

Given each expression, OCaml has inferred its type. Let us verify that each of these expressions has the type inferred by OCaml, i.e., let us type check these expressions by contructing the corresponding proof tree:

  • We start at the bottom of the proof tree, stating that in any type environment G, the expression fun x -> not x has type bool -> bool:

    G |- fun x -> not x : bool -> bool
    

    There is only one typing rule we can apply, namely FUN, where G is instantiated with G, x is instantiated with x, e is instantiated with not x, t1 is instantiated with bool, and t2 is instantiated with bool:

        (x : bool), G |- not x : bool
    FUN ----------------------------------
        G |- fun x -> not x : bool -> bool
    

    There is only one typing rule we can apply, namely NOT, where G is instantiated with G and e is instantiated with x:

        (x : bool), G |- x : bool
    NOT -------------------------
        (x : bool), G |- not x : bool
    FUN ----------------------------------
        G |- fun x -> not x : bool -> bool
    

    There is only one typing rule we can apply, namely LOOKUP_FOUND, where x is instantiated with x, t is instantiated with bool, G is instantiated with G, x is instantiated with x, and t is instantiated with bool:

    LOOKUP_FOUND -------------------------
                 (x : bool), G |- x : bool
             NOT -------------------------
                 (x : bool), G |- not x : bool
             FUN ----------------------------------
                 G |- fun x -> not x : bool -> bool
    

    The proof tree is complete, which proves that in any type environment G, the expression fun x -> not x has type bool -> bool.

  • We start at the bottom of the proof tree, stating that in any type environment G, the expression fun x -> fun y -> x + y has type int -> int -> int:

    G |- fun x -> fun y -> x + y : int -> int -> int
    

    There is only one typing rule we can apply, namely FUN, where G is instantiated with the current type environment, i.e., G, x is instantiated with the formal parameter of the current function abstraction, i.e., x, e is instantiated with the body of the current function abstraction, i.e., fun y -> x + y, t1 is instantiated with the domain of the current type, i.e., int, and t2 is instantiated with the codomain of the current type, i.e., int -> int (since function types implicitly associate to the right):

        (x : int), G |- fun y -> x + y : int -> int
    FUN ------------------------------------------------
        G |- fun x -> fun y -> x + y : int -> int -> int
    

    Again, there is only one typing rule we can apply, namely FUN, where G is instantiated with the current type environment, i.e., G, x is instantiated with the formal parameter of the current function abstraction, i.e., y, e is instantiated with the body of the current function abstraction, i.e., x + y, t1 is instantiated with the domain of the current type, i.e., int, and t2 is instantiated with the codomain of the current type, i.e., int:

        (y : int), (x : int), G |- x + y : int
    FUN --------------------------------------
        (x : int), G |- fun y -> x + y : int -> int
    FUN ------------------------------------------------
        G |- fun x -> fun y -> x + y : int -> int -> int
    

    There is only one typing rule we can apply, namely ADD, where G is instantiated with the current type environment, i.e., (y : int), (x : int), G, e1 is instantiated with the first operand in the current expression, i.e., x, and e2 is instantiated with the second operand in the current expression, i.e., y:

        (y : int), (x : int), G |- x : int    (y : int), (x : int), G |- y : int
    ADD ------------------------------------------------------------------------
        (y : int), (x : int), G |- x + y : int
    FUN --------------------------------------
        (x : int), G |- fun y -> x + y : int -> int
    FUN ------------------------------------------------
        G |- fun x -> fun y -> x + y : int -> int -> int
    

    Considering the left branch, there is only one typing rule we can apply, namely LOOKUP_NOT_FOUND_YET since x and y are distinct, where x' is instantiated with the first identifier in the current environment, i.e., y, t' is instantiated with the type of the first identifier in the current environment, i.e., int, G is instantiated with the rest of the type environment, i.e., (x : int), G, x is instantiated with the identifier to look up, i.e., x, and t is instantiated with the current type, i.e., int:

                         (x : int), G |- x : int
    LOOKUP_NOT_FOUND_YET ----------------------------------
                         (y : int), (x : int), G |- x : int    (y : int), (x : int), G |- y : int
                     ADD ------------------------------------------------------------------------
                         (y : int), (x : int), G |- x + y : int
                     FUN --------------------------------------
                         (x : int), G |- fun y -> x + y : int -> int
                     FUN ------------------------------------------------
                         G |- fun x -> fun y -> x + y : int -> int -> int
    

    Still considering the left branch, there is only one typing rule we can apply, namely LOOKUP_FOUND since the variable whose binding starts the environment is the one we are looking for. So LOOKUP_FOUND is applied, where x is instantiated with the first identifier in the current environment, i.e., x, t is instantiated with the type of the first identifier in the current environment, i.e., int, G is instantiated with the rest of the type environment, i.e., G, x is instantiated with the identifier to look up, i.e., x, and t is instantiated with the current type, i.e., int:

            LOOKUP_FOUND -----------------------
                         (x : int), G |- x : int
    LOOKUP_NOT_FOUND_YET ----------------------------------
                         (y : int), (x : int), G |- x : int    (y : int), (x : int), G |- y : int
                     ADD ------------------------------------------------------------------------
                         (y : int), (x : int), G |- x + y : int
                     FUN --------------------------------------
                         (x : int), G |- fun y -> x + y : int -> int
                     FUN ------------------------------------------------
                         G |- fun x -> fun y -> x + y : int -> int -> int
    

    Considering the right branch, there is only one typing rule we can apply, namely LOOKUP_FOUND, where x is instantiated with the first identifier in the current environment, i.e., y, t is instantiated with the type of the first identifier in the current environment, i.e., int, G is instantiated with the rest of the type environment, i.e., (x : int), G, x is instantiated with the identifier to look up, i.e., y, and t is instantiated with the current type, i.e., int:

            LOOKUP_FOUND -----------------------
                         (x : int), G |- x : int
    LOOKUP_NOT_FOUND_YET ----------------------------------    LOOKUP_FOUND ----------------------------------
                         (y : int), (x : int), G |- x : int                 (y : int), (x : int), G |- y : int
                     ADD -------------------------------------------------------------------------------------
                         (y : int), (x : int), G |- x + y : int
                     FUN --------------------------------------
                         (x : int), G |- fun y -> x + y : int -> int
                     FUN ------------------------------------------------
                         G |- fun x -> fun y -> x + y : int -> int -> int
    

    The proof tree is complete, which proves that in any type environment G, the expression fun x -> fun y -> x + y has type int -> int -> int.

Polymorphic types

Some expressions, however, have a type that is not completely determined, and to this end, the grammar of types is extended with type variables:

<type> ::= int | bool | char | string | <type> * ... * <type> | unit | <type> -> <type> | <type_variable>
<type_variable> ::= 'a | 'b | 'c | ...

(N.B.: 'a is pronounced “alpha”, 'b is pronounced “beta”, etc.)

A type containing variables is said to be polymorphic: it adapts to its context of use. For example, the identity function is polymorphic:

# fun x -> x;;
- : 'a -> 'a = <fun>
#

The identity function can be variously applied to values of different types:

# (fun x -> x) 33;;
- : int = 33
# (fun x -> x) true;;
- : bool = true
# (fun x -> x) 'a';;
- : char = 'a'
# (fun x -> x) "abc";;
- : string = "abc"
# (fun x -> x) ((33, true), ('a', "abc"));;
- : (int * bool) * (char * string) = ((33, true), ('a', "abc"))
# let identity = fun x -> x;;
val identity : 'a -> 'a = <fun>
# (identity 33, identity "abc");;
- : int * string = (33, "abc")
#

As pointed out in the section about Ground equality, OCaml’s equality predicate is polymorphic:

# (=);;
- : 'a -> 'a -> bool = <fun>
# 1 = 1;;
- : bool = true
# 1 = 2;;
- : bool = false
# true = true;;
- : bool = true
# 'a' = 'b';;
- : bool = false
# "hello" = "world";;
- : bool = false
#

Not only that, but this equality predicate is structural and so it scales to, e.g., tuples:

# (1, ('2', "3")) = (1, ('2', "3"));;
- : bool = true
#

It however does not scale to functions: functions cannot be tested for structural equality. The only thing that can be done with a function is to apply it, and so we can only test their extensional equality: two functions are extensionally equal whenever applying them to the same input yields the same output, for any input.

Interlude about the extensional equality of functions

Vigfus: And there we go again with self-reference.

Halcyon: Self-reference?

Vigfus: Well, yes, we are defining the equality of functions based on the equality of their arguments and the equality of their result.

Halcyon: That indeed looks like a circular argument.

Loki: The problem is only apparent. If two objects are equal, then they are the same as each other. Hence the self-reference.

Brynja: Loki, you must love “Smoke on the water”.

Loki: Well, its opening guitar riff is awesome.

Brynja: But the argument is not circular.

Halcyon: Not circular?

Brynja: The equality at type t1 -> t2 is defined in terms of the equality at type t1 and of the equality at type t2, which are smaller types. And since types are constructed inductively, the argument may repeat if t1 and t2 are compound types (for example, two values of type t3 * t4, i.e., two pairs, are equal whenever their first projections (of type t3) are equal and their second projections (of type t4) are equal), but eventually the argument will stop at a ground type, e.g., int, bool, or unit.

Halcyon: OK, not circular.

Vigfus: But not computable either.

Halcyon: Not computable?

Vigfus: Testing whether two functions are extensionally equal is just that – testing – and we know that testing is incomplete in general.

Mimer: You got it, Vigfus.

Interlude about functions returning other functions

Harald: So what is the type of fun x1 -> fun x2 -> x1?

Alfrothul: Easy peasy, and it is a polymorphic function:

# fun x -> fun y -> x;;
- : 'a -> 'b -> 'a = <fun>
#

Brynja: That is easy to prove too, witness the following typing-derivation tree:

        LOOKUP_FOUND ---------------------
                     (x : 'a), G |- x : 'a
LOOKUP_NOT_FOUND_YET -------------------------------
                     (y : 'b), (x : 'a), G |- x : 'a
                 FUN -------------------------------
                     (x : 'a), G |- fun y -> x : 'b -> 'a
                 FUN -----------------------------------------
                     G |- fun x -> fun y -> x : 'a -> 'b -> 'a

Harald: Hum. Is there an expression that has type 'a -> 'b -> 'b?

Alfrothul: I guess we could try constructing a typing-derivation tree?

Brynja: Let’s start with the mystery expression ?:

G |- ? : 'a -> 'b -> 'b

Alfrothul: Well, there is only one typing rule we can apply, namely FUN, and that forces the mystery expression ? to be fun x -> ??, where ?? is a new mystery expression:

    (x : 'a), G |- ?? : 'b -> 'b
FUN ---------------------------------
    G |- fun x -> ?? : 'a -> 'b -> 'b

Harald: Again, there is still only one typing rule we can apply, namely FUN, and that forces the new mystery expression ?? to be fun y -> ???, where ??? is another mystery expression:

    (y : 'b), (x : 'a), G |- ??? : 'b
FUN --------------------------------------
    (x : 'a), G |- fun y -> ??? : 'b -> 'b
FUN -------------------------------------------
    G |- fun x -> fun y -> ??? : 'a -> 'b -> 'b

Brynja: LOOKUP_FOUND comes to the rescue, which forces the other mystery expression ??? to be y:

LOOKUP_FOUND -------------------------------
             (y : 'b), (x : 'a), G |- y : 'b
         FUN ------------------------------------
             (x : 'a), G |- fun y -> y : 'b -> 'b
         FUN -----------------------------------------
             G |- fun x -> fun y -> y : 'a -> 'b -> 'b

Harald: The proof tree is complete, which proves that in any type environment G, the expression fun x -> fun y -> y has type 'a -> 'b -> 'b.

Alfrothul: And so fun x -> fun y -> y is your mystery expression.

Brynja: Is it unique?

Harald: I guess not:

# fun x -> fun y -> (fun z -> z) x;;
- : 'a -> 'b -> 'a = <fun>
#

Exercise 1

  1. Exhibit an expression of type 'a -> 'b -> 'a * 'b. Justify your answer.
  2. Exhibit an expression of type 'a -> 'b -> 'b * 'a. Justify your answer.
  3. Exhibit two distinct expressions of type bool -> 'a -> 'a -> 'a. Justify your answer.

Interlude about functions applied to other functions

Harald: Remember our first unit-test function?

Alfrothul: Yes... To simplify, it was something like:

let test_successor candidate =
  (candidate 0 = 1) && (candidate 1 = 2);;

Harald: Well, this function is applied to another function.

Alfrothul: That is indeed true:

# test_successor;;
- : (int -> int) -> bool = <fun>
#

Brynja: Reading the function type as logical implication, I think we should be able to write a function whose type is the modus ponens.

Harald: You mean something like ('a -> 'b) -> 'a -> 'b?

Brynja: Yup. The bottom part of the typing-derivation tree is easy enough to draw:

    (x : 'a), (f : 'a -> 'b), G |- ? : 'b
FUN ------------------------------------------
    (f : 'a -> 'b), G |- fun x -> ? : 'a -> 'b
FUN -------------------------------------------------
    G |- fun f -> fun x -> ? : ('a -> 'b) -> 'a -> 'b

Alfrothul: To obtain an expression of type 'b, we could apply f to something that has type 'a.

Harald: x comes handy:

    (x : 'a), (f : 'a -> 'b), G |- f : 'a -> 'b    (x : 'a), (f : 'a -> 'b), G |- x : 'a
APP ------------------------------------------------------------------------------------
    (x : 'a), (f : 'a -> 'b), G |- f x : 'b
FUN --------------------------------------------
    (f : 'a -> 'b), G |- fun x -> f x : 'a -> 'b
FUN ---------------------------------------------------
    G |- fun f -> fun x -> f x : ('a -> 'b) -> 'a -> 'b

Brynja: And now all we need to do is to look up f and x in the type environment:

        LOOKUP_FOUND ---------------------------------
                     (f : 'a -> 'b), G |- f : 'a -> 'b
LOOKUP_NOT_FOUND_YET -------------------------------------------    LOOKUP_FOUND -------------------------------------
                     (x : 'a), (f : 'a -> 'b), G |- f : 'a -> 'b                 (x : 'a), (f : 'a -> 'b), G |- x : 'a
    APP --------------------------------------------------------------------------------------------------------------
        (x : 'a), (f : 'a -> 'b), G |- f x : 'b
    FUN --------------------------------------------
        (f : 'a -> 'b), G |- fun x -> f x : 'a -> 'b
    FUN ---------------------------------------------------
        G |- fun f -> fun x -> f x : ('a -> 'b) -> 'a -> 'b

Harald: This expression is not unique, we can apply the identity function in it as much as we like.

Brynja: So the function whose type is the modus ponens expects a function and returns a function that expects an argument and applies this function to this argument.

Alfrothul: So it’s an apply function, right? To give it a name:

let apply f x =
 (* apply : ('a -> 'b) -> 'a -> 'b *)
  f x;;

Mimer: Right. This apply function is the computational content of the modus ponens.

Exercise 2

  1. Exhibit an expression of type 'a -> 'a * 'a. Justify your answer.
  2. Exhibit an expression of type (unit -> 'a) -> 'a. Justify your answer.
  3. Exhibit an expression of type ('a -> 'b -> 'c) -> 'a -> 'b -> 'c. Justify your answer.
  4. Exhibit an expression of type ('a -> 'b -> 'c) -> 'b -> 'a -> 'c. Justify your answer.
  5. Exhibit an expression of type ('a -> 'b) -> 'a -> 'a -> 'b * 'b. Justify your answer.
  6. Exhibit an expression of type ('a * 'b -> 'c) -> 'a -> 'b -> 'c. Justify your answer.
  7. Exhibit an expression of type ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c. Justify your answer.

Reading the function type as logical implication and pairing as logical conjunction, do these types have a logical content?

Postlude

Halcyon: I remember my first encounter with polymorphism.

Vigfus: Pray tell.

Halcyon: My Dad asked me how I was doing at school and I told him I was struggling with additions.

Vigfus: OK?

Halcyon: So he tried to help and asked me that if I had 2 oranges and he gave me one more, how many oranges would I have.

Vigfus: And?

Halcyon: And I told him I didn’t know, because at school we had only learned how to add apples.

Vigfus (suspicious): Practicing Dad jokes, are we.

Version

Added the postlude [16 Mar 2020]

Added the interlude about the extensional equality of functions [07 Feb 2020]

Created [06 Feb 2020]