The map function over polymorphic lists

The goal of this lecture note is to introduce the polymorphic map function that given a function and a list, applies this function to each of the elements of this list and collects the results in a new list. We had seen an instance of the map function for strings in Week 05, String.map, and in the mini-project about strings, string_map.

Anakin (opening really big eyes): Is it possible to use this expressive power?
The Supreme Chancellor: Not with a for loop.

Resources

Warmup #1: mapping the successor function

Our task is to implement an OCaml function that, given a list of integers, maps it to a list of integers containing the successors of all the elements of the given list, in the same order:

let test_list_map_succ candidate =
  (candidate [2; 1; 0] = [3; 2; 1])
  (* etc. *);;

Clearly this function should be specified by induction over its argument ns:

  • base case (ns = [])

    the empty list containing no elements, this function should map it to the empty list

  • induction step (ns = n :: ns’):

    for any list element n and for any list ns’, assuming that the function maps ns’ to the corresponding list of successors, which is the induction hypothesis, it should map (n :: ns’) to the list that starts with (succ n) and continues with this corresponding list of successors

Based on this specification, we can flesh out the unit-test function:

let test_list_map_succ candidate =
      (* an instance of the base case: *)
  let bc = (candidate [] = [])
      (* a few handpicked lists: *)
  and b1 = (candidate [0] = [1])
  and b2 = (candidate [1; 0] = [2; 1])
  and b3 = (candidate [2; 1; 0] = [3; 2; 1])
  and b4 = (candidate [3; 2; 1; 0] = [4; 3; 2; 1])
      (* an instance of the induction step: *)
  and is = (let n = 4
            and ns' = [3; 2; 1; 0]
            in let ih = candidate ns'
               in candidate (n :: ns') = succ n :: ih)
  (* etc. *)
  in bc && b1 && b2 && b3 && b4 && is;;

Since list_map_succ is specified inductively, it is to be programmed as a structurally recursive function. Here is its skeleton:

let list_map_succ ns_given =
  let rec visit ns =
    match ns with
    | [] ->
       ...
    | n :: ns' ->
       let ih = visit ns'
       in ...
  in visit ns_given;;

Rationale: the skeleton of visit follows the inductive structure of lists, i.e., it has a base case (the argument is the empty list) and an induction step (the argument is not the empty list).

Now all we need is to fill the blanks:

  • in the base case, the result is [], and
  • in the induction step, the result is obtained by cons’ing the successor of n to the result of the recursive call.

To wit:

let list_map_succ_v0 ns_given =
  let rec visit ns =
    match ns with
    | [] ->
       []
    | n :: ns' ->
       let ih = visit ns'
       in succ n :: ih
  in visit ns_given;;

This implementation is structurally recursive (since it follows the inductive structure of its input) and it passes the unit test:

# test_list_map_succ list_map_succ_v0;;
- : bool = true
#

In the induction step, the let-expression that declares ih, in the definition of visit, is cosmetic and so we can unfold it:

let list_map_succ_v1 ns_given =
  let rec visit ns =
    match ns with
    | [] ->
       []
    | n :: ns' ->
       succ n :: visit ns'
  in visit ns_given;;

This simplified implementation is still structurally recursive and it passes the unit test:

# test_list_map_succ list_map_succ_v1;;
- : bool = true
#

Warmup #2: mapping the predecessor function

Our task is to implement an OCaml function that, given a list of integers, maps it to a list of integers containing the predecessors of all the elements of the given list, in the same order:

let test_list_map_pred candidate =
  let b0 = (candidate [3; 2; 1] = [2; 1; 0])
  and b1 = (candidate [2; 1; 0] = [1; 0; ~-1])
  (* etc. *)
  in b0 && b1;;

(See the resource file.)

Warmup #3: mapping the square function

Our task is to implement an OCaml function that, given a list of integers, maps it to a list of integers containing the squares of all the elements of the given list, in the same order:

let test_list_map_square candidate =
  (candidate [3; 2; 1] = [9; 4; 1])
  (* etc. *);;

(See the resource file.)

Warmup #4: mapping the list-singleton function

Our task is to implement an OCaml function that, given a list of integers, maps it to a list of integers containing the singleton list of all the elements of the given list, in the same order:

let test_list_map_singleton_int candidate =
  (candidate [3; 2; 1] = [[3]; [2]; [1]])
  (* etc. *);;

(See the resource file.)

Generalization: the map function for lists

Generalizing, let us write a generic version of the functions above, list_map, that is parameterized by the function to map. This generic function is still specified by induction over its argument xs, given a function f:

  • base case (ns = []):

    the empty list containing no elements, this function should map it to the empty list

  • induction step (xs = x :: xs’):

    for any given list element x and for any given list xs’, if the function maps xs’ to the corresponding list of results, which is the induction hypothesis, it should map (x :: xs’) to the list that starts with the result of applying f to x and that continues with this corresponding list of results

Since list_map is specified inductively, it is to be programmed as a structurally recursive function. Here is its skeleton:

let list_map f xs_given =
  let rec visit xs =
    match xs with
    | [] ->
       ...
    | x :: xs' ->
       let ih = visit xs'
       in ...
  in visit xs_given;;

Rationale: the skeleton of visit follows the inductive structure of lists, i.e., it has a base case (the argument is the empty list) and an induction step (the argument is not the empty list).

Now all we need is to fill the blanks:

  • in the base case, the result is [], and
  • in the induction step, the result is obtained by cons’ing the result of applying f of n to the result of the recursive call.

To wit:

let list_map f xs_given =
 (* list_map : ('a -> 'b) -> 'a list -> 'b list *)
  let rec visit xs =
    match xs with
    | [] ->
       []
    | x :: xs' ->
       let ih = visit xs'
       in f x :: ih
  in visit xs_given;;

This implementation is structurally recursive (since it follows the inductive structure of its input) and we can instantiate it to implement each of the warmup examples above:

let list_map_succ_v2 ns =
  list_map succ ns;;

let list_map_pred_v2 ns =
  list_map pred ns;;

let list_map_square_v2 ns =
  list_map (fun n -> n * n) ns;;

let list_map_singleton_v2 ns =
  list_map (fun n -> [n]) ns;;

All these implementations pass the corresponding unit tests:

# test_list_map_succ list_map_succ_v2;;
- : bool = true
# test_list_map_pred list_map_pred_v2;;
- : bool = true
# test_list_map_square list_map_square_v2;;
- : bool = true
# test_list_map_singleton_int list_map_singleton_v2;;
- : bool = true
#

This generic map function is available in OCaml’s List library:

# test_list_map_succ (List.map succ);;
- : bool = true
# test_list_map_pred (List.map pred);;
- : bool = true
# test_list_map_square (List.map (fun n -> n * n));;
- : bool = true
# test_list_map_singleton_int (List.map (fun n -> [n]);;
- : bool = true
#

A map function implements a homomorphism, i.e., it preserves the structure of its input.

Given a function f and, e.g., a list [v0; v1; v2; v3], i.e., the result of evaluating:

List.cons v0 (List.cons v1 (List.cons v2 (List.cons v3 [])))

list_map essentially yields the list [f v0; f v1; f v2; f v3], i.e., the result of evaluating:

List.cons (f v0) (List.cons (f v1) (List.cons (f v2) (List.cons (f v3) [])))

In words, given a function f represented as f and a list vs of length n and represented as vs, evaluating list_map f vs recursively gives rise to n nested applications of List.cons that mimic the n nested occurrences of List.cons that constructed vs. For each nested occurrence of List.cons that constructed vs with a value vi, there is an occurrence of List.cons that constructs the result with the result of evaluating f vi.

Exercise 06

Implement list_copy (seen earlier on) as an instance of list_map.

Exercise 07

The goal of this exercise is to assess in which order the first argument of List.map and of list_map is applied to the elements of their second argument.

  1. When List.map f [v0; v1; v2] is evaluated, in which order is f applied to v0, v1, and v2? Is it from left to right or from right to left?

    Hint: pick a list of integers and map the tracing identity function an_int over it.

  2. When list_map f [v0; v1; v2] is evaluated, in which order is f applied to v0, v1, and v2? Is it from left to right or from right to left?

  3. If your answer to b. was “from right to left”, implement a version of list_map where the function is applied to each element of the list from left to right. And if your answer to b. was “from left to right”, implement a version of list_map where the function is applied to each element of the list from right to left.

  4. What do you conclude about the essence of list-map outlined just before Exercise 06?

    Hint: this question parallels the questions raised in the food for thought in Question 03 and Question 04 in the mini-project about strings.

Exercise 08

Exhibit an OCaml expression of type ('a * 'b) list -> ('b * 'a) list.

Hint: what is the title of this chapter?

Bong-sun: Lucky this hint isn’t the title of this chapter, eh.
Raymond Smullyan: It could have been, it could have been.
Mimer: Thanks for coming by, Prof. Smullyan. Could you autograph my copy of, ah, what is the name of this book?
Raymond Smullyan: Be glad to. Which one, did you say?
Loki: So now it needs a title?
Mimer: Actually, can you autograph both of them?
Raymond Smullyan: Gladly, Mimer. Gladly.

Exercise 09

Implement an OCaml function that unzips a list of pairs into a pair of lists, as in the zipper device for garnments (pouch, pants, etc.). For example, applying either function to [(1, 'a'); (2, 'b')] should yield ([1; 2], ['a'; 'b']). The first implementation should be structurally recursive over the given list, and the second should use List.map.

Subsidiary question: Implement a predicate that, given a list of pairs and a pair of lists, determines whether unzipping this list of pairs yields this pair of lists.

Hint about Exercise 09

Since the unzipping function is applied to a list of pairs, its domain should be ('a * 'b) list. And since it maps a list of pairs to the corresponding pair of lists, its codomain should be 'a list * 'b list. So all in all, its type should pe ('a * 'b) list -> 'a list * 'b list.

Now more than ever, one should first construct a unit-test function on successive growing lists of pairs:

                                      []

                              [(1, 'a')]

                    [(2, 'b'); (1, 'a')]

          [(3, 'c'); (2, 'b'); (1, 'a')]

[(4, 'd'); (3, 'c'); (2, 'b'); (1, 'a')]

Or again, more tellingly:

                                                []

                                    (1, 'a') :: []

                        (2, 'b') :: (1, 'a') :: []

            (3, 'c') :: (2, 'b') :: (1, 'a') :: []

(4, 'd') :: (3, 'c') :: (2, 'b') :: (1, 'a') :: []

The first line illustrates the base case, and the other lists illustrate the induction step, with the extra point that the tail of each nonempty list is the list written in the previous line. So, in practice, the result in the previous line is the result of the recursive call on the tail of the current list.

Concretely:

  • First the base case:

    let test_unzip_int_char candidate =
      let b0 = (candidate [] = result0)  (* for an appropriate result0 *)
      in b0;;
    

    What is an appropriate expression in the stead of result0? It should have type int list * char list.

  • Then a first instance of the induction step:

    let test_unzip_int_char candidate =
      let b0 = (candidate [] = result0)  (* for an appropriate result0 *)
      and b1 = (candidate [(1, 'a')] = result1)  (* for an appropriate result1 that combines 1, 'a', and result0 *)
      in b0 && b1;;
    

    The test b1 features an instance of the induction step since [(1, 'a')] is syntactic sugar for (1, 'a') :: [].

    What is an appropriate expression in the stead of result1? It should have type int list * char list and combine 1, 'a', and result0.

  • Then a second instance of the induction step:

    let test_unzip_int_char candidate =
      let b0 = (candidate [] = result0)  (* for an appropriate result0 *)
      and b1 = (candidate [(1, 'a')] = result1)  (* for an appropriate result1 that combines 1, 'a', and result0 *)
      and b2 = (candidate [(2, 'b'); (1, 'a')] = result2)  (* for an appropriate result2 that combines 2, 'b', and result1 *)
      in b0 && b1 && b2;;
    

    The test b2 features an instance of the induction step because [(2, 'b'); (1, 'a')] is syntactic sugar for (2, 'b') :: (1, 'a') :: [], i.e., for (2, 'b') :: [(1, 'a')].

    What is an appropriate expression in the stead of result2? It should have type int list * char list and combine 2, 'b', and result1.

etc.

Once one has figured out what to write in the stead of result0, result1, result2, etc., it is simple to implement unzip.

Exercise 10

Consider the two following functions:

let foo f g xs =
 (* foo : ('a -> 'b) -> ('c -> 'a) -> 'c list -> 'b list *)
  List.map f (List.map g xs);;

let bar f g xs =
 (* bar : ('a -> 'b) -> ('c -> 'a) -> 'c list -> 'b list *)
  List.map (fun x -> f (g x)) xs;;
  1. Are the functions denoted by foo and bar equivalent, when they are applied to functions that are pure (i.e., make no side effects) and total (i.e., defined on all their inputs)?
  2. Are the functions denoted by foo and bar equivalent, when they are applied to impure (e.g., tracing) functions or to partial (i.e., not total) functions?

Solution for Exercise 10

First of all, let us analyze foo and bar:

  • given two functions f and g and a list xs, the function denoted by foo first maps g over xs, yielding a new list, and then maps f over this new list, yielding a resulting list
  • given two functions f and g and a list xs, the function denoted by bar maps the composition of f and g over xs, yielding a resulting list.

Thus equipped, here are the answers to the questions:

  1. Let us pick two pure functions: one that increments its argument by 10, and one that increments its argument by 1:

    • by the analysis above, given these two functions and a list, the function denoted by foo

      1. traverses this list, applies the second function (the one that increments its argument by 1) to each element of this list, and constructs the list of the incremented results, which is a new list with the same length as the given list; and
      2. traverses this new list of incremented results, applies the second function (the one that increments its argument by 10) to each element of this new list, and constructs the list of the incremented results, which is the resulting list and has the same length as the new list;

      so all in all, given a list, the function denoted by foo returns a final list with the same length as the given list, but where all the elements are incremented by 1 + 10 = 11:

      # foo (fun j -> j + 10) (fun i -> i + 1) [0; 1; 2; 3; 4; 5];;
      - : int list = [11; 12; 13; 14; 15; 16]
      #
      
    • by the analysis above, given two functions and a list, the function denoted by bar

      1. manufactures a new function that composes the two functions; given an integer, this composite function first increments it by 1, and then increments the result by 10; in other words, this function increments its argument by 11;
      2. traverses the given list, applying the composite function to each element of this list, and constructs the list of the results; this list is the resulting list and has the same length as the given list;

      so all in all, given a list, the function denoted by bar returns a list with the same length as the given list, but where all the elements are incremented by 11:

      # bar (fun j -> j + 10) (fun i -> i + 1) [0; 1; 2; 3; 4; 5];;
      - : int list = [11; 12; 13; 14; 15; 16]
      #
      

    This leads us to conclude that the functions denoted by foo and bar are equivalent when applied to pure and total functions, because the order in which these pure and total functions are applied does not matter.

  2. As a corollary, even though they return the same result, the functions denoted by foo and bar are not equivalent when applied to impure functions, because the order in which these impure functions are applied does matter. Their effects are observationally distinct:

    # foo (fun j -> let () = Printf.printf "ten %s ->\n" (show_int j) in j + 10)
          (fun i -> let () = Printf.printf "one %s ->\n" (show_int i) in i + 1)
          [0; 1; 2; 3; 4; 5];;
    one 0 ->
    one 1 ->
    one 2 ->
    one 3 ->
    one 4 ->
    one 5 ->
    ten 1 ->
    ten 2 ->
    ten 3 ->
    ten 4 ->
    ten 5 ->
    ten 6 ->
    - : int list = [11; 12; 13; 14; 15; 16]
    # bar (fun j -> let () = Printf.printf "ten %s ->\n" (show_int j) in j + 10)
          (fun i -> let () = Printf.printf "one %s ->\n" (show_int i) in i + 1)
          [0; 1; 2; 3; 4; 5];;
    one 0 ->
    ten 1 ->
    one 1 ->
    ten 2 ->
    one 2 ->
    ten 3 ->
    one 3 ->
    ten 4 ->
    one 4 ->
    ten 5 ->
    one 5 ->
    ten 6 ->
    - : int list = [11; 12; 13; 14; 15; 16]
    #
    

    Analysis:

    • the function denoted by foo first applies the function that increments its argument by 1 (witness the successive traces of one), and then applies the function that increments its argument by 10 (witness the successive traces of ten), whereas
    • the function denoted by bar interleaves the function that increments its argument by 1 and the function that increments its argument by 10 (witness the interleaved traces of one and ten).

Historical note

Fusing the composition of functions into one function where no intermediate list is allocated is something that back in the days, Peter Landin and William Burge named “loop fusion”.

Exercise 11

The goal of this exercise is to implement the list analogue of String.mapi.

Given a function f and, e.g., a list [v0; v1; v2; v3], i.e., the result of evaluating:

List.cons v0 (List.cons v1 (List.cons v2 (List.cons v3 [])))

list_mapi essentially yields the list [f 0 v0; f 1 v1; f 2 v2; f 3 v3], i.e., the result of evaluating:

List.cons (f 0 v0) (List.cons (f 1 v1) (List.cons (f 2 v2) (List.cons (f 3 v3) [])))

In words, given a function f represented as f and a list vs of length n and represented as vs, evaluating list_mapi f vs recursively gives rise to n nested applications of List.cons that mimic the n nested occurrences of List.cons that constructed vs. For each nested occurrence of List.cons that constructed vs with a value vi, there is an occurrence of List.cons that constructs the result with the result of evaluating f i vi.

  1. Implement list.mapi so that, for example:

    # list_mapi (fun n v -> (n, v)) ['a'; 'b'; 'c'];;
    - : (int * char) list = [(0, 'a'); (1, 'b'); (2, 'c')]
    #
    
  2. Use this analogue to implement the polymorphic copy function over lists.

  3. Use this analogue to implement a polymorphic function that maps a list to a matching list of increasing indices. For example, this function should map the list ['a'; 'b'; 'c'] to [0; 1; 2].

  4. Use this analogue to implement a polymorphic function that maps a list to a matching list of decreasing indices. For example, this function should map the list ['a'; 'b'; 'c'] to [2; 1; 0].

The logical counterpart of the map function for lists

In the particular case where the function to map over a list is a predicate (i.e., a Boolean-valued function), we often want to compute not the list of its results but their conjunction or their disjunction:

let list_andmap p xs_given =
 (* list_andmap : ('a -> bool) -> 'a list -> bool *)
  let rec visit xs =
    match xs with
    | [] ->
       true
    | x :: xs' ->
       p x && visit xs'
  in visit xs_given;;

let list_ormap p xs_given =
 (* list_ormap : ('a -> bool) -> 'a list -> bool *)
  let rec visit xs =
    match xs with
    | [] ->
       false
    | x :: xs' ->
       p x || visit xs'
  in visit xs_given;;

Rationale:

  • list_andmap p [v1; v2; v3; ...; vN] corresponds to p v1 && p v2 && p v3 && ... && p vN && true, and
  • list_ormap p [v1; v2; v3; ...; vN] corresponds to p v1 || p v2 || p v3 || ... || p vN || false.

In this case, it is critical not to name the recursive call to visit so that list_andmap and list_ormap benefit from short-circuit evaluation:

# list_andmap (fun n -> if n = 0 then false else 1/0 = 0) [0; 1; 2; 3];;
- : bool = false
# list_ormap (fun n -> if n = 0 then true else 1/0 = 0) [0; 1; 2; 3];;
- : bool = true
#

Indeed, suppose we named the recursive call to visit, as in the incorrect definition below:

let list_andmap_incorrect p xs_given =
 (* list_andmap_incorrect : ('a -> bool) -> 'a list -> bool *)
  let rec visit xs =
    match xs with
    | [] ->
       true
    | x :: xs' ->
       let ih = visit xs'
       in p x && ih
  in visit xs_given;;

Then, if p denotes fun n -> if n = 0 then false else 1/0 = 0, desugaring list_andmap_incorrect p [0; 1] would yield:

let ih = let ih = true
         in p 1 && ih
in p 0 && ih;;

Evaluating this expression would give rise to the error Division_by_zero because p 1 is evaluated first, whereas evaluating p 0 && p 1 && true would yield false because p 0 is evaluated first and OCaml implements short-circuit evaluation. Indeed evaluating p 0 yields false, and therefore the result of evaluating p 0 && p 1 && true is false, without p 1 && true being evaluated (which would give rise to the error Division_by_zero).

Anton: I am so reminded of the mini-project about strings.
Alfrothul: You and me both, Anton. You and me both.

The case of the last call in list_andmap and list_ormap

The desugared counterparts of list_andmap p [v1; v2; v3; ...; vN] and list_ormap p [v1; v2; v3; ...; vN] read p v1 && p v2 && p v3 && ... && p vN && true and p v1 || p v2 || p v3 || ... || p vN || false, but it would be more intuitive for them to read:

p v1 && p v2 && p v3 && ... && p vN

p v1 || p v2 || p v3 || ... || p vN

without the trailing && true and || false. Not only would it be more intuitive, but it would have the property that the calls p vN would be tail calls, i.e., the last thing that happens when applying list_andmap and list_ormap.

Long story short (but it is the same story as the one about unparsing lists): to define such tail-sensitive versions of list_andmap and list_ormap, one should define them by induction over the tail of their input list when this list is nonempty:

let list_andmap_tail_sensitive p xs_given =
 (* list_andmap_tail_sensitive : ('a -> bool) -> 'a list -> bool *)
  match xs_given with
  | [] ->
     true
  | x :: xs' ->
     let rec visit x xs =
       match xs with
       | [] ->
          p x
       | x' :: xs' ->
          p x && visit x' xs'
  in visit x xs';;

let list_ormap_tail_sensitive p xs_given =
 (* list_ormap_tail_sensitive : ('a -> bool) -> 'a list -> bool *)
  match xs_given with
  | [] ->
     true
  | x :: xs' ->
     let rec visit x xs =
       match xs with
       | [] ->
          p x
       | x' :: xs' ->
          p x || visit x' xs'
  in visit x xs';;

The point is that when the tail of a list is empty, its head is the last element of this list.

Postlude

Anton: Just to make sure, list_andmap is List.for_all and list_ormap is List.exists, right?

Mimer: Right.

Resources

Version

Added a hint in Exercise 07.d [11 Mar 2023]

Added a hint in Exercise 08 [10 Mar 2023]

Created [06 Oct 2022]