The goal of this lecture note is to introduce the notion of unit tests.
The notion of functions that are applied to other functions puts us in privileged position to define unit-test functions: functions that, given a candidate function, verify whether applying this candidate function to a handful of handpicked actual parameters yields the expected results.
Unit-test functions come handy when figuring out what a function should do before figuring out how it should do it, i.e., before programming it.
Halcyon: The word “unit” is a homonym here, right?
Dana: Yes.
Consider, for example, factorial numbers.
Factorial numbers are the products of the successive integers from 1 to any integer:
By convention, fac_0 = 1, noted 0!, which provides a handy base case for an inductive definition of factorial numbers:
Should we wish to implement a factorial function that maps any natural number (say, n) to its factorial number (i.e., fac_n), here is a handy unit-test function that implements a conjunction of tests:
let test_fac candidate =
(candidate 0 = 1)
&& (candidate 1 = 1)
&& (candidate 2 = 1 * 2)
&& (candidate 3 = 1 * 2 * 3)
&& (candidate 4 = 1 * 2 * 3 * 4)
&& (candidate 5 = 1 * 2 * 3 * 4 * 5)
(* etc. *);;
Applying this unit-test function to a putative factorial function yields true if this putative function behaves like the factorial function when applied to 0, 1, 2, 3, 4, and 5: we say that the putative function passes the unit tests. Otherwise, applying the unit-test function yields false if this putative function does not behave like the factorial function when applied to either of 0, 1, 2, 3, 4, and 5: we say that the putative function failed to pass the unit tests, indicating that it was implemented incorrectly.
Alternatively, we can compute the factorial numbers in the definition of test_fac and inline the result of this computation:
let test_fac' candidate =
(candidate 0 = 1)
&& (candidate 1 = 1)
&& (candidate 2 = 2)
&& (candidate 3 = 6)
&& (candidate 4 = 24)
&& (candidate 5 = 121)
(* etc. *);;
Consider, for example, Fibonacci numbers.
Fibonacci numbers are members of a sequence of natural numbers that start with 0 and 1 and continues with the sum of the preceding two elements in the sequence:
Should we wish to implement a Fibonacci function that maps any natural number (say, n) to its Fibonacci number (i.e., fib_n), here is a handy unit-test function:
let test_fib candidate =
(candidate 0 = 0)
&& (candidate 1 = 1)
&& (candidate 2 = 1)
&& (candidate 3 = 2)
&& (candidate 4 = 3)
&& (candidate 5 = 5)
&& (candidate 6 = 8)
(* etc. *);;
The ship’s computer: Thank you for observing all safety precautions.
Applying this unit-test function to a putative Fibonacci function yields true if this putative function behaves like the Fibonacci function when applied to 0, 1, 2, 3, 4, 5, and 6: the putative function passes the unit tests. Otherwise, it yields false: the putative function failed to pass the unit tests.
Write a unit-test function for the successor function, i.e., the function that, given an integer, returns this integer, plus 1:
let test_successor candidate =
...;;
Verify that an OCaml implementation of the successor function passes your unit tests, i.e., that applying your unit-test function to this OCaml implementation of the successor function yields true:
# test_successor (fun i -> i + 1);;
- : bool = true
# test_successor (fun i -> 1 + i);;
- : bool = true
# test_successor succ;;
- : bool = true
#
(succ denotes the predefined successor function in OCaml, and by the same token, pred denotes the predefined predecessor function.)
Verify that an OCaml function that does not implement the successor function fails your unit test:
# test_successor (fun i -> i);;
- : bool = false
# test_successor (fun i -> i + 2);;
- : bool = false
#
Write a unit-test function for the negation function, i.e., the function that, given a Boolean, returns this Boolean, negated:
let test_not candidate =
...;;
Verify that the predeclared OCaml implementation of the negation function passes your unit tests, i.e., that applying your unit-that function to this OCaml implementation of the negation function yields true:
# test_not not;;
- : bool = true
#
Verify that an OCaml function that does not implement the negation function fails your unit test:
# test_not (fun b -> b);;
- : bool = false
#
Implement your own version of the negation function using an if-expression:
let neg_v1 b =
if ... then ... else ...;;
Verify that your implementation passes your unit tests:
# test_not neg_v1;;
- : bool = true
#
Implement your own version of the negation function using the equality predicate for Booleans:
let neg_v2 b =
... = ...;;
Verify that your implementation passes your unit tests:
# test_not neg_v2;;
- : bool = true
#
Write a unit-test function for the less-than comparison function, i.e., the function that, given two integers, returns true if the first is smaller than the second, and otherwise returns false:
let test_less_than candidate =
...;;
Verify that an OCaml implementation of the less-than comparison function passes your unit tests, i.e., that applying your unit-that function to this OCaml implementation of the less-than comparison function yields true:
# test_less_than (fun i j -> i < j);;
- : bool = true
# test_less_than (<);;
- : bool = true
#
Verify that an OCaml function that does not implement the less-than comparison function fails your unit test:
# test_less_than (fun i j -> i = j);;
- : bool = false
# test_less_than (fun i j -> i > j);;
- : bool = false
#
Question: would fun i j -> j < i fit the bill?—Loki
Let us enumerate a handful of pairs of integers where the first is smaller than the second, as well as a handful of pairs of integers where the first is greater than or equal to the second:
let test_less_than candidate =
(candidate 0 1 = true)
&& (candidate (-1) 1 = true)
&& (candidate (-10) 10 = true)
&& (candidate 0 0 = false)
&& (candidate 1 (-1) = false)
&& (candidate 10 (-10) = false)
(* etc. *);;
Here is a verification:
# test_less_than (fun i j -> i < j);;
- : bool = true
# test_less_than (<);;
- : bool = true
#
Here is another verification:
# test_less_than (fun i j -> i = j);;
- : bool = false
# test_less_than (fun i j -> i > j);;
- : bool = false
#
Anton: In Parts b. and c., are we testing the unit-test function?Mimer: Yes we are, Anton. Yes we are.Loki: How self-referential can you get.
Let us split the unit-test function into two:
a positive one that verifies that pairs of integers where the first is smaller than the second are correctly mapped to true, and
a negative one that verifies that pairs of integers where the first is greater than or equal to the second are correctly mapped to false:
let positive_test_less_than candidate =
(candidate 0 1 = true)
&& (candidate (-1) 1 = true)
&& (candidate (-10) 10 = true)
(* etc. *);;
let negative_test_less_than candidate =
(candidate 0 0 = false)
&& (candidate 1 (-1) = false)
&& (candidate 10 (-10) = false)
(* etc. *);;
Here is the positive verification:
# positive_test_less_than (fun i j -> i < j);;
- : bool = true
# negative_test_less_than (fun i j -> i < j);;
- : bool = true
#
Here is the negative verification:
# positive_test_less_than (fun i j -> i = j);;
- : bool = false
# negative_test_less_than (fun i j -> i = j);;
- : bool = false
# positive_test_less_than (fun i j -> i > j);;
- : bool = false
# negative_test_less_than (fun i j -> i > j);;
- : bool = false
#
Alfrothul: So in Part c. of this exercise, we want the tests to fail.Mimer: Yes. For the tests to succeed, they must fail.Loki: How paradoxical can you get.
Write a unit-test function for a twice-less-than comparison function, i.e., the function that, given three integers, returns true if the first is smaller than or equal to the second and the second smaller than or equal to the third, and otherwise returns false:
let test_twice_less_than candidate =
(candidate 100 200 300 = true)
&& (candidate ... ... ... = true)
&& (candidate ... ... ... = true)
&& (candidate 101 100 300 = false)
&& (candidate ... ... ... = false)
&& ...
(* etc. *);;
Verify that an OCaml implementation of the twice-less-than comparison function passes your unit tests, i.e., that applying your unit-test function to this OCaml implementation of the twice-less-than comparison function yields true:
# test_twice_less_than (fun i j k -> ...);;
- : bool = true
#
Hint: ... should contain one occurrence of the OCaml boolean conjunction and two occurrences of one of the OCaml comparison operators.
Food for thought: besides testing whether its first argument is smaller than its second argument and whether its second argument is smaller than its third argument, should your OCaml implementation of the twice-less-than comparison function also test whether its first argument is smaller than its third argument? Why?
Verify that an OCaml function that does not implement the twice-less-than comparison function fails your unit tests:
# test_twice_less_than (fun i j k -> ...);;
- : bool = false
# test_twice_less_than (fun i j k -> ...);;
- : bool = false
#
Name several implementations of the twice_less_than comparison function:
let twice_less_than_v0 i j k =
...
let twice_less_than_v1 i j k =
...
let twice_less_than_v2 i j k =
...
...
Verify that your implementations pass your unit tests:
# test_twice_less_than twice_less_than_v0;;
- : bool = true
# test_twice_less_than twice_less_than_v1;;
- : bool = true
# test_twice_less_than twice_less_than_v2;;
- : bool = true
#
Implement an OCaml function that, given three integer arguments x, y, and z, tests whether x occurs in the interval that starts with y and ends with z.
First of all, how would you test your OCaml function?
Inspired by the factorial function that maps a non-negative integer n to the product from 1 to n (i.e., 1 * 2 * 3 * ... * n), write a unit-test function for a “summatorial” function that maps a non-negative integer n to the sum from 0 to n (i.e., 0 + 1 + 2 + 3 + ... + n, i.e., the sum of the n+1 first natural numbers).
Observing that the function (fun i -> 2 * i + 1) maps a non-negative integer n to the n+1‘th odd natural number, write a unit-test function for a function that, given a non-negative integer n, computes the sum of the n+1 first odd numbers.
The most exciting phrase to hear in science,the one that heralds new discoveries,is not “Eureka!” but “That’s funny...”
Mischievously, someone who shall remain unnamed wrote the two following candidates for the factorial and Fibonacci functions:
let mischievous_fac n =
if n = 0 then 1 else
if n = 1 then 1 else
if n = 2 then 2 else
if n = 3 then 6 else
if n = 4 then 24 else
if n = 5 then 121 else 42;;
let mischievous_fib n =
if n = 0 then 0 else
if n = 1 then 1 else
if n = 2 then 1 else
if n = 3 then 2 else
if n = 4 then 3 else
if n = 5 then 5 else
if n = 6 then 8 else 42;;
Question: do these mischievous functions pass the unit tests for the factorial and Fibonacci functions?
Hint: to answer this question, you will need to evaluate test_fac mischievous_fac and test_fib mischievous_fib.
Program testing [...] show[s] the presence of bugs,[not] their absence.
Bong-sun: Well, let’s see:
# #use "week-03_unit-tests.ml";;
val test_fac : (int -> int) -> bool = <fun>
val test_fib : (int -> int) -> bool = <fun>
val test_less_than : (int -> int -> bool) -> bool = <fun>
val positive_test_less_than : (int -> int -> bool) -> bool = <fun>
val negative_test_less_than : (int -> int -> bool) -> bool = <fun>
val mischievous_fac : int -> int = <fun>
val mischievous_fib : int -> int = <fun>
val not_or : bool -> bool -> bool = <fun>
val and_not_not : bool -> bool -> bool = <fun>
val not_and : bool -> bool -> bool = <fun>
val or_not_not : bool -> bool -> bool = <fun>
val end_of_file : string = "week-03_unit-tests.ml"
#
Pablito: OK?
Bong-sun: Now we check that the fake factorial function passes the unit tests:
# test_fac mischievous_fac;;
- : bool = true
#
Pablito: OK.
Bong-sun: And we check that it is fake, e.g., because it does not map 6 to the Factorial number for 6:
# mischievous_fac 6 <> 1 * 1 * 2 * 3 * 4 * 5 * 6;;
- : bool = true
#
Pablito: Ah, right. So the same should go for the fake fibonacci function. Let me try:
# test_fib mischievous_fib;;
- : bool = true
#
Bong-sun: Right. The fake fibonacci function passes the unit tests. Now we just need to check that it is fake. Let’s see. The fake fibonacci function is well defined from 0 to 6, since that is all that the unit-test function is verifying. So let’s apply it to 7.
Pablito: And the Fibonacci number for 7 is?
Bong-sun: Well, the Fibonacci number for 5 is 5 and the Fibonacci number for 6 is 8, so the Fibonacci number for 7 is 5 + 8, i.e., 13:
# mischievous_fib 7 <> 13;;
- : bool = true
#
Pablito: So a fake function is well defined for all the cases that the unit-test function verifies, and otherwise it yields an incorrect result.
Bong-sun: Yes. And to illustrate that, we need both to show that applying the unit-test function to it yields true, and that applying it to an untested input does not yield the correct output.
Pablito: Testing shows the presence of a bug, eh.
Edsger Dijkstra: Well, it does.
Mimer: Prof. Dijkstra! Thanks for coming by! Huh, mind autographing my copy of The Humble Programmer?
What Dijkstra said.—Loki the Mischievous
Dana: I am afraid Loki is right.
Anton: Well, he was just quoting Dijkstra, which seems like a safe thing to do. Why wouldn’t he be right?
Dana: Authority arguments notwithstanding, look:
# 1 * 2 * 3 * 4 * 5;;
- : int = 120
#
Anton: So?
Alfrothul (jumping in): 120, not 121.
Anton: Oh. So mischievous_fac is incorrect not just because of 42, which is incorrect, but because applying it to 5 yields 121, which is also incorrect.
The Robot: It does not compute.
Loki: Why so negative? Haven’t you read the interlude illustrating how tests show the presence of bugs? mischievous_fac passes the unit test.
Mimer: What Loki is pointing out is that there is a bug both in test_fac and in mischievous_fac. And since it is the same bug, it is not detected. Yes, Moody?
“Mad Eye” Moody: CONSTANT VIGILANCE!
Write two unit-test functions:
Can you strengthen your unit-test functions so that not even a mischievous Nordic god can bypass them?
As reminded in the introductory chapter about OCaml, the conjunction of two Booleans is true if both of these Booleans are true; otherwise it is false. So a natural unit-test function enumerates the 4 possibilities:
let test_conj candidate =
(candidate true true = true)
&& (candidate false true = false)
&& (candidate true false = false)
&& (candidate false false = false);;
This unit-test function needs no strengthening because it is complete: it enumerates all possible cases.
Dijkstra’s point is that incomplete testing only proves the presence of errors, not their absence, since it does not enumerate all possible cases. In contrast, complete testing proves the absence of errors since it enumerates all possible cases.
Ever at the forefront of logic in computer science, Anton and Alfrothul have studied the De Morgan Laws, which say that (1) the negation of a conjunction is the disjunction of the negations of the conjuncts, and (2) the negation of a disjunction is the conjunction of the negations of the disjuncts. After much sweating, they have come up with the following definitions:
let not_or b1 b2 =
not (b1 || b2);;
let and_not_not b1 b2 =
not b1 && not b2;;
let not_and b1 b2 =
not (b1 && b2);;
let or_not_not b1 b2 =
not b1 || not b2;;
Getting as much inspiration from the De Morgan Laws as you feel like, how would you test these functions, pairwise?
Also, in reference to the closing question of Exercise 20, can you strengthen your unit-test functions?
Alfrothul: not_and implements a NAND, right?Mimer: Yes it does.
Code coverage, or again test coverage, provides an argument about how exhaustively a program is tested: a program is exhaustively tested if all its syntactic units are processed in the course of the tests. Concretely, a functional program is exhaustively tested if all its sub-expressions are evaluated in the course of the tests, and an imperative program is exhaustively tested if all its commands are executed in the course of the tests.
In practice, since programs contain conditional constructs (prototypically “if boolean_expression then consequent else alternative” but programming languages often feature other conditional constructs), each boolean expression in each conditional construct should evaluate to true in one test and to false in another test, a criterion known as “condition coverage”. A unit test that satisfies condition coverage also satisfies the criterion known as “branch coverage”: each consequent and each alternative in each conditional construct is processed in the course of the tests.
Here is a simple example. The following OCaml function
It is implemented using two nested conditional expressions, e.g., as follows:
let decrement_or_increment n =
(* decrement_or_increment : int -> int *)
if n <= 0
then if n < 0
then pred n
else 0
else succ n;;
The following unit-test functions test decrement_or_increment in an inexhaustive manner because for each of them, at least one subexpression in the definition is not evaluated in the course of the tests.
For example, only positive integers are tested:
let test_decrement_or_increment_with_positive_integers candidate =
(candidate 1 = 2) &&
(candidate 2 = 3);;
The implementation passes this unit test, but so does the following incorrect one:
let decrement_or_increment_bogus_for_non_positive_integers n =
if n <= 0
then 42
else succ n;;
To wit:
# test_decrement_or_increment_with_positive_integers decrement_or_increment;;
- : bool = true
# test_decrement_or_increment_with_positive_integers decrement_or_increment_bogus_for_non_positive_integers;;
- : bool = true
#
This unit-test function does not satisfy the criterion of condition coverage because it only makes the outer test n <= 0 evaluate to false. As a consequence, it does not satisfy the criterion of branch coverage because the outer consequent if n < 0 then pred n else 0 is not evaluated when the unit-test function is applied. Therefore a faulty definition where this consequent is replaced by another expression with the same type passes this unit test, no matter what this other expression evaluates to.
For example, only negative integers are tested:
let test_decrement_or_increment_with_negative_integers candidate =
(candidate ~-1 = ~-2) &&
(candidate ~-2 = ~-3);;
The implementation passes this unit test, but so does the following incorrect one:
let decrement_or_increment_bogus_for_non_positive_integers n =
if n <= 0
then if n < 0
then pred n
else 42
else 42;;
To wit:
# test_decrement_or_increment_with_negative_integers decrement_or_increment;;
- : bool = true
# test_decrement_or_increment_with_positive_integers decrement_or_increment_bogus_for_non_positive_integers;;
- : bool = true
#
This unit-test function does not satisfy the criterion of condition coverage because it only makes the outer test n <= 0 and the inner test n < 0 evaluate to true. As a consequence, it does not satisfy the criterion of branch coverage because the inner alternative 0 and the outer alternative succ n are not evaluated when the unit-test function is applied. Therefore a faulty definition where these alternatives are replaced by other expressions with the same type passes this unit test, no matter what these other expressions evaluate to.
For example, only non-negative integers are tested:
let test_decrement_or_increment_with_non_negative_integers candidate =
(candidate 0 = 0) &&
(candidate 1 = 2) &&
(candidate 2 = 3);;
The implementation passes this unit test, but so does the following incorrect one:
let decrement_or_increment_bogus_for_negative_integers n =
(* decrement_or_increment : int -> int *)
if n <= 0
then if n < 0
then 42
else 0
else succ n;;
To wit:
# test_decrement_or_increment_with_non_negative_integers decrement_or_increment;;
- : bool = true
# test_decrement_or_increment_with_non_negative_integers decrement_or_increment_bogus_for_negative_integers;;
- : bool = true
#
This unit-test function does not satisfy the criterion of condition coverage because it only makes the inner test n < 0 evaluate to false. As a consequence, it does not satisfy the criterion of branch coverage because the inner consequent pred n is not evaluated when the unit-test function is applied. Therefore a faulty definition where this consequent is replaced by another expression with the same type passes this unit test, no matter what this other expression evaluates to.
For example, only non-positive integers are tested:
let test_decrement_or_increment_with_non_positive_integers candidate =
(candidate 0 = 0) &&
(candidate ~-1 = ~-2) &&
(candidate ~-2 = ~-3);;
The implementation passes this unit test, but so does the following incorrect one:
let decrement_or_increment_bogus_for_positive_integers n =
(* decrement_or_increment : int -> int *)
if n <= 0
then if n < 0
then pred n
else 0
else 42;;
To wit:
# test_decrement_or_increment_with_non_positive_integers decrement_or_increment;;
- : bool = true
# test_decrement_or_increment_with_non_positive_integers decrement_or_increment_bogus_for_positive_integers;;
- : bool = true
#
This unit-test function does not satisfy the criterion of condition coverage because it only makes the outer test n <= 0 evaluate to true. As a consequence, it does not satisfy the criterion of branch coverage because the outer alternative succ n is not evaluated when the unit-test function is applied. Therefore a faulty definition where this consequent is replaced by another expression with the same type passes this unit test, no matter what this other expression evaluates to.
For example, only non-zero integers are tested:
let test_decrement_or_increment_with_non_zero_integers candidate =
(candidate 1 = 2) &&
(candidate 2 = 3) &&
(candidate ~-1 = ~-2) &&
(candidate ~-2 = ~-3);;
The implementation passes this unit test, but so does the following incorrect one:
let decrement_or_increment_bogus_for_zero n =
(* decrement_or_increment : int -> int *)
if n <= 0
then if n < 0
then pred n
else 42
else succ n;;
To wit:
# test_decrement_or_increment_with_non_zero_integers decrement_or_increment;;
- : bool = true
# test_decrement_or_increment_with_non_zero_integers decrement_or_increment_bogus_for_zero;;
- : bool = true
#
This unit-test function does not satisfy the criterion of condition coverage because it only makes the Boolean expression n < 0 evaluate to true. As a consequence, it does not satisfy the criterion of branch coverage because the consequent pred n is not evaluated when the unit-test function is applied. Therefore a faulty definition where this consequent is replaced by another expression with the same type passes this unit test, no matter what this other expression evaluates to.
In contrast, the following unit-test function tests decrement_or_increment in an exhaustive manner because all the subexpressions in the definition are evaluated in the course of the tests:
let test_decrement_or_increment candidate =
(candidate ~-2 = ~-3) &&
(candidate ~-1 = ~-2) &&
(candidate 0 = 0) &&
(candidate 1 = 2) &&
(candidate 2 = 3);;
To wit:
So in terms of code coverage, decrement_or_increment is exhaustively tested.
Code coverage is intensional in that it is about the “how” of a program. Extensionally, what the program computes is something else, and so it cannot be completely tested. We can still write fake implementations that pass the unit tests:
let fake_decrement_or_increment n =
if ~-2 <= n && n <= 2
then decrement_or_increment n
else 1/0;;
To wit:
# test_decrement_or_increment fake_decrement_or_increment;;
- : bool = true
# fake_decrement_or_increment 3;;
Exception: Division_by_zero.
#
So exhaustive testing is not necessarily complete.
Is complete testing exhaustive? Yes if there is no dead code, and no if there is dead code. What is dead code? It is a piece of a program that is never executed. For a concrete example:
let foo x =
if x mod 2 < 2
then x + 1
else let y = x / 0
in x - 1;;
By the very definition of the modulo operator, the test x mod 2 < 2 always evaluates to true. Therefore the consequent x + 1 is always evaluated and the alternative is never evaluated: this alternative is dead code. Try as we might, we cannot ensure test coverage.
As a corollary, foo would be better implemented as:
let foo x =
x + 1;;
Added the interlude illustrating how tests show the presence of bugs [19 Mar 2023]
Added a hint in Exercise 17.b [27 Jan 2023]
Expanded Exercise 17 [27 Jan 2023]
Created [26 Aug 2022]