The goal of this lecture note is to illustrate how to program with strings in OCaml.
So strings are built out of characters and are written between double quotes:
# "hello world";;
- : string = "hello world"
#
They are concatenated with the infix operator ^:
# "ab" ^ "ba";;
- : string = "abba"
# "hello" ^ " " ^ "world";;
- : string = "hello world"
#
And if we want to have a double quote in a string, we quote it with a backslash, i.e., \. This backslash is purely notational, in that it does not exist in the string, which we can verify by computing the length of the said string:
# "\"";;
- : string = "\""
# String.length "\"";;
- : int = 1
#
Likewise if we want to have a backslash in a string – we quote it too:
# "\\";;
- : string = "\\"
# String.length "\\";;
- : int = 1
#
Besides String.length, a function of type string -> int to compute the length of a string, the library for strings also contains, e.g, String.get : string -> int -> char, a function to index a character in a string, starting from 0 all the way up to (and excluding) the length of the string. Indexing a string with a negative integer or with an index that is greater or equal to the length of the string raises an error:
# let a_string_of_length_3 = "o_O";;
val a_string_of_length_3 : string = "o_O"
# String.length a_string_of_length_3;;
- : int = 3
# String.get a_string_of_length_3 min_int;;
Exception: Invalid_argument "index out of bounds".
# String.get a_string_of_length_3 (-1);;
Exception: Invalid_argument "index out of bounds".
# String.get a_string_of_length_3 0;;
- : char = 'o'
# String.get a_string_of_length_3 1;;
- : char = '_'
# String.get a_string_of_length_3 2;;
- : char = 'O'
# String.get a_string_of_length_3 4;;
Exception: Invalid_argument "index out of bounds".
# String.get a_string_of_length_3 max_int;;
Exception: Invalid_argument "index out of bounds".
#
Assume that strings are represented (i.e., implemented) as a contiguous area of memory (which they actually are). Such a contiguous area of memory is often called a “buffer”, and the fact that it cannot be indexed outside of its bounds (here: 0, inclusively, and the length of the string, exclusively) means that the implementation of OCaml does not suffer from the buffer overflow problem. Small causes, big effects.
Notationally, if the name s denotes a string and i denotes an integer, s.[i] is syntactic sugar for (i.e., is unparsed into) String.get s i.
In Exercise 01.h, for example, “adding 1 to the left n times, iteratively and starting with 0” means that we start with 0, and then we add 1, which gives 1, and then we add 1, which gives 2, and then we add 1, which gives 3, etc., iteratively, i.e., one addition at a time, as if we were constructing an onion:
0 = 0
1 + (0) = 1
1 + (1 + (0)) = 1 + (1) = 2
1 + (1 + (1 + (0))) = 1 + (2) = 3
1 + (1 + (1 + (1 + (0)))) = 1 + (3) = 4
An editorial recommendation: to add 1 to the left n times, iteratively and starting with 0,
Anton: This mini-interlude suspiciously looks like we are constructing another onion.
Alfrothul: Right, starting from the bud, namely Exercise 01.1, we embed it into some food for thought, and then we embed the result into a mini-interlude.
Anton: We should write a small interlude about that. We would call it “Small interlude about a mini-interlude about the food for thought about Exercise 01.1.”
Alfrothul: OK, we could, but that was not your point, was it?
Anton: Actually it was not, I got distracted by the title of this mini-interlude.
Alfrothul: One has to admit it is not very telling.
Anton: Let me fix that.
The fourth wall: Ouch.The fifth wall: Again?The sixth wall: That must have hurted.The seventh wall: Er, guys. Where are we?The eighth wall: In Hilbert’s hotel, looks like.
The fourth wall: Hello world.The fifth wall: Now I am the fifth wall?The sixth wall: Tell me about it.
Anton: Ah, much better.
Alfrothul: Still feels like an onion, but the outer layer is a lot more clear. So, Emacs?
Anton: Right. So. First, C-x 3 M-x run-ocaml RET, let’s place ourselves at the toplevel of OCaml:
#
Alfrothul: OK.
Anton: Then let’s type 1 + (:
# 1 + (
Alfrothul: OK.
Anton: Then let’s move the cursor to the beginning of what we have typed (C-a) and let’s put what we have typed in Emacs’s kill-buffer (C-k):
#
Alfrothul: OK.
Anton: Now let’s yank what we just killed (C-y), like, 5 times:
# 1 + (1 + (1 + (1 + (1 + (
Alfrothul: OK.
Anton: Now let’s type 0:
# 1 + (1 + (1 + (1 + (1 + (0
Alfrothul: OK.
Anton: And let’s type 5 close parentheses, and then ;;, and then hit the enter key:
# 1 + (1 + (1 + (1 + (1 + (0)))));;
- : int = 5
#
Alfrothul: Well, that’s the Emacs content of the food for thought about Exercise 01.1 all right.
Anton: Yup. And now instead of 0, we can put 1.
Alfrothul: Or instead of 1, we can put 2.
Anton: Alternatively, we could put ) + 1 in the kill buffer and start by typing, like, 5 open parentheses.
Mimer: Thanks, Anton, you made your point. Besides, don’t you have another mini-interlude to be in?
Anton: Ah, right. I was getting to it, thanks.
The fourth wall: Ouch.The seventh wall: That must have hurted too.The eighth wall: Getting crowded here.The ninth wall: Seriously?The tenth wall: Hear, hear.The eleventh wall: We have ears now?
Magritte: I approve this mini-interlude.
Anton: Wow. s and s.
Alfrothul (o _ o): s and s.
Anton: One is the name of the string in OCaml, and the other is the string.
Alfrothul: Ah, right. Names and what these names denote.
Anton: Yes. A useful distinction to make, isn’t it?
Alfrothul: In this context, yes. Let me try: evaluating String.length s amounts to applying the function denoted by String.length to the string denoted by s. How am I doing?
Anton: You are doing good, I believe. How about String.length "aBcD"?
Alfrothul: Same same. Evaluating String.length "aBcD" amounts to applying the function denoted by String.length to the string denoted by "aBcD".
Pablito: But the string denoted by "aBcD" is "aBcD", since "aBcD" is a literal. Wait. Actually no: one is the syntax of the string – how it’s written, and the other is its semantics – what it means, i.e., an OCaml value of type string. I think I’m getting the hang of this.
Since Version 4.00.0, the OCaml library for strings features String.map, which has type (char -> char) -> string -> string. To quote its documentation, “String.map f s applies function f in turn to all the characters of s (in increasing index order) and stores the results in a new string that is returned.”
For example, String.map can be used to map a string to another string of the same length but containing only the character 'a', as captured in the following unit-test function:
let test_to_a candidate =
let b0 = (candidate "" = "")
and b1 = (candidate "x" = "a")
and b2 = (candidate "xy" = "aa")
and b3 = (candidate "xyz" = "aaa")
(* etc. *)
in b0 && b1 && b2 && b3;;
To implement to_a, we supply the constant function fun c -> 'a' to String.map and to the given string:
let to_a s =
String.map (fun c -> 'a') s;;
This implementation passes the unit test:
# test_to_a to_a;;
- : bool = true
#
In words, fun c -> 'a' is applied to each character of the given string, and the resulting characters 'a' all occur in the resulting string:
Also, since c is not used in the body of fun c -> 'a', we can make that manifest by writing _ instead:
let to_a s =
String.map (fun _ -> 'a') s;;
So all in all, String.map is applied to a function and to a string, applies this function to each of the characters of this string, and returns the string containing the resulting characters, in the same order. Likewise, a unit-test function is applied to a function, applies this function to a selection of inputs and compares each result to the corresponding expected result, and returns the conjunction of the results of each comparison.
Given a function f and, e.g., a string "abc", i.e., the result of evaluating:
string_of_char 'a' ^ string_of_char 'b' ^ string_of_char 'c'
String.map essentially yields the result of evaluating:
string_of_char (f 'a') ^ string_of_char (f 'b') ^ string_of_char (f 'c')
where string_of_char maps a character into a singleton string containing this character and is defined as follows:
let string_of_char c =
String.make 1 c;;
In words, given a function f represented as f and a string s of length n and represented as s, evaluating String.map f s gives rise to n-1 concatenations of the singleton strings that contain the result of applying f to each of the successive characters of s.
Alfrothul: Ah! Now we can write a generator of real random strings.
Anton: A generator of real random strings. We already had a generator of random strings, didn’t we?
Alfrothul: Yes we did, but they all contained the same character.
Anton: The same random character, yes.
Alfrothul: And now we can manufacture random strings containing random characters.
Anton: Pray explain.
Alfrothul: Well, we first generate a string of random length, and then we map each of its characters to a random one using String.map. Look:
let random_char () =
char_of_int (Random.int 32 + int_of_char ' ');;
let random_string n =
String.map (fun _ -> random_char ()) (String.make (Random.int n) ' ');;
Anton (prudently): I see that your function has type int -> string.
Alfrothul: It does. Given a non-negative integer, it first creates a string of random length containing the space character using String.make, and then it maps each occurrence of the space character into a random character using String.map. Look:
# random_string 10;;
- : string = "(+%- .62"
# random_string 10;;
- : string = " 3+>,"
# random_string 10;;
- : string = ""
# random_string 10;;
- : string = "0?8,."
#
Anton: Harrumph.
Loki: Shouldn’t you guys first be implementing unit-test functions for random_char and random_string?
Keeping in mind that int_of_char and char_of_int respectively map a character into its ASCII code and the ASCII code of a character into this character (see Section A character-encoding standard: ASCII in Week 03), define a function that maps a string of lowercase letters to the corresponding string of uppercase letters and another function that maps a string of uppercase letters to the corresponding string of lowercase letters:
let test_upper_string_of_letters candidate =
let b0 = (candidate "abc" = "ABC")
(* etc. *)
in b0;;
let test_lower_string_of_letters candidate =
let b0 = (candidate "ABC" = "abc")
(* etc. *)
in b0;;
Pablito: We are supposed to extend these unit-test functions with our own tests, right?Mimer: Right.Dana: Any other piece of advice?Mimer: Read the general interlude below.
(The input of upper_string_of_letters is assumed to be a string of lowercase letters and the input of lower_string_of_letters is assumed to be a string of uppercase letters.)
Anton: But what if the given string does not exclusively contain lowercase letters?
Alfrothul: Well, we were asserted it does.
Dana: My turn to be literal:
let test_upper_letter candidate =
let a = (candidate 'a' = 'A')
and z = (candidate 'z' = 'Z')
in a && z;;
let upper_letter c =
let a = int_of_char 'a'
and n = int_of_char c
and z = int_of_char 'z'
and d = int_of_char 'a' - int_of_char 'A'
in let () = assert (a <= n && n <= z) in
char_of_int (n - d);;
let () = assert (test_upper_letter upper_letter = true);;
Loki: Wow. Literal and testy.
Anton: Hang on a second. OK, a denotes the ASCII code of the lowercase letter 'a', n denotes the ASCII code of the given letter, z denotes the ASCII code of the lowercase letter 'z', and d denotes... denotes what?
Dana: Well, the ASCII code for 'a' is 97 and the ASCII code for 'A' is 65, and letters occur in alphabetical order in the ASCII table, so...
Anton: OK. Evaluating n - d gives the ASCII code of the corresponding uppercase letter.
Alfrothul: Which it does, since you have asserted that n lies between a and z, i.e., that the given letter is lowercase. If it isn’t, the assertion fails and the computation stops.
Pablito: Well, lower_letter follows.
Halcyon: Mutatis mutandis.
This exercise is a continuation of Exercise 02.
Implement a function that maps the lowercase letters in an arbitrary string to uppercase and that leaves the other characters unchanged, be them uppercase letters, digits, or whichever other character:
let test_upper_string candidate =
let b0 = (candidate "abc123" = "ABC123")
and b1 = (candidate "--A B C**" = "--A B C**")
and b2 = (candidate "Abc" = "ABC")
and b3 = (candidate "aBc" = "ABC")
and b4 = (candidate "abC" = "ABC")
(* etc. *)
in b0 && b1 && b2 && b3 && b4;;
Implement a function that maps the uppercase letters in an arbitrary string to lowercase and that leaves the other characters unchanged, be them lowercase letters, digits, or whichever other character:
let test_lower_string candidate =
let b0 = (candidate "ABC123" = "abc123")
and b1 = (candidate "--a b c**" = "--a b c**")
and b2 = (candidate "Abc" = "abc")
and b3 = (candidate "aBc" = "abc")
and b4 = (candidate "abC" = "abc")
(* etc. *)
in b0 && b1 && b2 && b3 && b4;;
Subsidiary questions, keeping in mind that in Emacs, M-b is used to make the cursor jump to the beginning of the latest word (or of the current word), and M-f is used to make the cursor jump to the end of the next word (or of the current word):
Alfrothul: Fun fact – since Version 4.02.0, the OCaml library for strings also features a function String.mapi, which has type (int -> char -> char) -> string -> string. To quote its documentation, “String.mapi f s calls f with each character of s and its index (in increasing index order) and stores the results in a new string that is returned.”
Anton: So... We should be able to map any string to a string of the same length, but containing successive digits. Blam.
Alfrothul: Beg pardon?
Anton: Let me sketch a unit-test function:
let test_blam candidate =
let b0 = (candidate "" = "")
and b1 = (candidate "a" = "0")
and b2 = (candidate "ab" = "01")
and b3 = (candidate "abc" = "012")
and b4 = (candidate "abcd" = "0123")
(* etc. *)
in b0 && b1 && b2 && b3 && b4;;
Alfrothul: I see. Pretty straighforward indeed:
let blam s =
String.mapi (fun i _ -> char_of_int (i + 48)) s;;
Anton: i + 48, eh.
Alfrothul: That’s because the ASCII code of 0 is 48:
# int_of_char '0';;
- : int = 48
# char_of_int 48;;
- : char = '0'
# char_of_int (48 + 1);;
- : char = '1'
# char_of_int (48 + 2);;
- : char = '2'
#
Anton: I see. But there is no way I want to remember that the ASCII code of 0 is 48.
Alfrothul: What’s that now?
Anton: I mean we should be more explicit in the definition of blam:
let blam s =
String.mapi (fun i _ -> char_of_int (i + int_of_char '0')) s;;
Alfrothul: Mmmh... Right. Still, be that as it may, blam passes the unit tests too, which was the point:
# test_blam blam;;
- : bool = true
#
Anton: And what if blam is applied to a string that is longer than 10? Let me try:
# String.length "abcdefghijkl";;
- : int = 12
# blam "abcdefghijkl";;
- : string = "0123456789:;"
#
Alfrothul: Well, that’s the ASCII table for you. After the characters 8 and 9, the characters : and ;.
Be a pal and
Could you parameterize blam to yield a cycle that is shorter than 10? For example, when applied to a string and to 6, the result should be the cyclic string "012345012345012345...". Of course a new unit-test function will be needed too.
Subsidiary questions:
Mimer: Given a function f and, e.g., a string "abc", i.e., the result of evaluating:
string_of_char 'a' ^ string_of_char 'b' ^ string_of_char 'c'
String.mapi essentially yields the result of evaluating:
string_of_char (f 0 'a') ^ string_of_char (f 1 'b') ^ string_of_char (f 2 'c')
where string_of_char maps a character into a singleton string containing this character.
In words, given a function f represented as f and a string s of length n and represented as s, evaluating String.mapi f s gives rise to n-1 concatenations of the singleton strings that contain the result of applying f to each of the successive characters of s as well as to their respective indices in s.
Pablito: I’m sorry but String.mapi is undefined:
# String.mapi;;
Characters 0-11:
String.mapi;;
^^^^^^^^^^^
Error: Unbound value String.mapi
Did you mean map?
#
Dana: What is your version of OCaml?
Pablito: When OCaml starts, it displays:
OCaml version 4.01.0
#
Jasmine: That’s why.
Mimer: Ms. Dubrow, thanks for stopping by!
Dana (heedless): Right – mapi was only added to the library for strings since Version 4.02.0.
Pablito: OK?
Dana: Can you upgrade your version of OCaml?
Pablito (with a small voice): No?
Mimer: That’s OK – the accompanying resource file contains an updated version of the String module that contains a definition of mapi, so just load this file and you are good to go.
Pablito (clickety clickety click): Yay! I can now generate the 26 letters of the alphabet in a string.
Anton (O_o): Letters? Alphabet? String?
Pablito: Well, yes. We can map a string of 26 characters to a string that contains all successive letters of the alphabet. Look:
# let lower_case_alphabet =
String.mapi (fun i _ -> char_of_int (i + int_of_char 'a')) (String.make 26 ' ');;
val lower_case_alphabet : string = "abcdefghijklmnopqrstuvwxyz"
#
Mimer: Way to go, Pablito!
Loki: Wow, Mimer.Mimer: Yes?Loki: That was a quick and dirty definition of String.mapi.Mimer: Did you expect a reference implementation?Loki: Well, it is effective.Dana (coughing in her hand): Foreshadowing.
Can String.map be defined in terms of String.mapi?
Pablito: On second thought, though...
Dana: On second thought?
Pablito: It would be a better idea to use String.init here to generate the 26 letters of the alphabet in a string.
Dana: Pray tell.
Pablito (taking a deep breath): Well, since Version 4.02.0, the library for strings also contains an init function of type int -> (int -> char) -> string that we can use to initialize a string. Given the length of the desired string and a function of type int -> char, this init function yields a string where each of the characters are the result of applying the given function to each of the successive valid indices in this string. Look:
# String.init 26 (fun i -> char_of_int (i + int_of_char 'A'));;
- : string = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
#
Dana: Mimer is right. Way to go, Pablito!
Pablito: Well, he also put init in the accompanying resource file, so...
Alfrothul: Well spotted anyhow, Pablito. And the generator of random strings too is better implemented with String.init:
let random_string n =
String.init (Random.int n) (fun _ -> random_char ());;
Pablito: Right, this implementation only constructs one string, not two.
Anton: How about we generalize?
Alfrothul: Er... Sure, sounds good, let’s. What do you have in mind?
Anton: Well, the ASCII table contains 2 to the 7, i.e., 128 characters, so we could create a string for it.
Alfrothul: Indeed we can:
# let ascii_table = String.init 128 char_of_int;;
val ascii_table : string =
"\000\001\002\003\004\005\006\007\b\t\n\011\012\r\014\015\016\017\018\019\020\021\022\023\024\025\026\027\028\029\030\031 !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\127"
#
Dana (jumping in): Wow, what a way to print the first 32 characters in the ASCII table. Let me check something:
# '\000';;
- : char = '\000'
# int_of_char '\000';;
- : int = 0
# String.length "\000";;
- : int = 1
#
Dana: OK, makes sense.
Pablito (jumping in too): Actually...
Dana (interrupting): No, wait. There is a logic here:
# int_of_char '0';;
- : int = 48
# '\048';;
- : char = '0'
#
Dana: Right, that’s the logic. All the characters can be written using their ASCII code, and they are rendered either using the character they represent if this character has a name, or using their ASCII code if this character has no name.
She emerges from her thoughts.
Dana: Sorry. You were saying?
Pablito: Wow, thanks. This makes so much sense.
Dana: It does, doesn’t it. But you started to say something.
Pablito: Ah, right. Actually, OCaml’s ASCII table has 2 to the 8, i.e., 256 entries, and the last one use the same convention, look:
# String.get (String.init 256 char_of_int) 255;;
- : char = '\255'
#
Anton: Thanks. Be that as it may, we can now simulate String.mapi in terms of String.map:
let test_string_mapi candidate =
let b0 = (candidate (fun i c -> char_of_int (int_of_char c + 1)) "abcde" = "bcdef")
and b1 = (candidate (fun i c -> char_of_int (i + int_of_char '0')) "abcde" = "01234")
in b0 && b1;;
Anton: In the first test, the function only uses the character, and in the second test, the function only uses the index of this character in the given string. Here we go:
let string_mapi f s =
String.map
(fun c -> let i = int_of_char c
in f i (String.get s i))
(String.init (String.length s) char_of_int);;
Alfrothul: Wow. You first create a prefix of the ASCII table, and then you map pointwise a function over this prefix. In effect, this function is given the index of the character it is applied to as well as this character.
Anton: Right. And then we can apply the given function to the index and to the corresponding character in the given string, as per String.mapi‘s mandate.
Dana: ...but only for strings up to 256 characters.
Mike Wazowski: We’re still working on it. It’s a work in progress.
Mimer: Mr. Wazowski, thanks for stopping by! Do you need ushers?
Anton (unfazed): Yes, only up to 256 characters:
let string_mapi f s =
let () = assert (String.length s <= 256) in
String.map
(fun c -> let i = int_of_char c
in f i (String.get s i))
(String.init (String.length s) char_of_int);;
Pablito: Well, this implementation passes the unit test.
The string is a stark data structure and everywhere it is passed there is much duplication of process.It is a perfect vehicle for hiding information.—Alan Perlis‘s programming epigram #34
Implement a function that indexes the ASCII table.
Halcyon: May I?
Bong-sun: Please, Halcyon.
Halcyon: <clickety clickety click>:
let char_index i =
char_of_int i;;
Bong-sun (laughing): Well done!
Halcyon: Thank you. It’s a basic exercise about what char_index does and how it does it.
Bong-sun: How about playing it safe, though:
let char_index' i =
let () = assert (0 <= i && i < 256) in
char_of_int i;;
Pablito: Well, char_of_int is going to check whether the index is valid, so Halcyon’s definition is already safe.
Bong-sun: True. But it is less practical, look:
# char_index (-1);;
Exception: Invalid_argument "char_of_int".
# char_index' (-1);;
Exception: Assert_failure ("//toplevel//", 153, 13).
#
Bong-sun (continuing): The error message for char_index' is more useful.
Halcyon: You are right, Bong-sun. The error message – such as it is – is more about what char_index does than about how it does it.
Mimer enthusiastically claps his hands. Anton wants to ask about how Mimer is clapping his hands exactly, but Alfrothul shushes him, so let’s move on to the next exercise.
Implement a function of type char -> char that maps a character to the next character in the ASCII table.
Pablito: Well, easy-peasy:
let char_next c =
...
Bong-sun: Just a sec.
Pablito: Yes, it will just take a second?
Bong-sun: Right. But before that second:
let test_char_next candidate =
let b0 = (candidate 'a' = 'b')
and b1 = (candidate '0' = '1')
in b0 && b1;;
Halcyon: Yes!
Pablito: OK OK:
let char_next c =
char_of_int (succ (int_of_char c));;
Halcyon: Let me check:
# test_char_next char_next;;
- : bool = true
#
Pablito: Yay.
Bong-sun: Wait.
Pablito: Yes?
Bong-sun: There are only 256 characters.
Pablito: Er... Yes?
Bong-sun: So what happens if we apply char_next to the last character?
Halcyon: Huh-oh:
# char_next '\255';;
Exception: Invalid_argument "char_of_int".
#
Bong-sun (practical): Well, what does OCaml do for integers?
Anton: What does OCaml do for integers.
Bong-sun: OCaml only represents finitely many integers, with max_int denoting the largest one and min_int the smallest one. And it kinds of roll over in that adding 1 to the largest integer yields the smallest one, look:
# succ max_int = min_int;;
- : bool = true
#
Loki: This is so cool.
Bong-sun: Maybe it is. But we could do the same for char_next and get serious about it:
let test_char_next candidate =
let b000 = (candidate '\000' = '\001')
and b048 = (candidate '0' = '1')
and b088 = (candidate 'X' = 'Y')
and b097 = (candidate 'a' = 'b')
and b254 = (candidate '\254' = '\255')
and b255 = (candidate '\255' = '\000')
in b000 && b048 && b088 && b097 && b254 && b255;;
Halcyon: Love the way you indexed the tests, Bong-sun.
Bong-sun: Thank you.
Pablito: Er... But how do we know whether a given character is the last one and what to do then?
Bong-sun (on a roll): With an if-expression, look:
let char_next c =
if c = '\255'
then '\000'
else char_of_int (succ (int_of_char c));;
Halcyon: Checking:
# test_char_next char_next;;
- : bool = true
#
Pablito: But what if the character is greater than '\255'?
Bong-sun: That’s not possible since the type of char_next is char -> char and characters are elements of the ASCII table, which has 256 entries.
Pablito: OK...
Implement a function of type char -> char that maps a character to the previous character in the ASCII table.
Integers can be converted to strings (in base 10) and vice-versa:
# string_of_int 123;;
- : string = "123"
# int_of_string "123";;
- : int = 123
#
The function converting a string to an integer is a left inverse of the function converting an integer to a string:
# int_of_string (string_of_int 543210);;
- : int = 543210
# int_of_string "hello world";;
Exception: Failure "int_of_string".
#
Negative integers can also be formatted into a string, but on the occasion one must help the OCaml parser by wrapping them between parentheses or prefixing their sign with ~, so that the negative sign is not read as the infix subtraction operator, which gives rise to, ah, unintuitive error messages:
# int_of_string "-321";;
- : int = -321
# string_of_int -123;; (* this expression is read as subtracting 123 to string_of_int *)
Characters 0-13:
string_of_int -123;;
^^^^^^^^^^^^^
Error: This expression has type int -> string
but an expression was expected of type int
# string_of_int (-123);; (* fix: isolate the negative integer between cosy parentheses *)
- : string = "-123"
# string_of_int ~-123;; (* alternative fix: mark the negative sign as such *)
- : string = "-123"
#
To generalize string_of_int, OCaml offers a library function to format strings, Printf.sprintf, that returns a formatted string. To a first approximation, its first argument is the string to print out (so there is nothing to format here):
# Printf.sprintf "Hello world";;
- : string = "Hello world"
#
This argument string is actually a control directive that determines the behavior of Printf.sprintf: if, for example, it contains an occurrence of %i, then Printf.sprintf expects not just the control string, but an integer too – in fact, as many integers as occurrences of %i in the control string. These integers are formatted into part of the resulting string:
# Printf.sprintf "We are in %ind Street." 42;;
- : string = "We are in 42nd Street."
# Printf.sprintf "We stand at the corner of %ind Street and %ith Avenue." 42 5;;
- : string = "We stand at the corner of 42nd Street and 5th Avenue."
#
Here, the control directive %i specifies that an integer is expected. Other control directives exist, e.g., %s for strings and %B for Booleans:
# Printf.sprintf "Is %s %B?\n" "that" true;;
- : string = "Is that true?\n"
#
where '\n' is a newline character.
There is more information in the OCaml manual.
OCaml also offers a library function to format strings and print them out, Printf.printf, that returns the unit-type value. The only difference between Printf.sprintf and Printf.printf is that one returns the formatted string whereas the other prints it out and returns the unit-type value.
To wit:
# Printf.sprintf "Hello\nworld\n";;
- : string = "Hello\nworld\n"
# Printf.printf "Hello\nworld\n";;
Hello
world
- : unit = ()
#
where we see the newline character at work, twice, in that printing it issues a new line in the output:
# Printf.printf "\n";;
- : unit = ()
#
Food for thought:
Getting back to Printf.sprintf and Printf.printf, the following expressions are equivalent:
where “equivalent” means “yield the same result with the same observable effects.” To wit:
# Printf.printf "Hello world\n";;
Hello world
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "Hello world\n");;
Hello world
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "%s" (Printf.sprintf "Hello world\n"));;
Hello world
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "%s" (Printf.sprintf "%s" (Printf.sprintf "Hello world\n")));;
Hello world
- : unit = ()
# Printf.printf "We are in %i%s Street.\n" 42 "nd";;
We are in 42nd Street.
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "We are in %i%s Street.\n" 42 "nd");;
We are in 42nd Street.
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "%s" (Printf.sprintf "We are in %i%s Street.\n" 42 "nd"));;
We are in 42nd Street.
- : unit = ()
# Printf.printf "%s" (Printf.sprintf "%s" (Printf.sprintf "%s" (Printf.sprintf "We are in %i%s Street.\n" 42 "nd")));;
We are in 42nd Street.
- : unit = ()
#
Added Exercise 07 [12 Mar 2023]
Added Exercise 05, Exercise 06, and their solutions [12 Mar 2023]
Expanded the aftermath about the ASCII table [12 Feb 2023]
Created [10 Jan 2023]