Kleene’s S-m-n theorem

Programs hacking programs. Why?

Thomas A. Anderson

So program processors are programs that process other programs: an interpreter runs a program, and a compiler translates a source program into a target program. Are there other such programs that process other programs?

Exercise 01

Analyze the quip “One person’s program is another program’s data.”, from its origins (“Quod ali cibus est aliis fuat acre venenum.” and “One man’s meat is another man’s poison.”) to its significance here.

Programs and the function they compute

If a program p written in L computes a function f, then p and f are algebraically related using an interpreter for L (let’s name it run):

run p = f
Neil Jones: Yes.
Mimer: Prof. Jones, thanks for coming back!

Then we say that f is computable.

Conversely, if a function is computable, then a program exists that computes it.

The program is intensional, i.e., about the “how” of the computation, and the function is extensional, i.e., about the “what” of the computation. (We have already met this “what” (extension) vs. “how” (intension) in Week 01, when interpreting an interpreter and when interpreting a compiler.)

For example, consider the factorial function that maps 0 to 1 and that maps a positive integer n to 1 * 2 * ... * (n - 1) * n. The factorial function is computable: we can (and we will) implement it, i.e., we will write a program that computes the factorial function. Let fac denote this program. Assuming this program to be sound and complete, running it on the input 5 should yield 120 since 5! = 120:

run fac 5 = 120
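To make this concrete, here is one possible fac, sketched in Python (Python serves only as an illustration language here; the note's equations are language-agnostic):

```python
def fac(n):
    # computes n! = 1 * 2 * ... * (n - 1) * n, with fac(0) = 1
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(fac(5))  # → 120
```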

The meaning of a program is the function computed by this program.

Not all functions, however, are computable: no programs exist that compute them. These functions are uncomputable: we cannot implement them.

For example, consider the halting predicate (i.e., a function that returns a Boolean) that, given a program and its input, returns true if running this program on this input terminates and false if it doesn’t. This function is uncomputable: we cannot implement it.

Interlude about the halting predicate

Anton: Uncomputable?

Alfrothul: That’s what was said.

Anton: How so?

Alfrothul: Good question.

Dana: Well, the existence of a halting predicate would entail a paradox?

Alfrothul: OK, I remember Cyrano de Bergerac and Pinocchio (Exercise 24 in Week 01). Paradox in what sense?

Dana: Well, suppose that this program exists.

Anton: A program that computes the halting predicate, yes.

Dana: And suppose that a diverging subprogram exists: executing it does not terminate.

Anton: OK.

Dana: Now imagine the following computation. Given a program and its input, test whether running this program on this input terminates, using the halting predicate. If it does, then execute the diverging subprogram; and otherwise, terminate.

Alfrothul: OK.

Dana: Does this computation terminate?

Anton: Well, there are two possibilities: either running the given program on the given input terminates, or it doesn’t.

Alfrothul: First case: running the given program terminates. Then we execute the diverging subprogram. OK, in that case the computation doesn’t terminate since executing the diverging subprogram does not terminate.

Pablito: Second case: running the given program does not terminate. Then we terminate. OK, in that case the computation terminates.

Anton: To sum up, if running the given program on the given input terminates, then the computation does not terminate, and if running the given program on the given input does not terminate, then the computation terminates. It feels like a big negation.

Dana (steadfast): Maybe it does. Now consider the program implementing this computation.

Anton: Aha—this program is given another program and its input.

Dana: Right. Given this other program and its input, running this other program on its input achieves the computation that you just characterized as a big negation. Now let’s say that this other program is this program, i.e., let’s apply this program to itself, that is to say, to a copy of itself.

Halcyon (sententiously): A process known as diagonalization.

Pablito (excitedly): clickety clickety click. Guys, did you see that the Wikipedia entries Diagonal argument and Diagonal argument (disambiguation) point to each other?

Dana (undaunted): Of course they do. But the question here is whether applying the program to itself terminates.

Anton: Well, we face the same two possibilities: either running the given program—let me still call it the given program, even though it is the same program—on its input terminates or it doesn’t.

Alfrothul: First case: running the given program on the given input terminates. As we have seen above, the computation doesn’t terminate, and therefore running the given program on the given input doesn’t terminate since this program implements the computation.

Pablito: Second case: running the given program on the given input does not terminate. As we have seen above, the computation terminates, and therefore running the given program on the given input terminates since this program implements the computation.

Alfrothul: OK, a paradox, since the given program is the program. If running it terminates, then running it doesn’t terminate, and if running it doesn’t terminate, then running it terminates.

Dana: Right.

Pinocchio: AT-CHOO!

Dana (undeterred): So the halting predicate is uncomputable.

Mimer: Put otherwise, the halting problem is undecidable: given an arbitrary program and its input, we cannot determine whether executing this program on this input terminates.

Cyrano: Thank you.
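The diagonal construction that the discussion walks through can be sketched in Python. The candidate halting predicate below is hypothetical (the argument shows that no correct one can exist); the sketch only demonstrates how any given candidate is contradicted on the self-application:

```python
def make_paradox(halts):
    # given a candidate halting predicate, build the "big negation"
    def paradox(p):
        if halts(p, p):     # claimed to terminate?
            while True:     # ...then diverge
                pass
        return "halted"     # ...otherwise terminate
    return paradox

# a candidate that answers False on the self-application:
def candidate_halts(p, x):
    return False

paradox = make_paradox(candidate_halts)
# candidate_halts claims paradox(paradox) diverges, yet it halts:
print(paradox(paradox))  # → halted
```

A candidate answering True on the self-application would make paradox(paradox) loop forever, contradicting it the other way. Either way, the candidate is wrong.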

Aftermath

Anton: I guess this answers the question.

Alfrothul: Sorry. What answers which question?

Anton: The question as to whether there are programs that process other programs and that are not interpreters or compilers.

Alfrothul: Oh, you mean the program that implements the halting predicate?

Anton: Yes: this halting program is given another program, to test whether this other program terminates.

Alfrothul: But the halting program doesn't exist; we just showed that.

Anton: Harrumph.

Kleene’s S-m-n function

Consider a program p that has m + n input parameters. Given m arguments, v1, v2, ..., and vm, and given n arguments, w1, w2, ..., and wn, running p on these m + n arguments returns a result:

run p (v1, v2, ..., vm, w1, w2, ..., wn) = result

Kleene's S-m-n function (written S-m-n), given p and the m first arguments, produces a program q that, given the remaining n arguments, returns the same result as the original program over the m + n arguments:

S-m-n (p, (v1, v2, ..., vm)) = q
    run q (w1, w2, ..., wn)  = result

A few remarks:

  • p is often said to be the source program, and q the residual program.

  • q is a version of p that is specialized with respect to the m first arguments. As such, the S-m-n function specializes p with respect to the m first arguments.

  • If m is 0, then q is p:

               S-0-n (p, ()) = q
    run q (w1, w2, ..., wn)  = result
    
  • If n is 0, then q is the constant program that always returns result:

    S-m-0 (p, (v1, v2, ..., vm)) = q
                       run q ()  = result
    

The S-m-n function is appealing because a residual program may not only be a specialized version of a source program, it may also be a simplified version of it. For example, suppose (1) that the source program has 2 + 1 = 3 input parameters, (2) that it computes the expression (x1 + x2) + x3, where x1, x2, and x3 are the names of the input parameters, and (3) that we want to specialize this source program with respect to the two arguments 1 and 10 using the S-2-1 function. Since x1 denotes 1 and x2 denotes 10, the expression (x1 + x2) + x3 reads (1 + 10) + x3, which can be simplified to 11 + x3. So the residual program has one input parameter, x3, and computes the expression 11 + x3, where 11 is the result of simplifying 1 + 10. In the extreme case where m = 3 and n = 0, the entire expression (x1 + x2) + x3 can be simplified into the result. The residual program is then a lot more efficient than the source program since it performs no computations at all: it just returns the result.

For a more sizeable example, say that p is a program with 3 input parameters that are named x1, x2, and x3, and say that this program computes the expression 0 + (1 + x1) + (x2 + 1000) + x3. Executing this program on 3 arguments – let us say, 10, 100, and 10000 – induces the computation of 0 + (1 + 10) + (100 + 1000) + 10000 and yields 11111:

run p (10, 100, 10000) = 11111

  • Say that we specialize p with respect to 0 arguments:

    S-0-3 (p, ()) = q_3
    

    The residual program, q_3, is a program with 3 input parameters, x1, x2, and x3, that computes (1 + x1) + (x2 + 1000) + x3, assuming that the addition with 0 has been simplified away. Executing this program on 3 arguments – 10, 100, and 10000 – induces the computation of (1 + 10) + (100 + 1000) + 10000 and yields 11111:

    run q_3 (10, 100, 10000) = 11111
    
  • Say that we specialize p with respect to 1 argument – 10:

    S-1-2 (p, (10)) = q_2
    

    The residual program, q_2, is a program with 2 input parameters, x2 and x3, that computes 11 + (x2 + 1000) + x3, assuming that 0 + (1 + 10) has been simplified to 11. Executing this program on 2 arguments – 100 and 10000 – induces the computation of 11 + (100 + 1000) + 10000 and yields 11111:

    run q_2 (100, 10000) = 11111
    
  • Say that we specialize p with respect to 2 arguments – 10 and 100:

    S-2-1 (p, (10, 100)) = q_1
    

    The residual program, q_1, is a program with 1 input parameter, x3, that computes 1111 + x3, assuming that 0 + (1 + 10) + (100 + 1000) has been simplified to 1111. Executing this program on 1 argument – 10000 – induces the computation of 1111 + 10000 and yields 11111:

    run q_1 (10000) = 11111
    
  • Say that we specialize p with respect to 3 arguments – 10, 100 and 10000:

    S-3-0 (p, (10, 100, 10000)) = q_0
    

    The residual program, q_0, is a program with 0 input parameters that computes 11111, assuming that 0 + (1 + 10) + (100 + 1000) + 10000 has been simplified to 11111. Executing this program on 0 arguments induces the computation of 11111 and yields 11111:

    run q_0 () = 11111
    

In general, though, m and n are not 0, but the idea still applies. In the rest of this lecture note, and without loss of generality, we let m and n be 1, for simplicity. In other words, from here on, we use the S-1-1 function.
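The arithmetic walkthrough above can be reproduced with a small specializer, sketched here in Python under assumed conventions: an expression is an integer literal, a variable name (a string), or a tuple ('add', e1, e2); specialization substitutes the known arguments and then folds constants and additions with 0. All names are hypothetical:

```python
def simplify(e):
    # constant-fold additions and erase additions with 0
    if isinstance(e, tuple):
        _, a, b = e
        a, b = simplify(a), simplify(b)
        if isinstance(a, int) and isinstance(b, int):
            return a + b            # fold constants
        if a == 0:
            return b                # 0 + e = e
        if b == 0:
            return a                # e + 0 = e
        return ('add', a, b)
    return e

def specialize(expr, env):
    # a tiny S-m-n over expressions: substitute the known
    # arguments given in env, then simplify the residue
    def subst(e):
        if isinstance(e, tuple):
            return ('add', subst(e[1]), subst(e[2]))
        if isinstance(e, str):
            return env.get(e, e)
        return e
    return simplify(subst(expr))

# 0 + (1 + x1) + (x2 + 1000) + x3, associated to the left:
p = ('add', ('add', ('add', 0, ('add', 1, 'x1')),
                    ('add', 'x2', 1000)),
            'x3')

print(specialize(p, {'x1': 10}))
# → ('add', ('add', 11, ('add', 'x2', 1000)), 'x3'), i.e., 11 + (x2 + 1000) + x3
print(specialize(p, {'x1': 10, 'x2': 100}))
# → ('add', 1111, 'x3'), i.e., 1111 + x3
print(specialize(p, {'x1': 10, 'x2': 100, 'x3': 10000}))
# → 11111
```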

The characteristic equations then read:

run p (v, w) = result
S-1-1 (p, v) = p_v
   run p_v w = result

where we wrote p_v instead of q to convey that the residual program (i.e., the result of S-1-1) is a version of p that is specialized with respect to v.

Say we need to carry out the following runs:

run p (v, w1)
run p (v, w2)
run p (v, w3)

where w1, w2, and w3 are 3 distinct arguments. During these runs, all of the computations of p are carried out, both those that depend on its first argument and those that involve the second argument. Alternatively, let us first specialize p with respect to v:

S-1-1 (p, v) = p_v

The following runs then proceed as follows:

run p_v w1
run p_v w2
run p_v w3

Assume that S-1-1 has carried out all the computations of p that depend on its first argument. These computations are carried out once, and in the subsequent runs, only the remaining computations of p (those that involve the second argument) are carried out: p_v is therefore more efficient than p since to obtain the same result, it carries out fewer computations:

run p_v w1 = run p (v, w1)
run p_v w2 = run p (v, w2)
run p_v w3 = run p (v, w3)
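A minimal S-1-1 can be sketched with closures (Python, hypothetical names). One hedge: this trivial version performs no computation at specialization time, so it illustrates the equations but not the speedup:

```python
def s_1_1(p, v):
    # trivial S-1-1: the residual program p_v fixes p's first argument
    def p_v(w):
        return p(v, w)
    return p_v

def p(v, w):                    # a stand-in two-argument program
    return (v + 1) * w

p_v = s_1_1(p, 5)               # specialize once...
print(p_v(1), p_v(2), p_v(3))   # ...run many times → 6 12 18
```

A non-trivial S-1-1 would, as assumed in the text above, carry out at specialization time all the computations of p that depend on v (here, the addition v + 1), leaving only the multiplication in the residual program.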

Here are some concrete examples:

  • A cookbook

    A cookbook contains many recipes. The recipe for omelets, for example, says that the cook needs 2 eggs per person.

    An ordinary editor would publish this book, e.g., with the title “Good recipes for the whole family”.

    In addition, an enterprising editor would:

    • specialize the recipes from being for an arbitrary number of people to being just for two persons; the specialized recipe for omelets, for example, would say that the cook needs 4 eggs; and
    • collect these specialized recipes in a book and publish it with the title “Happy recipes for you and me”.

    This enterprising editor could also:

    • specialize the recipes from being for an arbitrary number of people to being just for one person; the specialized recipe for omelets, for example, would say that the cook needs 2 eggs; and
    • collect these specialized recipes in a book and publish it with the title “Lonely but healthy recipes”.

    This enterprising editor could equally well:

    • specialize selected recipes to using a reduced number of dishes; and
    • collect these specialized recipes in a book and publish it with the title “Healthy recipes for students”, on the ground that students typically live in small places. (The title says “Healthy” so that the parents of the said students buy this book as a parting gift for their progeny as it leaves home for college.)

But why stop there? Why not select recipes that don’t involve any cooking and publish several books (a general one, one for the family, one for you and me, one for single people, one for students) with the titles “The book of healthy salads”, “Healthy salads for the whole family”, “Healthy salads for you and me”, “Lonely but healthy salads”, and “Healthy salads for healthy students”, for example. Also, why not divide the recipes by seasonal ingredients and publish 4 books, nay, a multiple of 4 books: “Healthy salads in the spring”, “Summer dreams of cooking for you and me”, “Autumn leaves, let’s eat well”, “Winter recipes for the healthy student”, etc.

    Truly, the S-m-n function offers a wealth of opportunities. And if the enterprising editor knows how to use Aristotle’s 4 causes in the titles, well, the sky is the limit.

Halcyon: A lot?
Pablito: A lot.
Halcyon: OK.

  • Matrix multiplication

    A matrix aggregates a lot of numbers, and multiplying two matrices involves multiplying these numbers, like, a lot.

    Say that matrix_multiply is a program that multiplies two matrices and yields a resulting matrix:

    run matrix_multiply (M1, M2) = M3
    

    A hollow matrix is a matrix containing many 0’s. Since 0 is absorbing for multiplication (i.e., since for any n, 0 * n = 0 and n * 0 = 0), multiplying a hollow matrix with another matrix involves many multiplications whose result is known in advance to be 0. Specializing a matrix-multiplication program with respect to a hollow matrix is therefore likely to yield a specialized program that only performs the computations that do not involve 0 over the other matrix, i.e., a more efficient program.

    So let us instantiate the characteristic equations:

    run p (v, w) = result
    S-1-1 (p, v) = p_v
       run p_v w = result
    

    with

    • matrix_multiply for p,
    • M1 for v,
    • M2 for w, and
    • M3 for result.

    The result reads:

    run matrix_multiply (M1, M2) = M3
     S-1-1 (matrix_multiply, M1) = matrix_multiply_M1
      run matrix_multiply_M1 M2  = M3
    

    The specialized program matrix_multiply_M1 is a matrix multiplier dedicated to M1.

  • Parsing

    Given a grammar g and a text t, a general parser GP decides whether the text is grammatically correct:

    run GP (g, t) = result
    

    Typically, we have many texts to parse given the same grammar, and presumably a lot of computations in GP depend on this grammar. Ideally, specializing a general parser with respect to a grammar gives a specialized program that only performs the computations that involve the text to parse.

    So let us instantiate the characteristic equations:

    run p (v, w) = result
    S-1-1 (p, v) = p_v
       run p_v w = result
    

    with

    • GP for p,
    • g for v,
    • t for w, and
    • result for result.

    The result reads:

    run GP (g, t) = result
    S-1-1 (GP, g) = GP_g
      run GP_g t  = result
    

    The specialized program GP_g is a parser dedicated to g.

  • Unparsing

    Given a parsed text pt and a directive d to map this parsed text to a textual representation, we implement an unparser:

    run unparse (d, pt) = text
    

    (In a programming language such as C, this unparser is named sprintf.)

    Typically, we have many parsed texts to unparse given the same directive, and presumably a lot of computations in unparse depend on this directive. Ideally, specializing an unparser with respect to a directive gives a specialized program that only performs the computations that involve the parsed text to unparse.

    So let us instantiate the characteristic equations:

    run p (v, w) = result
    S-1-1 (p, v) = p_v
       run p_v w = result
    

    with

    • unparse for p,
    • d for v,
    • pt for w, and
    • text for result.

    The result reads:

    run unparse (d, pt) = text
     S-1-1 (unparse, d) = unparse_d
       run unparse_d pt = text
    

    The specialized program is an unparser dedicated to d.
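As a hedged sketch of the matrix-multiplication example above (Python, hypothetical names): specializing with respect to a hollow M1 can precompute M1's nonzero entries, so that the residual program skips every multiplication by 0:

```python
def matrix_multiply(M1, M2):
    # reference implementation; assumes len(M1[0]) == len(M2)
    cols = len(M2[0])
    return [[sum(M1[i][j] * M2[j][c] for j in range(len(M2)))
             for c in range(cols)]
            for i in range(len(M1))]

def specialize_matrix_multiply(M1):
    # S-1-1 for matrix_multiply: precompute M1's nonzero entries,
    # so the residual program only multiplies by them
    nonzero = [[(j, a) for j, a in enumerate(row) if a != 0]
               for row in M1]
    def matrix_multiply_M1(M2):
        cols = len(M2[0])
        return [[sum(a * M2[j][c] for j, a in nz) for c in range(cols)]
                for nz in nonzero]
    return matrix_multiply_M1

M1 = [[0, 2], [0, 0]]                 # a hollow matrix
M2 = [[5, 6], [7, 8]]
matrix_multiply_M1 = specialize_matrix_multiply(M1)
print(matrix_multiply_M1(M2) == matrix_multiply(M1, M2))  # → True
```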

In programming,
everything we do is a special case of something more general,
– and often we know it too quickly.

Alan Perlis’s programming epigram #30

Interlude

Anton (wondering): Wouldn’t it be nice if the S-m-n function were computable?

Neil Jones (enjoying himself): It sure would.

I-diagrams, revisited

So an interpreter is a program for running another program. For example, let’s say we are given:

  1. an x86 microprocessor, denoted run^x86 and depicted as follows:

    _images/ditaa-2c051e6667d18dbf7c52db58b6907e2a6123147d.png
  2. a program that computes the factorial function and that is written in x86, denoted by fac_x86 and depicted as follows:

    _images/ditaa-6aa1d03be5a3d182bc4ecf7473e4ef6d6775a6b6.png

We depict the execution of fac_x86 by putting the two pictures on top of each other:

_images/ditaa-fcdf6ea9c92d4b7993e98d37aa8547af094dbbf1.png

And we depict the running of fac_x86 on 5:

run^x86 fac_x86 5

by superimposing 5 on fac_x86:

_images/ditaa-f0b4c942ea3b86229292914c1e576ac585d00054.png

This superimposition is consistent with interpreters executing other interpreters. For example, say we are given two more artifacts:

  • an interpreter for OCaml written in x86, denoted run^OCaml_x86:

    _images/ditaa-f3d9682c6a0e0888d250959896f4e6e2de99b901.png
  • a program that computes the factorial function and that is written in OCaml, denoted by fac_OCaml:

    _images/ditaa-6384d11fb5bad542bbbff075e0553707a108fef6.png

We depict the running of fac_OCaml on 5 by superimposing 5 on fac_OCaml, which is itself superimposed on run^OCaml_x86, which is itself superimposed on run^x86:

_images/ditaa-02d131978ef1cfb4ea3943923cb11771c518b200.png

Algebraically:

run^x86 run^OCaml_x86 fac_OCaml 5

As such, the following OCaml processor:

_images/ditaa-7080ca0615a1b673c5c50cbd78d8057cf8ea95c0.png

can be defined algebraically as:

run^OCaml = run^x86 run^OCaml_x86

Program equivalence

Two programs are equivalent when they mean the same. In the example just above, fac_OCaml and fac_x86 are equivalent because they both compute the factorial function. In other words, running fac_OCaml and fac_x86 on the same non-negative integer n yields the same result:

run^OCaml fac_OCaml n = run^x86 fac_x86 n

And a compiler is meaning preserving in that it translates a source program into an equivalent target program:

_images/ditaa-e653e6ff106c7b4c20282a2fd313bfb3b823a6a9.png

In this light, let us revisit the characteristic equations of the S-1-1 function:

run p (v, w) = result
S-1-1 (p, v) = p_v
   run p_v w = result

The first and the third equations say that given p, v, and w, we can equivalently run p on v and w or run p_v on w to obtain result.

Exercise 02

Given a source program source_program and its input input, an interpreter interpreter executes source_program on input:

run interpreter (source_program, input) = output

Typically, we run source_program on several inputs, not just one. So it is a good idea to specialize interpreter with respect to a given source_program.

To be concrete, say that interpreter is an interpreter for Python written in OCaml, and that the S-m-n function specializes a program written in OCaml into another program written in OCaml.

Questions:

  1. In which language is source_program written?
  2. Apply the S-1-1 function appropriately to specialize interpreter with respect to source_program, and characterize its result.
Yoshihiko Futamura: Good questions!
Mimer: Prof. Futamura, thanks for stopping by!

Hint towards solving Exercise 02

  1. You need to instantiate the characteristic equations:

    run p (v, w) = result
    S-1-1 (p, v) = p_v
       run p_v w = result
    

    with:

    • interpreter for p
    • source_program for v
    • input for w
    • output for result

And then the questions are:

  • in which language is interpreter_source_program written?
  • what is interpreter_source_program, i.e., what does it do when it is executed?
  • what is the relation between source_program and interpreter_source_program?

Kleene’s S-m-n theorem

Anton: Ha!

Kleene’s S-m-n theorem states that the S-m-n function is computable. Therefore a program exists that computes it. This program is traditionally referred to as a “program specializer” (for what it does) and as a “partial evaluator” (for how it does it, since it evaluates part of the source program to specialize it with respect to the given part of the input). Because of the latter point, its traditional name is PE:

run PE = S-m-n

So given a program p that has m + n input parameters and returns a result:

run p (v1, v2, ..., vm, w1, w2, ..., wn) = result

a partial evaluator, given p and the m first arguments, produces a program q that, given the remaining n arguments, returns the same result as the original program over the m + n arguments:

run PE (p, (v1, v2, ..., vm)) = q
     run q (w1, w2, ..., wn)  = result
Anton: So it looks like one program’s variable...
Alfrothul: ...is another program’s constant.
Alan Perlis: Sounds about right. Do you guys need a chairperson?
Mimer: Prof. Perlis, thanks for stopping by! And thanks for your epigrams too!
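A trivial PE computing the S-m-n function, for any m and n, can be sketched with closures in Python (hypothetical names; this residual program merely remembers the m first arguments rather than evaluating anything early, so it witnesses computability without performing any actual partial evaluation):

```python
def PE(p, vs):
    # run PE (p, (v1, ..., vm)) = q
    def q(*ws):
        # run q (w1, ..., wn) = run p (v1, ..., vm, w1, ..., wn)
        return p(*vs, *ws)
    return q

def p(x1, x2, x3):
    # the running example: 0 + (1 + x1) + (x2 + 1000) + x3
    return 0 + (1 + x1) + (x2 + 1000) + x3

q = PE(p, (10, 100))     # S-2-1
print(q(10000))          # → 11111
```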

Exercise 03

Write down the four causes of a partial evaluator:

  • what is a partial evaluator used for?
  • which data does it operate on?
  • which operations does it carry out on these data?
  • which methods does it use to carry out these operations?

Exercise 04

Partial evaluation is what a partial evaluator does.

Write down as many as you like of the four causes of partial evaluation, in the order of your choosing:

  • final cause
  • material cause
  • efficient cause
  • formal cause

How does your solution compare with the four causes of a partial evaluator, in Exercise 03?

Exercise 05

Revisit the four examples above (matrix multiplication, parsing, unparsing, and the interpreter exercise) using PE instead of S-1-1:

run PE = S-1-1

Partial solution for Exercise 05

The first step is to state the characteristic equations using PE instead of S-1-1:

 run p (v, w) = result
run PE (p, v) = p_v
    run p_v w = result

Food for thought

In the vein of I- and T-diagrams, and assuming that PE computes S-1-1, can you think of a diagram that would usefully depict PE?

Specializing the partial evaluator with respect to the program to specialize

Again for simplicity, in this section, PE computes the S-1-1 function.

Suppose that we are in a situation where we need to specialize the same program with respect to a variety of partial inputs:

run PE (p, v) = p_v

Then de facto we are going to run PE many times over p, an opportunity for the S-1-1 function. So let us instantiate the characteristic equations:

run p (v, w) = result
S-1-1 (p, v) = p_v
   run p_v w = result

with

  • PE for p,
  • p for v,
  • v for w, and
  • p_v for result.

The result reads:

run PE (p, v) = p_v
S-1-1 (PE, p) = PE_p
   run PE_p v = p_v

The specialized program PE_p is a program specializer that is dedicated to specializing p. And indeed running p_v on w gives the same result as running p on both v and w:

run p_v w  = run p (v, w)

Using PE instead of S-1-1:

 run PE (p, v) = p_v
run PE (PE, p) = PE_p
    run PE_p v = p_v
Halcyon: Diagonalization, here we come again.
Pablito: True that—the partial evaluator is operating on itself.
Alfrothul (picky): On a copy of itself.
Neil Jones: Well, that’s self-applicable partial evaluation for you.
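With a closure-based sketch of PE (Python, hypothetical names; a real partial evaluator manipulates program texts, not closures), this self-application can be played out directly:

```python
def PE(p, v):
    # a trivial program specializer (computes S-1-1)
    def p_v(w):
        return p(v, w)
    return p_v

def p(v, w):                     # a stand-in two-argument program
    return v * 10 + w

# run PE (PE, p) = PE_p : a specializer dedicated to specializing p
PE_p = PE(PE, p)
p_v = PE_p(3)                    # run PE_p v = p_v
print(p_v(4) == PE(p, 3)(4) == p(3, 4))  # → True
```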

Here are some concrete examples:

  • Matrix multiplication

    To specialize the program specializer with respect to a matrix-multiplication program, let us instantiate the characteristic equations:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • matrix_multiply for p and
    • M1 for v.

    The result reads:

    run PE (matrix_multiply, M1) = matrix_multiply_M1
    run PE (PE, matrix_multiply) = PE_matrix_multiply
       run PE_matrix_multiply M1 = matrix_multiply_M1
    

    This result could also be obtained as an instance of the characteristic equations:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • matrix_multiply for v,
    • M1 for w, and
    • matrix_multiply_M1 for result.

    The specialized program PE_matrix_multiply is a program specializer that is dedicated to specializing matrix_multiply. And indeed running matrix_multiply_M1 on M2 gives the same result as running matrix_multiply on both M1 and M2:

    run matrix_multiply_M1 M2 = run matrix_multiply (M1, M2)
    
  • Parsing

    To specialize the program specializer with respect to a general parser, let us instantiate the characteristic equations:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • GP for p and
    • g for v.

    The result reads:

     run PE (GP, g) = GP_g
    run PE (PE, GP) = PE_GP
        run PE_GP g = GP_g
    

    This result could also be obtained as an instance of the characteristic equations:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • GP for v,
    • g for w, and
    • GP_g for result.

    The specialized program PE_GP is a program specializer that is dedicated to specializing GP. And indeed running GP_g on t gives the same result as running GP on both g and t:

    run GP_g t = run GP (g, t)
    
  • Unparsing

    To specialize the program specializer with respect to an unparser, let us instantiate the characteristic equations:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • unparse for p and
    • d for v.

    The result reads:

     run PE (unparse, d) = unparse_d
    run PE (PE, unparse) = PE_unparse
        run PE_unparse d = unparse_d
    

    This result could also be obtained as an instance of the characteristic equations:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • unparse for v,
    • d for w, and
    • unparse_d for result.

    The specialized program PE_unparse is a program specializer that is dedicated to specializing unparse. And indeed running unparse_d on pt gives the same result as running unparse on both d and pt:

    run unparse_d pt = run unparse (d, pt)
    

Exercise 06

As a continuation of Exercise 02, specialize the program specializer with respect to interpreter, and characterize the resulting specialized program specializer: what does it do?

(To be concrete, say that PE is a program written in OCaml.)

Yoshihiko Futamura: Another good question.

Exercise 07

The goal of this exercise is to revisit the equation:

run PE (PE, p) = PE_p

in the particular case where p is PE itself, and to characterize the resulting program, PE_PE.

  1. What is the result of applying PE_PE to matrix_multiply?
  2. What is the result of applying PE_PE to GP?
  3. What is the result of applying PE_PE to unparse?
  4. What is the result of applying PE_PE to interpreter, as a continuation of Exercise 02 and Exercise 06?
  5. What is the result of applying PE_PE to PE?
Yoshihiko Futamura: More good questions.

Partial solution for Exercise 07

If p is PE itself, the equation reads:

run PE (PE, PE) = PE_PE

  1. This case reads:

    run PE_PE matrix_multiply = result
    

    which is an instance of the third characteristic equation:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • PE for v, and
    • matrix_multiply for w

    and where we are asked to describe what result denotes.

    Quoting the above:

    The specialized program PE_p is a program specializer that is dedicated to specializing p. And indeed running p_v on w gives the same result as running p on both v and w:

    run p_v w  = run p (v, w)
    

    Let us instantiate this quote with PE instead of p, PE instead of v, and matrix_multiply instead of w:

    run PE_PE matrix_multiply = run PE (PE, matrix_multiply)
    

    As established above:

    run PE (PE, matrix_multiply) = PE_matrix_multiply
    

    where the specialized program PE_matrix_multiply is a program specializer that is dedicated to specializing the matrix-multiplication program.

    So we conclude that PE_PE maps matrix_multiply to a program specializer that is dedicated to specializing matrix_multiply:

    run PE_PE matrix_multiply = PE_matrix_multiply
    
    A simpler argument.

    This case reads:

    run PE_PE matrix_multiply = result
    

    which is an instance of the third characteristic equation:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • PE for p and
    • matrix_multiply for v

    and where we are asked to describe what p_v denotes.

    Substituting PE for p and matrix_multiply for v, the third characteristic equation reads:

    run PE_PE matrix_multiply = PE_matrix_multiply
    

    As elaborated above, PE_matrix_multiply is a program specializer that is dedicated to specializing the matrix-multiplication program.

    So we conclude that PE_PE maps matrix_multiply to a program specializer that is dedicated to specializing matrix_multiply.

  2. This case reads:

    run PE_PE GP = result
    

    which is an instance of the third characteristic equation:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • PE for v, and
    • GP for w

    and where we are asked to describe what result denotes.

    Quoting the above:

    The specialized program PE_p is a program specializer that is dedicated to specializing p. And indeed running p_v on w gives the same result as running p on both v and w:

    run p_v w  = run p (v, w)
    

    Let us instantiate this quote with PE instead of p, PE instead of v, and GP instead of w:

    run PE_PE GP = run PE (PE, GP)
    

    As established above:

    run PE (PE, GP) = PE_GP
    

    where the specialized program PE_GP is a program specializer that is dedicated to specializing the general parser.

    So we conclude that PE_PE maps GP to a program specializer that is dedicated to specializing GP:

    run PE_PE GP = PE_GP
    
    A simpler argument.

    This case reads:

    run PE_PE GP = result
    

    which is an instance of the third characteristic equation:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • PE for p and
    • GP for v

    and where we are asked to describe what p_v denotes.

    Substituting PE for p and GP for v, the third characteristic equation reads:

    run PE_PE GP = PE_GP
    

    As elaborated above, PE_GP is a program specializer that is dedicated to specializing the general parser.

    So we conclude that PE_PE maps GP to a program specializer that is dedicated to specializing GP.

  3. This case reads:

    run PE_PE unparse = result
    

    It is an instance of the third characteristic equation:

     run p (v, w) = result
    run PE (p, v) = p_v
        run p_v w = result
    

    with

    • PE for p,
    • PE for v, and
    • unparse for w

    and where we are asked to describe what result denotes.

    Quoting the above:

    The specialized program PE_p is a program specializer that is dedicated to specializing p. And indeed running p_v on w gives the same result as running p on both v and w:

    run p_v w  = run p (v, w)
    

    Let us instantiate this quote with PE instead of p, PE instead of v, and unparse instead of w:

    run PE_PE unparse = run PE (PE, unparse)
    

    As established above:

    run PE (PE, unparse) = PE_unparse
    

    where the specialized program PE_unparse is a program specializer that is dedicated to specializing the unparser.

    So we conclude that PE_PE maps unparse to a program specializer that is dedicated to specializing unparse:

    run PE_PE unparse = PE_unparse
    
    A simpler argument.

    This case reads:

    run PE_PE unparse = result
    

    which is an instance of the third characteristic equation:

     run PE (p, v) = p_v
    run PE (PE, p) = PE_p
        run PE_p v = p_v
    

    with

    • PE for p and
    • unparse for v

    and where we are asked to describe what p_v denotes.

    Substituting PE for p and unparse for v, the third characteristic equation reads:

    run PE_PE unparse = PE_unparse
    

    As elaborated above, PE_unparse is a program specializer that is dedicated to specializing the unparser.

    So we conclude that PE_PE maps unparse to a program specializer that is dedicated to specializing unparse.

Postlude

Loki: A partial solution about a partial evaluator. I see.

Mimer: Harrumph.

Version

Mentioned soundness and completeness more prominently [21 Jan 2023]

Created [10 Jan 2023]