Three language processors for arithmetic expressions

The goal of this project is to formalize an interpreter for arithmetic expressions, a compiler from arithmetic expressions to byte code, and an interpreter for byte code (i.e., a virtual machine), and to prove that for any given arithmetic expression,

  • interpreting this arithmetic expression and
  • compiling this arithmetic expression and then running the resulting byte-code program

yield the same result, be it a natural number or an error message.

So the primary goal of this term project is to prove the following theorem:

Theorem the_commuting_diagram :
  forall sp : source_program,
    interpret sp = run (compile sp).

for

  • a source language of arithmetic expressions:

    Inductive arithmetic_expression : Type :=
      Literal : nat -> arithmetic_expression
    | Plus : arithmetic_expression -> arithmetic_expression -> arithmetic_expression
    | Minus : arithmetic_expression -> arithmetic_expression -> arithmetic_expression.
    
    Inductive source_program : Type :=
      Source_program : arithmetic_expression -> source_program.
    
  • a target language of byte-code instructions:

    Inductive byte_code_instruction : Type :=
      PUSH : nat -> byte_code_instruction
    | ADD : byte_code_instruction
    | SUB : byte_code_instruction.
    
    Inductive target_program : Type :=
      Target_program : list byte_code_instruction -> target_program.
    

    (N.B.: For simplicity, we shall use the infix notation ++ for the list concatenation that comes with the List library.)

  • a semantics of expressible values:

    Inductive expressible_value : Type :=
      Expressible_nat : nat -> expressible_value
    | Expressible_msg : string -> expressible_value.
    
  • a source interpreter:

    interpret : source_program -> expressible_value
    
  • a compiler:

    compile : source_program -> target_program
    
  • a target interpreter (i.e., a virtual machine):

    run : target_program -> expressible_value
    

The source for errors is subtraction, since subtracting two natural numbers does not always yield a natural number: for example, 3 - 2 is defined but not 2 - 3.

You are expected, at the very least:

  • to implement a source interpreter and to verify that it satisfies its specification,
  • to implement a target interpreter and to verify that it satisfies its specification,
  • to implement a compiler and to verify that it satisfies its specification,
  • to prove that the diagram commutes, i.e., to show that interpreting any given expression gives the same result as compiling this expression and then running the resulting compiled program

Beyond this absolute minimum, in decreasing importance, it would be good:

  • to write an accumulator-based compiler and to prove that it satisfies the specification,
  • to investigate byte-code verification,
  • to prove that each of the specifications specifies a unique function and
  • to make a copy of the above in a separate file and modify it mutatis mutandis so that the three language processors operate from right to left instead of from left to right.

Also, feel free to expand the source language and the target language, e.g., with multiplication, quotient, and modulo.

Resources

Version

Created [12 Apr 2024]

Table Of Contents

Previous topic

Expectations about the term project

Next topic

Tools