Due: Monday, Nov 25th by 11:59 PM

Getting Started

Download CS340_Assign07.zip. Unzip it on a computer that has both Ruby and Scala installed. You should see a directory called CS340_Assign07.

Copy all of the Ruby files from CS340_Assign05 (the AST builder assignment) into this directory.

Your Task

In Assignment 4 you implemented a parser for a small programming language. Let's call this language MiniLang.

Your task in this assignment is to implement a compiler that translates MiniLang programs to instructions for a virtual machine called MiniVM. MiniVM is a stack-based virtual machine, and is very much like a small subset of the Java Virtual Machine.

The compiled form of a MiniLang program should be a MiniVM program that prints the value of the expression in the last statement in the program. (You may assume that the last statement will be an expression statement.)

MiniVM

MiniVM is described by the following documents:

Documentation.md - discusses the general execution model of MiniVM

InstructionSet.md - describes the instructions supported by MiniVM

Another useful form of documentation is the MiniVM test programs. These are included in the testprogs subdirectory of the project. A good way to see how the test programs work is to execute them in interactive mode. For example:

./MiniVM.rb -x -i testprogs/add.mvm

When you run a MiniVM program in interactive mode, you will see the program's assembly instructions (with the next instruction to be executed marked with an arrow), the current stack contents, and the output that the program has produced. For example, if you run the testprogs/add.mvm program with the command above, the first screen of output will be

==> 000 enter 0, 0
    001 ldc_i 4
    002 ldc_i 5
    003 add
    004 syscall $println
    005 pop
    006 ldc_i 0
    007 ret

Current stack:                  Output:
-1

Each time you press the Enter key, the program will advance by one instruction. For example, pressing Enter for this program will show the next program state:

    000 enter 0, 0
==> 001 ldc_i 4
    002 ldc_i 5
    003 add
    004 syscall $println
    005 pop
    006 ldc_i 0
    007 ret

Current stack:                  Output:
-1

Pressing Enter one more time:

    000 enter 0, 0
    001 ldc_i 4
==> 002 ldc_i 5
    003 add
    004 syscall $println
    005 pop
    006 ldc_i 0
    007 ret

Current stack:                  Output:
4
-1

Note that the ldc_i instruction (at code address 001) pushed the value 4 onto the stack.
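To make the stack behavior concrete, here is a toy model in Scala of the two instructions traced above. It is only a sketch, not MiniVM's implementation: the names StackSketch, Instr, LdcI, Add, step, and run are all hypothetical, and it assumes that add pops two operands and pushes their sum, as in a typical stack machine. It also ignores the -1 entry that the interactive display shows at the bottom of the stack.

```scala
object StackSketch {
  // Hypothetical instruction model covering only ldc_i and add.
  sealed trait Instr
  case class LdcI(value: Int) extends Instr
  case object Add extends Instr

  // The stack is a List whose head is the top of the stack.
  def step(stack: List[Int], instr: Instr): List[Int] = instr match {
    case LdcI(v) => v :: stack // push a constant
    case Add => stack match {
      // Assumption: add pops the right and left operands and pushes their sum.
      case right :: left :: rest => (left + right) :: rest
      case _ => throw new IllegalStateException("add needs two operands")
    }
  }

  // Run a whole program starting from an empty stack.
  def run(program: List[Instr]): List[Int] =
    program.foldLeft(List.empty[Int])(step)
}
```

Under these assumptions, running LdcI(4), LdcI(5), Add leaves a single value, 9, on the stack, which is the value that the syscall $println instruction in add.mvm would then print.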

Implementing a Compiler

You will implement the Compiler in the Compiler.scala source file.

The code already provided will read an AST produced by your ASTBuilder class. Here is how you will run your compiler. First, compile the compiler and its required classes using the scalac command:

scalac *.scala

Then, use the ASTEncoderTest.rb program to parse the MiniLang program and generate an AST, and the Compiler program to translate the AST into an executable MiniVM program:

(cat progname.mlang | ruby ASTEncoderTest.rb | scala Compiler) > progname.mvm

The file 4times5.mlang is provided as an example program, and has the following contents:

var a;
var b;
a := 4;
b := 5;
a * b;

You can compile this to a MiniVM program using the command:

(cat 4times5.mlang | ruby ASTEncoderTest.rb | scala Compiler) > 4times5.mvm

My implementation of the Compiler program produced the following MiniVM instructions for 4times5.mlang:

main:
        enter 0, 2
        ldc_i 4
        dup
        stlocal 0
        pop
        ldc_i 5
        dup
        stlocal 1
        pop
        ldlocal 0
        ldlocal 1
        mul
        syscall $println
        ret

We can execute the program as follows:

./MiniVM.rb -x 4times5.mvm

Which produces the output:

20

Ok, WTF am I actually supposed to DO?

Hey, calm down!

Writing a compiler is pretty easy. You've already done a lot of the work by implementing a lexer, parser, and AST builder.

Here's the idea. Your compiler should translate each expression in the program into a sequence of instructions that, when executed, will compute the result of the expression and push the result on the stack.

This is surprisingly simple!

  • For literals (integer and string), emit an ldc_i or ldc_str.
  • For variable references (identifiers), emit an ldlocal instruction that pushes the value of the local variable corresponding to the identifier.
  • For binary operators other than op_assign, recursively generate code for the left and right subtrees. This will leave the computed left and right operands on the stack. Then, emit an add, sub, mul, div, or exp instruction to compute the overall result.
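  • For op_assign, recursively generate code for the right-hand subexpression. Find the name of the variable on the left-hand side of the assignment operator, and find which MiniVM local variable it corresponds to. Generate an stlocal instruction to store the computed result of the right-hand expression in the local variable. Note that in the example output for 4times5.mlang above, each stlocal is preceded by a dup, so that a copy of the assigned value stays on the stack as the result of the assignment expression.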
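The four cases above can be sketched in Scala roughly as follows. This is a sketch under stated assumptions, not the provided starter code: the AST case classes (IntLit, VarRef, BinOp, Assign) and the gen function are hypothetical stand-ins for the real Node class, BinOp is assumed to already carry the MiniVM mnemonic (add, sub, mul, div, or exp), and string literals are left out because the operand syntax for ldc_str is defined in InstructionSet.md. The sketch returns the instructions as a list instead of printing them, which makes it easier to check.

```scala
object ExprSketch {
  // Hypothetical AST shapes; the real Node class in Compiler.scala will differ.
  sealed trait Expr
  case class IntLit(value: Int) extends Expr
  case class VarRef(name: String) extends Expr
  case class BinOp(mnemonic: String, left: Expr, right: Expr) extends Expr
  case class Assign(name: String, rhs: Expr) extends Expr

  // Returns instructions that leave the value of e on top of the stack.
  // locals(name) throws NoSuchElementException for an undeclared variable.
  def gen(e: Expr, locals: Map[String, Int]): List[String] = e match {
    case IntLit(v) => List(s"ldc_i $v")
    case VarRef(name) => List(s"ldlocal ${locals(name)}")
    // Left operand first, then right operand, then the operator itself.
    case BinOp(mnemonic, l, r) => gen(l, locals) ++ gen(r, locals) ++ List(mnemonic)
    // dup keeps a copy as the expression's value, mirroring the example output.
    case Assign(name, rhs) => gen(rhs, locals) ++ List("dup", s"stlocal ${locals(name)}")
  }
}
```

For instance, with a mapped to 0 and b mapped to 1, gen for a * b yields ldlocal 0, ldlocal 1, mul, which matches the corresponding lines of the 4times5.mvm listing.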

Handling variables

Before your Compiler program generates code, it should scan the statement list for var_decl statements. It should build a map of variable names to integers, such that each variable is associated with a unique integer value starting at 0. This map can be used when generating code to find out the MiniVM local variable associated with each MiniLang variable in the original program.

The analyzeLocals method recursively builds this map. You will need to implement this method. The documentation for the Scala Map trait will be useful. (A trait in Scala is like an interface in Java: it describes methods that must be implemented by a class.)
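One way to think about the numbering is sketched below. It is not the analyzeLocals method itself: it assumes, hypothetically, that the declared variable names have already been collected from the var_decl statements into a list in program order, and the names LocalsSketch and numberLocals are made up for illustration. The idea is that the current size of the map is always the next unused local variable number.

```scala
object LocalsSketch {
  // Hypothetical input: names from var_decl statements, in program order.
  // Recursively extends the map, giving each new name the next free number.
  def numberLocals(declared: List[String],
                   locals: Map[String, Int] = Map.empty): Map[String, Int] =
    declared match {
      case Nil => locals
      case name :: rest =>
        // A repeated declaration keeps its original number.
        val next =
          if (locals.contains(name)) locals
          else locals + (name -> locals.size)
        numberLocals(rest, next)
    }
}
```

For the 4times5.mlang program, numberLocals(List("a", "b")) would produce a map in which a is 0 and b is 1.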

Generating code

The visit method does the work of translating the AST into MiniVM instructions. It takes two parameters:

  • n, a Node object, which represents an AST node
  • locals, the map of local variable names to integer local variable numbers

Pattern matching is used to generate code based on the type of AST node. A partial implementation is provided for statement_list nodes. You will need to add cases for the rest of your AST node types.

To generate a MiniVM instruction, just use the println function to print it.

Stack management

Each expression in the MiniLang program should push a result value on the stack. So, when you generate code for an expression statement, it should result in the value of the overall expression being pushed onto the stack.

The program will only use the value of the last expression statement. So, the values pushed by all expression statements except for the last one should be cleared from the stack using the pop instruction.

Also, the MiniVM program must end in a ret instruction, which consumes a value. Make sure there is a value left on the stack before the generated MiniVM program executes the ret instruction.
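The stack discipline described above can be sketched as a small assembly step. This is an illustration under assumptions, not the required structure of your compiler: ProgramSketch and assemble are hypothetical names, each statement is assumed to have already been compiled to a list of instructions that leaves exactly one value on the stack, and the second operand of enter is assumed to be the number of local variables, as the 4times5.mvm listing suggests. Like that listing, the sketch lets ret consume the value that is on the stack after syscall $println.

```scala
object ProgramSketch {
  // Hypothetical input: one instruction list per expression statement,
  // each leaving exactly one value on the stack.
  def assemble(statements: List[List[String]], numLocals: Int): List[String] = {
    val lastIndex = statements.length - 1
    val body = statements.zipWithIndex.flatMap { case (code, i) =>
      if (i < lastIndex) code :+ "pop"  // discard unused statement values
      else code :+ "syscall $println"   // print the final statement's value
    }
    // Assumption: enter's second operand is the local variable count.
    List("main:", s"enter 0, $numLocals") ++ body ++ List("ret")
  }
}
```

Given the three statement bodies from 4times5.mlang and a local count of 2, this produces the same sequence of instructions as the listing shown earlier, with a pop after each of the first two statements and none after the last.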

Testing

Write some MiniLang programs, compile them, and execute the resulting MiniVM program interactively.

Make sure you test some programs with complex expressions, and check that the correct result value is computed.

Coming soon: some additional test programs.

Grading

  • Compiles example program correctly: 30
  • Simple expressions:
    • Integer literals: 5
    • String literals: 5
    • Variable references: 10
  • Operators:
    • Addition: 5
    • Subtraction: 5
    • Division: 5
    • Multiplication: 5
    • Exponentiation: 5
    • Assignment: 15
  • Stack management: 10

Submitting

Submit to Marmoset as assign07 by running the command

make submit

Enter your Marmoset username and password when prompted.