Due: Friday, October 12th by 11:59 PM

Updated Oct 11th - additional test cases added

Getting Started

Start with either the example recursive-descent parser or the example precedence climbing parser. (I suggest starting from the precedence climbing parser, but either is fine.)

Modify Lexer.rb so that it has the PATTERNS array that you created in Assignment 2.

Your Task

Implement a parser for MiniLang.

Your parser should use the following context-free grammar. Note that terminal symbols are in bold and nonterminal symbols are in italics.

program → statement_list
statement_list → statement statement_list | statement
statement → expression ;
statement → var identifier ;
expression → see below
primary_expression → identifier | int_literal | string_literal
primary_expression → ( expression )

The expression nonterminal refers to binary infix expressions with the following operators:

Operators   Precedence   Associativity
:=          1            right
+ -         2            left
* /         3            left
^           4            right

Higher precedence numbers bind more tightly, so a * b + c groups as (a * b) + c. The operands of each operator are built from primary_expressions. A primary_expression (as can be seen in the grammar above) is a variable (identifier), an integer literal, a string literal, or an explicitly parenthesized expression.
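
As a rough illustration, the operator table could be encoded as a Ruby hash. The names OPERATORS and next_min_prec, and the tags :op_minus and :op_div, are assumptions made for this sketch (the other tags follow the node names used in the example parse trees). Nothing here is required by the assignment.

```ruby
# Hypothetical encoding of the operator table, keyed by token tag.
OPERATORS = {
  op_assign: { prec: 1, assoc: :right },  # :=
  op_plus:   { prec: 2, assoc: :left  },  # +
  op_minus:  { prec: 2, assoc: :left  },  # -  (tag name assumed)
  op_mul:    { prec: 3, assoc: :left  },  # *
  op_div:    { prec: 3, assoc: :left  },  # /  (tag name assumed)
  op_exp:    { prec: 4, assoc: :right },  # ^
}.freeze

# Lowest precedence an operator may have if it is to appear inside the
# right-hand operand of `tag`. Left-associative operators exclude their own
# level, so a - b - c groups to the left; right-associative operators allow
# it, so a ^ b ^ c groups to the right.
def next_min_prec(tag)
  info = OPERATORS.fetch(tag)
  info[:assoc] == :left ? info[:prec] + 1 : info[:prec]
end
```

A table like this keeps the precedence and associativity rules in one place instead of spreading them across several parsing methods.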

When you run the Parser.rb program, it will parse the input as a MiniLang program and then print a parse tree. For example, for the input:

var a;
var b;
a := 4;
b := a ^ 2;
a ^ (b + 3);

The program should produce a parse tree that looks something like:

program
+--statement_list
   +--statement
   |  +--var("var")
   |  +--identifier("a")
   |  +--semi(";")
   +--statement_list
      +--statement
      |  +--var("var")
      |  +--identifier("b")
      |  +--semi(";")
      +--statement_list
         +--statement
         |  +--op_assign
         |  |  +--primary
         |  |  |  +--identifier("a")
         |  |  +--primary
         |  |     +--int_literal("4")
         |  +--semi(";")
         +--statement_list
            +--statement
            |  +--op_assign
            |  |  +--primary
            |  |  |  +--identifier("b")
            |  |  +--op_exp
            |  |     +--primary
            |  |     |  +--identifier("a")
            |  |     +--primary
            |  |        +--int_literal("2")
            |  +--semi(";")
            +--statement_list
               +--statement
                  +--op_exp
                  |  +--primary
                  |  |  +--identifier("a")
                  |  +--primary
                  |     +--lparen("(")
                  |     +--op_plus
                  |     |  +--primary
                  |     |  |  +--identifier("b")
                  |     |  +--primary
                  |     |     +--int_literal("3")
                  |     +--rparen(")")
                  +--semi(";")
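
One possible way to print a tree in this shape is sketched below. It assumes a node is a two-element array, either [tag, lexeme] for a token or [tag, children] for an interior node. That representation and the name tree_lines are hypothetical, and the example parsers may already provide their own tree-printing code.

```ruby
# Returns the lines of the printed tree for `node`.
# Assumed node shapes: [tag, "lexeme"] for tokens, [tag, [child, ...]] otherwise.
def tree_lines(node, indent = "", is_root = true, is_last = true)
  tag, body = node
  label = body.is_a?(String) ? "#{tag}(#{body.inspect})" : tag.to_s
  lines = [is_root ? label : "#{indent}+--#{label}"]
  return lines if body.is_a?(String)

  # Below a child that has later siblings, keep a vertical bar in the margin.
  child_indent = is_root ? "" : indent + (is_last ? "   " : "|  ")
  body.each_with_index do |child, i|
    lines.concat(tree_lines(child, child_indent, false, i == body.length - 1))
  end
  lines
end

tree = [:program, [[:statement_list, [[:statement,
         [[:var, "var"], [:identifier, "a"], [:semi, ";"]]]]]]]
puts tree_lines(tree)
```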

Hints

Remove the :ident_or_keyword pattern from the PATTERNS array. Replace it with two rules:

[/^var/, :var],
[/^[A-Za-z_][A-Za-z_0-9]*/, :identifier],

This will cause var to be treated as a special keyword token. The :identifier tag will be used for all ordinary identifiers.

You will also need to add a new lexer rule for the assignment operator (:=), something like:

[/^:=/, :op_assign],
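
To see why the order of the rules matters, here is a minimal stand-alone sketch of a pattern-driven tokenizer. It is not the course's Lexer.rb: the tokenize method, the :int_literal and :semi rules, and the [tag, lexeme] token format are assumptions. The sketch also adds a \b word boundary to the var rule, which is not in the hint above, so that an identifier such as variable is not split into var followed by iable.

```ruby
# Hypothetical pattern table. The first rule that matches at the start of the
# remaining input wins, so :var must come before :identifier.
SKETCH_PATTERNS = [
  [/^var\b/, :var],                          # \b is an addition in this sketch
  [/^[A-Za-z_][A-Za-z_0-9]*/, :identifier],
  [/^[0-9]+/, :int_literal],                 # assumed rule
  [/^:=/, :op_assign],
  [/^;/, :semi],                             # assumed rule
].freeze

def tokenize(input)
  tokens = []
  rest = input.lstrip
  until rest.empty?
    tag = nil
    match = nil
    SKETCH_PATTERNS.each do |pattern, candidate_tag|
      m = pattern.match(rest)
      # In Ruby, ^ also matches after a newline, so require position 0.
      next unless m && m.begin(0).zero?
      tag = candidate_tag
      match = m
      break
    end
    raise "unrecognized input at #{rest[0, 10].inspect}" if match.nil?
    tokens << [tag, match[0]]
    rest = match.post_match.lstrip
  end
  tokens
end

p tokenize("var a; a := 4;")
```

If the :identifier rule were listed first, var would be tagged as an ordinary identifier and the var statement could never be recognized.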

Use recursive descent for the program, statement_list, statement, and primary_expression nonterminals.
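
The general shape of such a parser is sketched below, with one method per nonterminal. The class name RecursiveDescentSketch, the [tag, lexeme] token format, and the [tag, children] node format are hypothetical. The expression method is only a stand-in that accepts a single primary_expression, so this is an outline and not a solution.

```ruby
class RecursiveDescentSketch
  def initialize(tokens)
    @tokens = tokens   # assumed format: array of [tag, lexeme] pairs
    @pos = 0
  end

  # program -> statement_list
  def parse_program
    [:program, [parse_statement_list]]
  end

  # statement_list -> statement statement_list | statement
  def parse_statement_list
    children = [parse_statement]
    children << parse_statement_list if @pos < @tokens.length
    [:statement_list, children]
  end

  # statement -> var identifier ; | expression ;
  def parse_statement
    if peek_tag == :var
      [:statement, [expect(:var), expect(:identifier), expect(:semi)]]
    else
      [:statement, [parse_expression, expect(:semi)]]
    end
  end

  # Stand-in only: a full parser would handle the binary operators here.
  def parse_expression
    parse_primary
  end

  # primary_expression -> identifier | int_literal | string_literal | ( expression )
  def parse_primary
    tag = peek_tag
    if tag == :lparen
      [:primary, [expect(:lparen), parse_expression, expect(:rparen)]]
    elsif %i[identifier int_literal string_literal].include?(tag)
      [:primary, [expect(tag)]]
    else
      raise "unexpected token #{@tokens[@pos].inspect}"
    end
  end

  private

  def peek_tag
    tok = @tokens[@pos]
    tok && tok[0]
  end

  # Consume and return the next token, which must have the given tag.
  def expect(tag)
    tok = @tokens[@pos]
    raise "expected #{tag}, got #{tok.inspect}" unless tok && tok[0] == tag
    @pos += 1
    tok
  end
end
```

Because statement_list is right-recursive, each call parses one statement and recurses only if tokens remain, which yields the nested statement_list nodes seen in the example tree.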

It will probably be easiest to use precedence climbing for expressions. If you do not use precedence climbing, you will need to eliminate the left recursion that the left-associative operators (+, -, *, and /) would otherwise introduce into the grammar.
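
For reference, a minimal sketch of precedence climbing over a token array follows. It assumes tokens are [tag, lexeme] pairs and tree nodes are [tag, children] arrays. The class name ClimbingSketch, the OPS table, the show helper, and the :op_minus and :op_div tags are all hypothetical, and the example parser you start from may organize this differently.

```ruby
# tag => [precedence, associativity]; :op_minus and :op_div are assumed names.
OPS = {
  op_assign: [1, :right],
  op_plus: [2, :left], op_minus: [2, :left],
  op_mul: [3, :left],  op_div: [3, :left],
  op_exp: [4, :right],
}.freeze

class ClimbingSketch
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  # Parses an expression whose operators all have precedence >= min_prec.
  def parse_expression(min_prec = 1)
    left = parse_primary
    loop do
      tag = peek_tag
      prec, assoc = OPS[tag]
      break if prec.nil? || prec < min_prec
      @pos += 1                                   # consume the operator
      # Left-associative: the right operand may only contain tighter operators.
      next_min = assoc == :left ? prec + 1 : prec
      right = parse_expression(next_min)
      left = [tag, [left, right]]
    end
    left
  end

  def parse_primary
    tag = peek_tag
    if tag == :lparen
      [:primary, [take(:lparen), parse_expression, take(:rparen)]]
    elsif %i[identifier int_literal string_literal].include?(tag)
      [:primary, [take(tag)]]
    else
      raise "unexpected token #{@tokens[@pos].inspect}"
    end
  end

  private

  def peek_tag
    tok = @tokens[@pos]
    tok && tok[0]
  end

  def take(tag)
    tok = @tokens[@pos]
    raise "expected #{tag}, got #{tok.inspect}" unless tok && tok[0] == tag
    @pos += 1
    tok
  end
end

# Renders a tree with explicit parentheses so the grouping is visible.
def show(node)
  tag, body = node
  return body if body.is_a?(String)                      # token leaf
  return body.map { |c| show(c) }.join if tag == :primary
  "(#{show(body[0])} #{tag} #{show(body[1])})"
end

tokens = [[:identifier, "a"], [:op_exp, "^"], [:identifier, "b"],
          [:op_exp, "^"], [:identifier, "c"]]
puts show(ClimbingSketch.new(tokens).parse_expression)   # (a op_exp (b op_exp c))
```

The only difference between the left- and right-associative cases is the minimum precedence passed to the recursive call, which is why this approach needs no grammar rewriting.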

The parse trees created by your parser don't have to look exactly like mine, as long as they correctly encode the desired operator precedences and associativities.

Testing

This section lists some additional inputs you can try, along with possible parse tree outputs.

Note that depending on how your parser is implemented, you may see a different parse tree. However, the operator precedences and associativities in your parse tree should match mine.

Input:

a * b + c;

Parse tree:

program
+--statement_list
   +--statement
      +--op_plus
      |  +--op_mul
      |  |  +--primary
      |  |  |  +--identifier("a")
      |  |  +--primary
      |  |     +--identifier("b")
      |  +--primary
      |     +--identifier("c")
      +--semi(";")

Input:

a * (b + c);

Parse tree:

program
+--statement_list
   +--statement
      +--op_mul
      |  +--primary
      |  |  +--identifier("a")
      |  +--primary
      |     +--lparen("(")
      |     +--op_plus
      |     |  +--primary
      |     |  |  +--identifier("b")
      |     |  +--primary
      |     |     +--identifier("c")
      |     +--rparen(")")
      +--semi(";")

Input:

a ^ b ^ c;

Output:

program
+--statement_list
   +--statement
      +--op_exp
      |  +--primary
      |  |  +--identifier("a")
      |  +--op_exp
      |     +--primary
      |     |  +--identifier("b")
      |     +--primary
      |        +--identifier("c")
      +--semi(";")

Input:

(a ^ b) ^ c;

Output:

program
+--statement_list
   +--statement
      +--op_exp
      |  +--primary
      |  |  +--lparen("(")
      |  |  +--op_exp
      |  |  |  +--primary
      |  |  |  |  +--identifier("a")
      |  |  |  +--primary
      |  |  |     +--identifier("b")
      |  |  +--rparen(")")
      |  +--primary
      |     +--identifier("c")
      +--semi(";")

Submitting

Submit a zipfile containing all of your Ruby files to Marmoset as assign4:

https://cs.ycp.edu/marmoset/