Lab 2: Recursive Descent Parsing

Assigned: Sept 24th

Getting Started

Download CS340_Lab2.zip. Import it into Eclipse (File->Import...->General->Existing projects into workspace->Archive file). You should see a project called CS340_Lab2 in the package explorer.

You will modify the code in the Parser class.

The Notes for Lecture 7 will be helpful.

Your Task

Your task is to write a recursive-descent parser for the following "calculator language" grammar, where A is the start symbol:

A → set_keyword identifier assign_op E
E → T E'
E' → plus_op T E'
E' → minus_op T E'
E' → ε
T → F T'
T' → mult_op F T'
T' → div_op F T'
T' → ε
F → identifier | int_literal

Note that the uppercase symbols (A, E, E', T, T', and F) are nonterminal symbols; the lowercase names such as set_keyword and identifier are terminal symbols (tokens). All left recursion has been eliminated from this grammar, so it is suitable for top-down parsing with one token of lookahead.

Implement parse methods for each nonterminal symbol. You can use the parser's lexer field to call methods on a lexical analyzer object, which behaves identically to the lexical analyzer you implemented in Lab 1. Note that the TokenType enumeration has been renamed SymbolType, because it is used for both terminal symbols (Token objects) and nonterminal symbols (Nonterminal objects).
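
As a rough illustration of the one-method-per-nonterminal pattern, here is a minimal sketch covering only F and T'. It does not use the lab's lexer, Token, or Nonterminal classes: the names Node and SketchParser, the whitespace-split token list, the use of "/" as the div_op lexeme, and IllegalStateException standing in for ParserException are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parse tree node; the lab's own classes differ.
class Node {
    final String label;
    final List<Node> children = new ArrayList<>();

    Node(String label) { this.label = label; }

    @Override
    public String toString() {
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder(label).append("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(" ");
            sb.append(children.get(i));
        }
        return sb.append(")").toString();
    }
}

class SketchParser {
    private final List<String> tokens; // whitespace-separated lexemes (assumption)
    private int pos = 0;

    SketchParser(String input) {
        String trimmed = input.trim();
        tokens = trimmed.isEmpty()
                ? new ArrayList<>()
                : Arrays.asList(trimmed.split("\\s+"));
    }

    // One token of lookahead: inspect the next lexeme without consuming it.
    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    private String next() { return tokens.get(pos++); }

    // F -> identifier | int_literal
    Node parseF() {
        String tok = peek();
        if (tok == null) {
            throw new IllegalStateException("unexpected end of input");
        }
        if (!tok.matches("[A-Za-z_][A-Za-z0-9_]*") && !tok.matches("[0-9]+")) {
            throw new IllegalStateException("unexpected token: " + tok);
        }
        Node f = new Node("F");
        f.children.add(new Node(next()));
        return f;
    }

    // T' -> mult_op F T' | div_op F T' | epsilon
    Node parseTPrime() {
        Node tPrime = new Node("TPRIME");
        String tok = peek();
        if ("*".equals(tok) || "/".equals(tok)) {
            tPrime.children.add(new Node(next())); // the operator
            tPrime.children.add(parseF());
            tPrime.children.add(parseTPrime());
        }
        // Otherwise apply the epsilon production: consume nothing.
        return tPrime;
    }
}
```

The lookahead token alone decides which production to apply: an operator selects one of the recursive productions, and anything else (including end of input) selects ε. Calling parseF() and then parseTPrime() on the input "b * 3" yields nodes that print as F(b) and TPRIME(* F(3) TPRIME).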

Parse errors

If your parser encounters a token that cannot legally occur at that point in the derivation, it should throw a ParserException.

Two common reasons a derivation cannot be continued:

  1. A token is required, but end of input has been reached.
  2. A specific kind of token is needed, but that kind of token is not the next one the lexer will return. (See the expect method in the Parser class.)
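
The two cases above can be handled by a single helper. The lab's actual expect method and ParserException are not reproduced in this handout, so the sketch below is only a guess at the general shape: the class names SketchParserException and ExpectSketch, the expect signature, and the use of plain strings as tokens are all hypothetical.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical exception type; the lab supplies its own ParserException.
class SketchParserException extends RuntimeException {
    SketchParserException(String message) { super(message); }
}

class ExpectSketch {
    private final Iterator<String> tokens;
    private String lookahead; // next unconsumed token, or null at end of input

    ExpectSketch(List<String> tokenList) {
        tokens = tokenList.iterator();
        lookahead = tokens.hasNext() ? tokens.next() : null;
    }

    // Consume the next token if it equals the required one; otherwise fail.
    String expect(String required) {
        if (lookahead == null) {
            // Case 1: a token is required, but end of input has been reached.
            throw new SketchParserException(
                    "expected " + required + " but reached end of input");
        }
        if (!lookahead.equals(required)) {
            // Case 2: the next token is not the kind that is needed.
            throw new SketchParserException(
                    "expected " + required + " but found " + lookahead);
        }
        String consumed = lookahead;
        lookahead = tokens.hasNext() ? tokens.next() : null;
        return consumed;
    }
}
```

Checking for end of input before comparing tokens keeps the two error messages distinct, which makes failed parses easier to diagnose.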


When you have fully implemented the parse methods for each nonterminal, you can run the main method of the TestParser class. Each line of input that you type in the Console window will be parsed, and a textual representation of the parse tree will be printed out.

For example, if you type the input

set result = a + b * 3

you should see the following parse tree:

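A
+--SET_KEYWORD("set")
+--IDENTIFIER("result")
+--ASSIGN_OP("=")
+--E
   +--T
   |  +--F
   |  |  +--IDENTIFIER("a")
   |  +--TPRIME
   +--EPRIME
      +--PLUS_OP("+")
      +--T
      |  +--F
      |  |  +--IDENTIFIER("b")
      |  +--TPRIME
      |     +--MULT_OP("*")
      |     +--F
      |     |  +--INT_LITERAL("3")
      |     +--TPRIME
      +--EPRIME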
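
The indentation in the tree above appears to follow a simple rule: each child is printed as "+--" followed by its label, and a vertical bar runs down past a child only while that child still has later siblings. The sketch below reproduces that layout. It is not the lab's printing code: TreeNode and TreePrinterSketch are hypothetical names, and printing the root label with no "+--" prefix is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node; the lab's Token and Nonterminal classes are not shown here.
class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label, TreeNode... kids) {
        this.label = label;
        for (TreeNode k : kids) children.add(k);
    }
}

class TreePrinterSketch {
    // Render the subtree rooted at node, one node per line.
    static String render(TreeNode node) {
        StringBuilder out = new StringBuilder();
        out.append(node.label).append("\n");
        renderChildren(node, "", out);
        return out.toString();
    }

    private static void renderChildren(TreeNode node, String prefix, StringBuilder out) {
        for (int i = 0; i < node.children.size(); i++) {
            TreeNode child = node.children.get(i);
            boolean last = (i == node.children.size() - 1);
            out.append(prefix).append("+--").append(child.label).append("\n");
            // Continue the vertical bar only if this child has later siblings.
            renderChildren(child, prefix + (last ? "   " : "|  "), out);
        }
    }
}
```

For example, rendering a T node whose children are F (containing IDENTIFIER("b")) and an empty TPRIME produces the lines T, +--F, |  +--IDENTIFIER("b"), and +--TPRIME.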