Homework 2: Parser
Due Tuesday, 5 February 2008, 6 PM
Homework 2 is to build a parser for Appel's Tiger language using ML-Yacc.
The Tiger language is defined in Appendix A of the course textbook.
The ML-Yacc user's manual is
available online,
and also (in less detail) in chapters three and four of the textbook.
Your parser should produce an abstract syntax tree using the datatypes
defined in the Absyn structure defined in the file
$TIGER/chap4/absyn.sml.
Skeleton files to get you started are available in the
$TIGER/chap3/ and $TIGER/chap4/ directories, which
you can find at Appel's textbook homepage. In particular, these files
will be useful:
- tiger.grm
- Skeleton ML-Yacc file to get you started
- parsetest.sml
- Test driver
- parse.sml
- Top-level test driver code
- absyn.sml
- AST datatypes
- symbol.sml
- Support library providing symbols
- table.sml, table.sig
- Support library providing tables
- prabsyn.sml
- Useful library for printing ASTs
Remember to update your sources.cm file. Be certain to add
ml-yacc-lib.cm to it if you haven't already, in addition to
any other new files you are using. You'll want to remove
tokens.sml from it, because a tokens file will be
automatically generated by ML-Yacc now. Also, you'll need to update
some header information in your lex file—see the end of Chapter
3 for details.
You should submit:
- A tiger.grm file, the ML-Yacc source for your parser.
- Any other source files you wrote to support your parser.
- A text file describing:
- The members of your team,
- How you handled each shift-reduce conflict, and why you think it
is correct, and
- Anything you think is of interest about your parser.
Anything subtle about your grammar, such as ambiguity resolutions of
various kinds, should be clearly commented. You knew that, of course.
Your parser should use the same error-reporting machinery you employ in
your lexer.
Some advice:
- Make no attempt to write the ML code for the semantic actions that
construct the AST when you start. Your first task should be to simply
write the grammar to parse the concrete syntax; just have each rule
produce the unit value
()
.
- Once your grammar is able to parse all of the test files in
$TIGER/testcases, then add semantic actions to the grammar to
construct the AST while parsing tiger source code.
- I estimate writing the grammar was about 2/3 of the time I spent
writing the parser; adding the semantic actions was about 1/3 of the
work.
- You may find it helpful to write tiny toy grammars to explore
ideas.
- If you don't understand how shift/reduce parsers work, and can't
figure out the table information left in the
tiger.grm.desc file by ML-Yacc, you are going to be in a
world of pain.
- Be warned that there are some grammatically tricky bits to the grammar
that will require some thinking. It may help to
think more and hack less.
- Is your LR parser efficiently parsing lists of arbitrary length (e.g.,
declarations, or expression sequences)?
- Did you report the locations of syntax errors properly?
See the course text for more information, in particular Chapters 3 and
4 and Appendix A.
Good luck and have fun.
Last updated 18 February 2008.