EECS 322: Compiler Construction

Overview

This course will teach you how to build a compiler for a simple (but illustrative) programming language. It will explain the standard structuring for a compiler with a front end (parsing, type-checking), a middle end (optimization / transformations) and a back end (code generation).


Lecture

Tech L160; TTh 12:30-1:50pm

Recommended Text (none are required)

Modern Compiler Implementation in ML by Andrew W. Appel

More on Packrat parsing: it is probably best to start with Brian Ford's master's thesis.

Mailing list

EECS322


Interpreters

The file 322-interps.tar.gz contains an interpreter for L1, the input language for your first compiler. (As the quarter goes on, we will be adding more interpreters to this file.)

The file tests.tar.gz contains the test cases for the test fests that have been run. Use it with run-test-fests (in 322-interps.tar.gz) to test your implementations.


Lecture notes

lecture11.txt some profitable optimizations
lecture10.txt from L5 to L4 (closure conversion)
anf1.ss anf2.ss anf3.ss anf4.ss anf5.ss anf6.ss A-normalization, step by step to CPS
lecture07-bounds-checking.pdf   lecture07-given.pdf Optimizations
lecture06.txt from L3 to L2
lecture05.txt tail calls
lecture04.pdf register allocation
lecture04-call-return-protocol.txt liveness & function calls
lecture03.txt from L1 to x86
lecture02.txt L1
lecture01.txt introduction


Homework assignments

1a.pdf write some L1 programs, due April 8th
1b.pdf from L1 to x86, due April 15th
2a.pdf liveness and spilling, due April 29th
2b.pdf L2 to L1, register allocation, due May 6th
3.pdf L3 to L2, explicate calling convention and data representation, due May 13th
4.pdf L4 to L3, A normalization, due May 20th
5.pdf L5 to L4, closure conversion and λ lifting, due May 27th
speed.pdf cleanup for contest, due June 3rd


Pair programming

Students are encouraged (but not required) to work in pairs. Pair programming is not team programming, however. That is, pairs must promise (in writing) that they will always sit together when working on the assignments, never separately. If this is too much of a burden, work alone.


Test Fests

Each assignment consists of two parts: first the design of test cases and second the implementation. Each student's assignment will be run against everyone's test cases. You get 2 points for finding a bug in someone else's implementation with your test suite and you lose 1 point if someone else finds a bug in your implementation with their test suite. Multiple test case failures are (generously) assumed to be just a single bug.

The results of the test fests will be public (and will include my code and test suites).


Programming Language

Students are free to use any programming language. As a general guideline, I recommend a programming language that is both safe and has garbage collection. These two features make building software easier (and the second often improves performance). Also, you will have to build a simple parser for a parenthesis-based language; such a parser comes for free in PLAI, so you may want to just use it.
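If you choose a language without a built-in reader, the parser for a parenthesis-based language can be quite small. The following Python sketch shows one possible shape for it. It is only an illustration: the function names are made up, the input in the example is not meant to be real L1 syntax, and it assumes the input contains only parentheses, integers, and whitespace-separated symbols (no comments or strings).

```python
def tokenize(text):
    """Split the input into parentheses and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list.

    The list is consumed in place. A parenthesized form becomes a
    Python list; an atom becomes an int if possible, else a string.
    """
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing parenthesis
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)
    except ValueError:
        return token


def read(text):
    """Parse exactly one expression and reject any leftover input."""
    tokens = tokenize(text)
    expr = parse(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return expr


print(read("(a (b 1) -2)"))
# prints ['a', ['b', 1], -2]
```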

Grades

Grades in the course are based on passing each of the programming assignments, the speed test, and your codewalk, for up to 9 opportunities to pass.

Grade               A   A-  B+  B   C   D   F
Assignments passed  9   8   7   6   5   1   0

To pass one of the programming assignments (1b, liveness, spill, 2b, 3, 4, or 5), you must either pass 75% of the test cases in the initial test fest, submit a test suite that finds a bug in every (other) submission in the initial test fest, or pass 90% of the test cases in a later test fest.
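The three ways to pass an assignment can be written as a small predicate. This Python sketch is only an illustration; the function and parameter names are hypothetical, and it assumes that "pass 75%" and "pass 90%" mean at least that fraction of the test cases.

```python
def passes_assignment(initial_pass_rate=None,
                      suite_breaks_all_others=False,
                      later_pass_rate=None):
    """Return True if any one of the three passing conditions holds.

    Pass rates are fractions between 0.0 and 1.0, or None if the
    student did not take part in that test fest.
    """
    # Condition 1: 75% of the test cases in the initial test fest.
    if initial_pass_rate is not None and initial_pass_rate >= 0.75:
        return True
    # Condition 2: a test suite that finds a bug in every other
    # submission in the initial test fest.
    if suite_breaks_all_others:
        return True
    # Condition 3: 90% of the test cases in a later test fest.
    if later_pass_rate is not None and later_pass_rate >= 0.90:
        return True
    return False


print(passes_assignment(initial_pass_rate=0.80))
# prints True
print(passes_assignment(initial_pass_rate=0.70, later_pass_rate=0.85))
# prints False
```

Note that the bar is higher in later test fests (90% rather than 75%), so a submission that scores 80% passes only if it does so in the initial test fest.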

To pass the speed test, your compiler must generate a binary that produces the correct output for each of the submitted speed test programs.

To pass your codewalk, you must produce code that is structured well, i.e., easy to read and maintain (it does not have to be perfect, but it has to be something that I would work on, rather than throw out and start over).

The winner of the speed test and anyone who beats mzscheme on all programs gets a free pass to be used on any one assignment. Note that while mzscheme has had 15 years of continuous development that gives it a fair edge over your 10 or so weeks' worth of effort, it is at a significant disadvantage because its versions of the primitive operations are more complex and have more error checking. Overall, this should make it a fair fight. (Put another way, getting performance in the face of all the details that go into a full-fledged, safe language is not easy.)

You may resubmit any version of any assignment any time up to the last day of finals (June 11, 2010) and if you do not pass your codewalk, you may request a private codewalk (on the same assignment or a different one).


Cheating

Your code will be scrutinized for plagiarism and other forms of cheating and, if any is discovered, you will be punished to the furthest extent possible.


Robby Findler