TITLE: On the Virtues of a Small Source Language in the Compiler Course
AUTHOR: Eugene Wallingford
DATE: May 05, 2012 11:53 AM
DESC:
-----
BODY:
I have not finished grading my students' compilers yet. I haven't
even looked at their public comments about the project. (Anonymous
feedback comes later in the summer when course assessment data
arrives.) Still, one lesson has risen to the surface:
Keep the source language small. No, really.
I long ago learned the importance of assigning a source language
small enough to be scanned, parsed, and translated completely in
a single semester. Over the years, I had pared the languages I
assigned down to the bare essentials. That leaves a small
language, one that creates some fun programming challenges.
But it's a language that students can master in fifteen weeks.
My students this term were all pretty good programmers, and I am
a weak man. So I gave in to the temptation to add just a few
more features to the language, to make it a bit more interesting
for my students: variables, an assignment statement, a sequence
construct, and a single loop form. It was as if I had learned
nothing from all my years teaching this course.
The effect of processing a larger language manifested itself in
an expected way: the more students have to do, the more likely
it is that they won't get it all done. This affected a couple of the
teams more than the others, but it wasn't so bad. It meant that
some teams didn't get as far along with function calls and with
recursion as we had hoped. Getting a decent subset of such a
language running is still an accomplishment for students.
But the effect of processing a larger language manifested itself
in a way I did not expect, too, one more detrimental to student
progress: a "hit or miss" quality to the correctness of their
implementations. One team had function calls mostly working, but
not recursion. Another team had tail recursion mostly working(!),
but ordinary function calls didn't work well. One team had local
vars working fine but not global variables, while most teams
knocked out globals early and, if they struggled at all, it was
with locals.
The extra syntactic complexity in the language created a different
sort of problem for the teams.
A single new language feature doesn't seem like too much in
isolation, but it interacts with all the existing features and all
the other new features to create a much more complex language for
the students to understand and for the parser to recognize and get
right. Sure, our language had regular tokens and a context-free
grammar, which localizes the information the scanner and parser
need to see in order to do their jobs. Like all of us, though,
students make errors when writing their code. In the more complex
space, it is harder to track down the root cause of an error,
especially when multiple errors are present and complicate
the search. (Or should I say complect?)
This is an important lesson in language design more generally,
especially for a language aimed at beginners. But it also stands
out when a compiler for the language is being written by beginning
compiler writers.
I am chastened and will return to the True Path of Small Language
the next time I teach this course.
-----