Module 1:
A Scanner for Klein
Stage 1 of the Klein compiler project
Out: Friday, September 5
Due: Friday, September 19, at 11:59 PM
STATUS CHECK DUE: Friday, September 12
Tasks
This stage consists of one component of your Klein compiler and three auxiliary programs.
1. A Lexical Analyzer
Your primary task is:
Write a scanner for Klein.
A scanner is an object that is initialized with a string or a sequence of characters and that produces as output a sequence of Klein tokens.
Implement your scanner as one or more deterministic
finite-state machines derived from regular expressions.
Document both your regular expressions and your finite-state
machines in files stored in the project's doc/
directory.
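As a rough illustration of a table-driven finite-state machine, here is a minimal sketch that recognizes two token types. The regular expressions it encodes (`[A-Za-z][A-Za-z0-9_]*` for identifiers and `[0-9]+` for integers), the state names, and the function names are all assumptions made for the example; take the real lexical rules from the Klein language specification.

```python
# Hypothetical states and character classes; the real ones come from the
# Klein language specification and your own regular expressions.
START, IN_IDENT, IN_INT = "start", "in_ident", "in_int"

# Accepting states and the token type each one recognizes.
ACCEPTING = {IN_IDENT: "identifier", IN_INT: "integer"}


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"


# Transition table: (state, character class) -> next state.
# A missing entry means the machine has no move and must stop.
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (START, "digit"): IN_INT,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
    (IN_IDENT, "underscore"): IN_IDENT,
    (IN_INT, "digit"): IN_INT,
}


def longest_match(text, start=0):
    """Run the DFA from `start`; return (token_type, lexeme) or None."""
    state, pos = START, start
    while pos < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[pos])))
        if nxt is None:
            break  # no transition: stop at the longest prefix consumed so far
        state, pos = nxt, pos + 1
    if state in ACCEPTING:
        return ACCEPTING[state], text[start:pos]
    return None
```

Keeping the transitions in a table makes it easy to check the code against the state diagrams you document in doc/, since each table entry corresponds to one arrow in a diagram.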
Your scanner should provide at least two public functions:
next and peek. The
next function returns the next token in the
program and advances the scanner's pointer. The
peek function returns the next token but does
not advance the pointer.
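One common way to support both functions is to buffer a single token of lookahead. The sketch below shows only that interface. The `Token` record, the `lookahead` field, and the simplified `_scan_token` helper are assumptions for illustration; `_scan_token` handles just a few token types and stands in for your real state machines.

```python
from collections import namedtuple

# Hypothetical token record; the field names are illustrative.
Token = namedtuple("Token", ["kind", "value"])


class Scanner:
    def __init__(self, source):
        self.source = source
        self.pos = 0            # index of the next unread character
        self.lookahead = None   # token produced by peek() but not yet consumed

    def peek(self):
        """Return the next token without consuming it."""
        if self.lookahead is None:
            self.lookahead = self._scan_token()
        return self.lookahead

    def next(self):
        """Return the next token and advance past it."""
        token = self.peek()
        self.lookahead = None
        return token

    def _scan_token(self):
        """Stand-in for the real state machines: handles only a few token types."""
        src = self.source
        while self.pos < len(src) and src[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(src):
            return Token("end_of_file", None)
        start = self.pos
        ch = src[start]
        if ch == "(":
            self.pos += 1
            return Token("leftparen", None)
        if ch == ")":
            self.pos += 1
            return Token("rightparen", None)
        if ch.isdigit():
            while self.pos < len(src) and src[self.pos].isdigit():
                self.pos += 1
            return Token("integer", int(src[start:self.pos]))
        if ch.isalpha():
            while self.pos < len(src) and src[self.pos].isalnum():
                self.pos += 1
            return Token("identifier", src[start:self.pos])
        raise ValueError(f"illegal character {ch!r} at offset {start}")
```

With this design, calling `peek` any number of times returns the same token, and `next` is simply `peek` followed by clearing the buffer, so the two functions cannot disagree about what comes next.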
Be sure that your analyzer detects and reports all lexical errors, such as illegal characters, number literals that are out of range, or identifiers that are too long.
- Your scanner must catch all errors raised by your implementation language and report them as Klein errors. Ideally, the user of your Klein compiler will not know the compiler's implementing language.
- Bonus points will be awarded if the scanner's error messages report the line and character on which the error occurs, or if the messages specify the state the scanner is in at the time of the error.
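For the line-and-character bonus, one possible approach is to track a character offset while scanning and convert it to a position only when an error occurs. The following is a sketch under that assumption; the class name `KleinError`, the helper names, and the message format are hypothetical.

```python
class KleinError(Exception):
    """Error reported to the Klein programmer (hypothetical class name)."""

    def __init__(self, message, line, column):
        super().__init__(
            f"Klein lexical error at line {line}, column {column}: {message}"
        )
        self.line = line
        self.column = column


def line_and_column(source, offset):
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    line = source.count("\n", 0, offset) + 1
    last_newline = source.rfind("\n", 0, offset)  # -1 if offset is on line 1
    column = offset - last_newline
    return line, column


def error_at(source, offset, message):
    """Build a KleinError that points at `offset` in `source`."""
    line, column = line_and_column(source, offset)
    return KleinError(message, line, column)
```

A scanner could then write `raise error_at(source, pos, "illegal character")` wherever it detects a problem, and the top-level program would print only the message, never a traceback from the implementation language.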
2. A Token-Listing Program
Once your scanner is able to produce a sequence of tokens:
Write a program that takes a valid Klein program as input and uses your scanner to produce a listing of all the tokens that appear in it.
Display one token per line. For tokens with semantic content, such as an identifier or a number, display the content, too. For test program print-one.kln, it might generate something like this:
keyword function
identifier main
leftparen              or: punct leftparen
rightparen             or: punct rightparen
semicolon              or: punct semicolon
keyword integer
identifier print
leftparen
integer 1
rightparen
integer 1
end_of_file
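The listing loop itself can be quite small. The sketch below assumes a scanner whose `next` function returns a `(kind, value)` pair and whose last token has the kind `end_of_file`; both details, and the `CannedScanner` stand-in used to keep the example self-contained, are assumptions rather than requirements.

```python
def token_lines(scanner):
    """Yield one display line per token, stopping after end_of_file."""
    while True:
        kind, value = scanner.next()
        # Tokens with semantic content show it after the token type.
        yield kind if value is None else f"{kind} {value}"
        if kind == "end_of_file":
            break


class CannedScanner:
    """Stand-in scanner that replays a fixed token list (illustration only)."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)

    def next(self):
        return next(self._tokens)


if __name__ == "__main__":
    demo = CannedScanner([
        ("keyword", "function"),
        ("identifier", "main"),
        ("leftparen", None),
        ("end_of_file", None),
    ])
    for line in token_lines(demo):
        print(line)
```

Writing the lister as a generator keeps the formatting separate from the printing, which makes it easier to test without capturing standard output.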
3. A New Klein Program
In order to test your scanner,
Write at least one meaningful legal program in Klein.
Your test program should be at least at the level of an early CS 1510 program doing arithmetic. If you are stumped for where to start, try one of these ideas.
You should, of course, produce as many tests as necessary in order to ensure that your scanner works correctly. You are also encouraged to use any test programs available from the project home page.
4. The kleins Command
Finally,
Create an executable or a Unix command-line script named
kleins that takes the name of a Klein
program as an argument and runs your token-listing program
on it.
For example:
$ ./kleins programs/print-one.kln
keyword function
identifier main
leftparen
...
This will be the first in a series of tools that makes up the command-line suite of your Klein compiler.
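If your compiler is written in Python, the command could be a thin wrapper along the following lines. This is a sketch only: the `main` signature, the `list_tokens` parameter (any callable that takes source text and yields display lines), the usage message, and the exit codes are all assumptions.

```python
import sys


def main(argv, list_tokens, out=sys.stdout, err=sys.stderr):
    """Run the token lister on the file named in argv[1]; return an exit status."""
    if len(argv) != 2:
        print("usage: kleins <program.kln>", file=err)
        return 2
    try:
        with open(argv[1], encoding="utf-8") as f:
            source = f.read()
    except OSError as exc:
        print(f"kleins: cannot read {argv[1]}: {exc.strerror}", file=err)
        return 1
    try:
        for line in list_tokens(source):
            print(line, file=out)
    except Exception as exc:
        # Report every failure as a Klein error instead of a Python traceback.
        print(f"Klein error: {exc}", file=err)
        return 1
    return 0
```

An executable `kleins` file would start with a `#!/usr/bin/env python3` line, import your own token-listing function, and end with something like `sys.exit(main(sys.argv, your_lister))`, where `your_lister` is a placeholder for that function.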
Deliverables
Submit only one copy of each assignment per team. The team captain or a designated team member can be responsible for the submission.
Status Check
Submit:
- a list of token types, and
- a list of positive and negative examples of each.
When creating your examples, pay attention to the edge cases. For example, the value of 2^31 - 1 is a legal integer token, but the value of 2^31 + 1 is not.
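Written out in decimal, those two boundary values can be checked as follows. This sketch assumes that 2^31 - 1 is the largest legal value, which is consistent with the example above but should be confirmed against the Klein specification; the names `MAX_INTEGER` and `in_range` are hypothetical.

```python
# Assumed upper bound: 2^31 - 1 = 2147483647.
MAX_INTEGER = 2**31 - 1


def in_range(lexeme):
    """True if a digit-string literal is no larger than MAX_INTEGER."""
    return int(lexeme) <= MAX_INTEGER


# (literal, expected to be a legal integer token?)
boundary_examples = [
    ("2147483647", True),    # 2^31 - 1
    ("2147483649", False),   # 2^31 + 1
]

for lexeme, expected in boundary_examples:
    assert in_range(lexeme) == expected
```

Pairs like these translate directly into the positive and negative examples requested for the status check.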
Record your lists in a plaintext file and email the file to the instructor on the status check due date.
Final Deliverable
By the due time and date, use the course submission system to submit your project directory electronically as a zip file named project01.zip or project01.tar.gz.