Module 1:
A Scanner for Klein

Stage 1 of the Klein compiler project
Out: Friday, September 5
Due: Friday, September 19, at 11:59 PM
STATUS CHECK DUE: Friday, September 12

Tasks

This stage consists of one component of your Klein compiler and three auxiliary programs.

1. A Lexical Analyzer

Your primary task is:

Write a scanner for Klein.

A scanner is an object that is initialized with a string or a sequence of characters and that produces as output a sequence of Klein tokens.

Implement your scanner as one or more deterministic finite-state machines derived from regular expressions. Document both your regular expressions and your finite-state machines in files stored in the project's doc/ directory.
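As an illustration of the regular-expression-to-DFA step, here is a minimal sketch of a table-driven machine. The two patterns, the state names, and the helper names (char_class, longest_match) are assumptions made for this example. They are not taken from the Klein specification, so derive your own patterns from the language definition.

```python
# Hypothetical regular expressions (assumptions, not the Klein specification):
#   identifier : [A-Za-z][A-Za-z0-9_]*
#   integer    : [0-9]+

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"

# Transition table: (state, character class) -> next state.
# A missing entry means the machine has no move on that input.
TRANSITIONS = {
    ("start", "letter"): "in_identifier",
    ("start", "digit"): "in_integer",
    ("in_identifier", "letter"): "in_identifier",
    ("in_identifier", "digit"): "in_identifier",
    ("in_identifier", "underscore"): "in_identifier",
    ("in_integer", "digit"): "in_integer",
}

# Accepting states and the token type each one recognizes.
ACCEPTING = {"in_identifier": "identifier", "in_integer": "integer"}

def longest_match(text, start=0):
    """Run the DFA from `start`; return (token_type, lexeme) or None."""
    state = "start"
    pos = start
    while pos < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[pos])))
        if nxt is None:
            break  # no transition: the machine stops here
        state = nxt
        pos += 1
    if state in ACCEPTING:
        return ACCEPTING[state], text[start:pos]
    return None
```

Keeping the transitions in a table makes it easy to check the code against the state diagram you document in doc/.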

Your scanner should provide at least two public functions: next and peek. The next function returns the next token in the program and advances the scanner's pointer. The peek function returns the next token but does not advance the pointer.
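One possible way to support both functions is a one-token lookahead buffer, sketched below. The class layout, the (type, text) tuple used for tokens, and the simplified _scan helper are assumptions for illustration only. A real scanner would replace _scan with its finite-state machines and produce actual Klein token types.

```python
class Scanner:
    """Sketch of the next/peek interface with a one-token lookahead buffer."""

    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.lookahead = None  # token found by peek but not yet consumed

    def peek(self):
        """Return the next token without advancing past it."""
        if self.lookahead is None:
            self.lookahead = self._scan()
        return self.lookahead

    def next(self):
        """Return the next token and advance past it."""
        token = self.peek()
        self.lookahead = None
        return token

    def _scan(self):
        """Placeholder tokenizer (assumption): words and single symbols."""
        while self.pos < len(self.source) and self.source[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.source):
            return ("end_of_file", "")
        start = self.pos
        ch = self.source[self.pos]
        if ch.isalnum():
            while self.pos < len(self.source) and self.source[self.pos].isalnum():
                self.pos += 1
            return ("word", self.source[start:self.pos])
        self.pos += 1
        return ("symbol", ch)
```

With this design, calling peek any number of times returns the same token, and only next moves the scanner forward.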

Be sure that your analyzer detects and reports all lexical errors, such as illegal characters, number literals that are out of range, or identifiers that are too long.

  • Your scanner must catch all errors raised by your implementation language and report them as Klein errors. Ideally, the user of your Klein compiler will not know the compiler's implementing language.
  • Bonus points will be awarded if the scanner's error messages report the line and character on which the error occurs, or if the messages specify the state the scanner is in at the time of the error.
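The two points above might be approached as in the following sketch. The names KleinError, position_of, find_illegal_character, and report are hypothetical, and the message wording is an assumption. The sketch shows one way to compute a line and character position from an offset and to keep implementation-language exceptions from reaching the user.

```python
class KleinError(Exception):
    """Hypothetical error type shown to the user of the compiler."""

    def __init__(self, message, line, column):
        super().__init__(
            f"Klein lexical error at line {line}, column {column}: {message}"
        )
        self.line = line
        self.column = column

def position_of(source, offset):
    """Convert a 0-based character offset into a 1-based (line, column)."""
    line = source.count("\n", 0, offset) + 1
    last_newline = source.rfind("\n", 0, offset)  # -1 when on the first line
    return line, offset - last_newline

def find_illegal_character(source, legal):
    """Raise KleinError at the first non-whitespace character not in `legal`."""
    for offset, ch in enumerate(source):
        if ch not in legal and not ch.isspace():
            line, column = position_of(source, offset)
            raise KleinError(f"illegal character {ch!r}", line, column)

def report(action):
    """Run `action`; turn any exception into a Klein-style message string."""
    try:
        return action()
    except KleinError as err:
        return str(err)
    except Exception:
        # Hide the implementation language's own error from the user.
        return "Klein error: internal scanner failure"
```

For example, report(lambda: find_illegal_character("ab\ncd$", "abcd")) yields a message that points at line 2, column 3.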

2. A Token-Listing Program

Once your scanner is able to produce a sequence of tokens:

Write a program that takes a valid Klein program as input and uses your scanner to produce a listing of all the tokens that appear in it.

Display one token per line. For tokens with semantic content, such as an identifier or a number, display the content, too. For the test program print-one.kln, your program might generate something like this:

keyword      function
identifier   main
leftparen                      or:  punct leftparen
rightparen                     or:  punct rightparen
semicolon                      or:  punct semicolon
keyword      integer
identifier   print
leftparen    
integer      1
rightparen   
integer      1
end_of_file
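A listing in this two-column style can be produced with a small formatting helper. The sketch below is only one option: the function name format_token and the 13-character column width are assumptions chosen to match the sample output, not a required format.

```python
def format_token(token_type, value=None):
    """Format one listing line: the type, then the content if there is any."""
    if value is None:
        return token_type
    # Left-justify the type in a 13-character column (an assumed width).
    return f"{token_type:<13}{value}"

def format_listing(tokens):
    """Format a sequence of (type, value) pairs, one token per line."""
    return "\n".join(format_token(kind, value) for kind, value in tokens)
```

For instance, format_listing([("keyword", "function"), ("leftparen", None)]) gives the keyword line with its content and the leftparen line with the type alone.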

3. A New Klein Program

In order to test your scanner,

Write at least one meaningful legal program in Klein.

Your test program should be at least at the level of an early CS 1510 program doing arithmetic. If you are stumped for where to start, try one of these ideas.

You should, of course, produce as many tests as necessary in order to ensure that your scanner works correctly. You are also encouraged to use any test programs available from the project home page.

4. The kleins Command

Finally,

Create an executable or a Unix command-line script named kleins that takes the name of a Klein program as an argument and runs your token-listing program on it.

For example:

$ ./kleins programs/print-one.kln
keyword      function
identifier   main
leftparen
...

This will be the first in a series of tools that makes up the command-line suite of your Klein compiler.
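If your implementation language is a scripting language, kleins can be a thin wrapper around the token-listing program. The following sketch assumes Python. The list_tokens function is a placeholder standing in for your own token lister, and the usage text and exit codes are assumptions rather than requirements.

```python
#!/usr/bin/env python3
"""kleins: list the tokens of a Klein program (sketch)."""
import sys

def list_tokens(source):
    """Placeholder (assumption): a real version would drive the scanner."""
    return source.split()

def main(argv):
    """Read the file named in argv[1] and print one token per line."""
    if len(argv) != 2:
        print("usage: kleins <program.kln>", file=sys.stderr)
        return 2
    try:
        with open(argv[1], encoding="utf-8") as handle:
            source = handle.read()
    except OSError:
        # Report the problem without exposing a language-level traceback.
        print(f"kleins: cannot read {argv[1]}", file=sys.stderr)
        return 1
    for line in list_tokens(source):
        print(line)
    return 0

# In the actual script, finish with: sys.exit(main(sys.argv))
```

Save the file as kleins with no extension and mark it executable with chmod +x kleins so that it runs as ./kleins.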

Deliverables

Submit only one copy of each assignment per team. The team captain or a designated team member can be responsible for the submission.

Status Check

Submit:

  • a list of token types, and
  • a list of positive and negative examples of each.

When creating your examples, pay attention to the edge cases. For example, the value of 2^31 - 1 is a legal integer token, but the value of 2^31 + 1 is not.
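An edge case like this can be written down as an executable check. The sketch below takes 2^31 - 1 as the largest legal value, following the example in the text. The function name is hypothetical, and rejecting 2^31 itself is an assumption to confirm against the language specification.

```python
MAX_INTEGER = 2**31 - 1  # largest legal value, per the example in the text

def is_legal_integer_literal(lexeme):
    """Return True if a string of digits denotes an in-range integer."""
    if not (lexeme.isascii() and lexeme.isdigit()):
        return False  # not a plain sequence of digits
    return int(lexeme) <= MAX_INTEGER
```

Positive and negative examples then come in pairs around the boundary, such as "2147483647" (legal) and "2147483649" (illegal).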

Record your lists in a plaintext file and email the file to the instructor on the status check due date.

Final Deliverable

By the due time and date, use the course submission system to submit your project directory electronically as a compressed archive named project01.zip or project01.tar.gz.