The Klein Language Specification
Introduction
Klein is a small, mostly functional programming language that is designed specifically to be used as a manageable source language in a course on compiler design and implementation. Though small and simple, the language is Turing-complete. Klein was motivated by Doug Baldwin's old teaching language MinimL.
Why is the language called Klein?
I chose the name to indicate the size of the language.
klein
is the German word for "small" or "little". It is one of the
first German words I learned back in the eighth grade.
There are a number of projects in the software world named
"Klein". In particular, this language is not affiliated in
any way with the now-defunct Klein project at Sun Microsystems,
which was "a Self virtual machine written fully in Self".
Self
is a very cool prototype-based object-oriented programming
language — check it out sometime.
Where does the logo come from?
The Klein logo is a picture of
a bottle whose neck feeds in on itself.
This is an example of
a Klein bottle,
"a non-orientable surface ... in which notions of left and
right cannot be consistently defined". A Möbius
strip is a non-orientable surface with boundary; a Klein
bottle is a non-orientable surface without a boundary.
The bottle is not called "Klein" because it is small. It shares
its name with German mathematician
Felix Klein,
who first described this mathematical object in 1882. I chose
this specific image because it is an attractive visual
representation.
Language Specification
This section gives a complete grammar for Klein, followed by a list of lexical features that the grammar does not cover.
Following the grammar and list of syntax features are informal and possibly incomplete textual descriptions of many of the language's features. The purpose of these sections is to clarify the syntax of the language. They are not sufficient on their own; they complement the formal definition. If you have any questions, please ask sooner rather than later.
Grammar
Here is the grammar for Klein. In this grammar...
- Non-terminals are given in UPPERCASE.
- Terminals are given in "double quotes".
- ε, the lowercase Greek letter epsilon, stands for the empty string. It indicates that nothing is a legal alternative.
- The whitespace is intended to aid readability. It is not significant.
<PROGRAM> ::= <DEFINITION-LIST>
<DEFINITION-LIST> ::= ε
| <DEFINITION> <DEFINITION-LIST>
<DEFINITION> ::= "function" <IDENTIFIER> "(" <PARAMETER-LIST> ")" ":" <TYPE> <BODY>
<PARAMETER-LIST> ::= ε
| <FORMAL-PARAMETERS>
<FORMAL-PARAMETERS> ::= <ID-WITH-TYPE>
| <ID-WITH-TYPE> "," <FORMAL-PARAMETERS>
<ID-WITH-TYPE> ::= <IDENTIFIER> ":" <TYPE>
<TYPE> ::= "integer"
| "boolean"
<BODY> ::= <PRINT-EXPRESSION> <BODY>
| <EXPRESSION>
<PRINT-EXPRESSION> ::= "print" "(" <EXPRESSION> ")"
<EXPRESSION> ::= <SIMPLE-EXPRESSION>
| <EXPRESSION> "=" <SIMPLE-EXPRESSION>
| <EXPRESSION> "<" <SIMPLE-EXPRESSION>
<SIMPLE-EXPRESSION> ::= <TERM>
| <SIMPLE-EXPRESSION> "or" <TERM>
| <SIMPLE-EXPRESSION> "+" <TERM>
| <SIMPLE-EXPRESSION> "-" <TERM>
<TERM> ::= <FACTOR>
| <TERM> "*" <FACTOR>
| <TERM> "/" <FACTOR>
| <TERM> "and" <FACTOR>
<FACTOR> ::= <LITERAL>
| "not" <FACTOR>
| "-" <FACTOR>
| <IDENTIFIER>
| <IDENTIFIER> "(" <ARGUMENT-LIST> ")"
| "if" <EXPRESSION> "then" <EXPRESSION> "else" <EXPRESSION>
| "(" <EXPRESSION> ")"
<ARGUMENT-LIST> ::= ε
| <FORMAL-ARGUMENTS>
<FORMAL-ARGUMENTS> ::= <EXPRESSION>
| <EXPRESSION> "," <FORMAL-ARGUMENTS>
<LITERAL> ::= <INTEGER-LITERAL>
| <BOOLEAN-LITERAL>
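The layering of <EXPRESSION>, <SIMPLE-EXPRESSION>, <TERM>, and <FACTOR> is what gives the operators their precedence and left associativity. As a rough illustration, the Python sketch below parses only the operator subset of the grammar (no function calls, no if expressions, no comments) and returns the input with explicit parentheses showing how it groups. The names tokenize, Parser, and group are invented for this sketch. The loop in left_assoc stands in for the grammar's left-recursive rules; that is one common parsing technique, not something this specification prescribes.

```python
import re

# Tokens for the operator subset only; calls, "if", and comments are omitted.
TOKEN = re.compile(r"\s*([0-9]+|[A-Za-z][A-Za-z0-9_]*|[()+\-*/<=])")


def tokenize(text):
    """Split text into a list of token strings."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return token

    def left_assoc(self, operand, operators):
        # Iterative form of a left-recursive rule such as
        # <TERM> ::= <TERM> "*" <FACTOR>.
        left = operand()
        while self.peek() in operators:
            op = self.take()
            left = f"({left} {op} {operand()})"
        return left

    def expression(self):
        return self.left_assoc(self.simple_expression, ("=", "<"))

    def simple_expression(self):
        return self.left_assoc(self.term, ("or", "+", "-"))

    def term(self):
        return self.left_assoc(self.factor, ("*", "/", "and"))

    def factor(self):
        token = self.take()
        if token in ("not", "-"):
            return f"({token} {self.factor()})"
        if token == "(":
            inner = self.expression()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return inner
        return token  # a literal or an identifier


def group(text):
    """Return text with explicit parentheses showing how it parses."""
    parser = Parser(tokenize(text))
    result = parser.expression()
    if parser.peek() is not None:
        raise ValueError("unexpected trailing token")
    return result
```

Under this sketch, group("a or b and c") returns "(a or (b and c))", because and sits at the <TERM> level alongside * and /, and group("a - b - c") returns "((a - b) - c)", because the binary rules are left-recursive.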
Lexical Features
Identifiers
A user-defined identifier is a string beginning with a letter
and consisting of letters, digits, and the underscore
( _ ).
Identifiers are case-sensitive. Upper- and lower-case letters are not considered equivalent.
Identifiers must be no longer than 256 characters.
Reserved Words
These are the reserved words of Klein:
integer boolean true false if then else not and or function print
print is a primitive identifier.
true and false are boolean literals.
The rest are keywords.
Klein reserved words may not be used as identifiers in user-defined names of functions or formal parameters.
Like identifiers, Klein reserved words are case-sensitive.
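A scanner might apply the identifier and reserved-word rules along the following lines. This is a minimal Python sketch, not a required design: the function name classify_word and the category strings are invented here, and the sketch assumes that "letter" means an ASCII letter, which this specification does not state explicitly.

```python
import re

# Assumption: "letter" means an ASCII letter (A-Z, a-z).
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_IDENTIFIER_LENGTH = 256

KEYWORDS = {"integer", "boolean", "if", "then", "else",
            "not", "and", "or", "function"}
BOOLEAN_LITERALS = {"true", "false"}
PRIMITIVES = {"print"}


def classify_word(word):
    """Classify a word; the category names are hypothetical."""
    # Reserved words are matched exactly, so "If" is not the keyword "if".
    if word in KEYWORDS:
        return "keyword"
    if word in BOOLEAN_LITERALS:
        return "boolean literal"
    if word in PRIMITIVES:
        return "primitive identifier"
    if IDENTIFIER.fullmatch(word) is None:
        raise ValueError(f"not a legal identifier: {word!r}")
    if len(word) > MAX_IDENTIFIER_LENGTH:
        raise ValueError("identifier is longer than 256 characters")
    return "identifier"
```

Because matching is case-sensitive, classify_word("if") returns "keyword" while classify_word("If") returns "identifier".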
Integer Literals
An integer literal is a string of digits. Leading zeros are not permitted for non-zero integer literals.
There are no leading plus or minus signs to indicate positive or negative values. Thus, all integer literals are non-negative.
Integer literals must be in the range from 0 to 2^31 - 1, inclusive.
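The integer literal rules can be checked with a short routine such as the Python sketch below. The name parse_integer_literal is invented for illustration, and the upper bound is taken to be 2^31 - 1. One point is an assumption on my part: the sketch accepts zero only when written as the single digit 0 and rejects spellings such as 00, which the rule about leading zeros does not settle.

```python
import re

# Either the single digit 0, or a digit string with no leading zero.
# Assumption: "00" is rejected; the specification leaves this case open.
INTEGER_LITERAL = re.compile(r"0|[1-9][0-9]*")
MAX_INTEGER_LITERAL = 2**31 - 1


def parse_integer_literal(text):
    """Return the value of an integer literal, or raise ValueError."""
    if INTEGER_LITERAL.fullmatch(text) is None:
        raise ValueError(f"malformed integer literal: {text!r}")
    value = int(text)
    if value > MAX_INTEGER_LITERAL:
        raise ValueError(f"integer literal out of range: {text}")
    return value
```

A leading minus sign is not part of the literal, so parse_integer_literal("-3") raises ValueError; in the grammar, the minus is a separate unary operator applied to the literal 3.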
Symbols
These are Klein's operators, punctuation marks, and special characters:
( ) , :
+ - * /
< =
_ (* *)
The first row lists Klein's punctuation. The next two rows list Klein's operators.
Punctuation and operators are self-delimiting. They do not have to be preceded or followed by whitespace to separate them from the next token.
The fourth row lists Klein's special characters.
_ may appear only in identifiers.
(* and *) delimit a comment.
Skippable Characters
Comments and whitespace are ignored by the syntax.
Whitespace consists only of space (" "),
tab ("\t"), and
end-of-line ("\n" and "\r")
characters. It serves to separate tokens. Whitespace
characters may not appear inside a literal,
identifier, keyword, or operator. Otherwise, whitespace is
insignificant.
A comment begins with the characters (* and
continues up to the next occurrence of the characters
*). Any characters inside a comment are
ignored.
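One way a scanner could skip these characters is sketched below in Python. The function name skip_ignorable is invented for illustration. Since a comment ends at the next occurrence of *), the sketch does not treat comments as nesting. Raising an error for a comment that is never closed is an assumption; this specification does not say what should happen in that case.

```python
WHITESPACE = " \t\n\r"


def skip_ignorable(source, pos):
    """Return the index of the first character at or after pos that is
    neither whitespace nor part of a comment."""
    while pos < len(source):
        if source[pos] in WHITESPACE:
            pos += 1
        elif source.startswith("(*", pos):
            # The comment runs to the next "*)"; comments do not nest.
            end = source.find("*)", pos + 2)
            if end == -1:
                # Assumption: an unterminated comment is an error.
                raise ValueError("unterminated comment")
            pos = end + 2
        else:
            break
    return pos
```

For the input "(* a (* b *) c *) d", the sketch stops at c, since the first *) closes the comment regardless of the inner (*.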
Data Types and Values
All values in Klein are either integers or booleans. Nearly every element in a program is an expression that produces an integer result or a boolean result.
There are only two boolean values. The two boolean literals
are true and false.
Klein supports integer values in the range -2^31 to 2^31 - 1.
Identifiers can name booleans or integers.
Compound Expressions
The language provides the following kinds of expression.
Arithmetic
These operators add, subtract, multiply, or divide two integers.
x + y
x - y
x * y
x / y
x / y computes integer division, dropping
the remainder.
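Implementers should take care with negative operands, because host languages disagree about integer division. The Python sketch below reads "dropping the remainder" as truncation toward zero; that reading is an assumption, since the text above does not address negative operands. The name klein_divide is invented for illustration, and division by zero is left to raise Python's own error because this specification does not define it.

```python
def klein_divide(x, y):
    """Integer division that truncates toward zero (assumed reading
    of "dropping the remainder")."""
    quotient = abs(x) // abs(y)
    # The result is negative only when the operands have opposite signs.
    return quotient if (x < 0) == (y < 0) else -quotient
```

With this reading, klein_divide(-7, 2) returns -3, whereas Python's built-in -7 // 2 returns -4 because it rounds toward negative infinity.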
Boolean Comparisons
These operators compare the values of two integers, yielding a boolean value.
x < y
x = y
< yields true if its left operand
is less than its right operand, and false
otherwise.
= yields true if its left operand
has the same value as its right operand, and
false otherwise.
Boolean Connectives
These operators compute a boolean value from one or two boolean operands.
not x
x or y
x and y
The unary not operator yields true
if its operand is false, and false
otherwise.
or yields true if either its left
operand or its right operand yields true, and
false otherwise.
and yields true if both its left
operand and its right operand yield true, and
false otherwise.
Both or and and short-circuit: the right operand is evaluated
only when the left operand does not already determine the result.
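Short-circuit behavior can be observed whenever evaluating an operand has a visible effect. The Python sketch below models each operand as a zero-argument function so that evaluation can be recorded. The helper names klein_or, klein_and, and operand are invented for illustration and are not part of Klein.

```python
def klein_or(left, right):
    """left and right are zero-argument callables standing in for
    the operand expressions."""
    if left():
        return True      # right operand is never evaluated
    return right()


def klein_and(left, right):
    if not left():
        return False     # right operand is never evaluated
    return right()


evaluated = []


def operand(name, value):
    """Build an operand that records its name when it is evaluated."""
    def thunk():
        evaluated.append(name)
        return value
    return thunk


result = klein_or(operand("left", True), operand("right", False))
```

After this runs, result is True and evaluated holds only "left", which shows that the right operand of or was skipped.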
Conditional Selection
The if expression evaluates a test expression
and uses its value to select one of two alternative
expressions to evaluate. It yields the value of the first
alternative if the test expression produces a true value, and
the value of the second if the test expression yields a false
value. The else clause is required.
For example:
if (flag < 0) then x + y else x - y
produces the sum of x and y if flag is less than 0. Otherwise, it produces their difference.
Function Call
A function call applies a function to zero or more arguments, yielding the value of the expression in the body of the function. Arguments are passed by value.
For example:
combine( x+y, 1 )
computes the sum of x and y, passes that value and a 1 as two
arguments to the function combine, and produces
the value returned by applying the function to its arguments.
A function may have zero or more formal parameters. The scope of a formal parameter is the body of the function.
All user-defined functions return either an integer value or a boolean value. Klein has no notion of a "void" function.
Miscellaneous
Compound expressions can be nested to any depth.
Binary operators and function calls evaluate their arguments from left to right.
The only user-defined identifiers in Klein are the names of functions and the names of formal parameters to functions. There are no "variables".
User-Defined Functions
Each function declaration consists of the function's name, its formal parameters and their types, the function's return type, and its body.
Function names are unique.
A function may refer only to its formal parameters and to other functions.
A Klein function may call itself.
Primitive Functions
For the purposes of user interaction, Klein provides the
primitive function print(expression).
For example:
print( x+y )
print writes the value of its argument on standard
output, followed by a newline character.
Unlike a call to a user-defined function, a call to
print has no defined value. For this reason, any calls
to print must come at the top of the function body, before
the expression that produces the function's value.
Programs
A Klein program consists of a sequence of function definitions.
Every program must define a function named main,
which is called first when the program runs.
The result returned by main is printed on
standard output.
Users may provide arguments to main on the
command line.
For example, here is a complete Klein program that computes the absolute value of its argument:
function main(n : integer) : integer
  if n < 0
    then -n
    else n
If this program were compiled into an executable file named
abs, then running it under Unix might look
something like this:
$ abs -3
3
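To make the run above concrete, here is a rough Python model of what the compiled program does: it converts the command-line strings to values, calls main, and prints the result. The function run is invented for illustration. The sketch assumes that arguments are written as decimal integers with an optional minus sign, as in the example; how boolean arguments would be spelled on the command line is not stated here, so the sketch does not handle them.

```python
def main(n):
    # Python rendering of the Klein program: if n < 0 then -n else n
    return -n if n < 0 else n


def run(argv):
    """Model running the compiled program with the given argument
    strings; return the line it would print."""
    # Assumption: every argument is a decimal integer such as "-3".
    args = [int(text) for text in argv]
    return str(main(*args))
```

Here run(["-3"]) returns "3", matching the session shown above.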