Session 17 Bonus Reading
Type Checking a Little Language
Introduction
Languages more complex than Klein have many more constructs. Working through how to type-check other kinds of expressions may help you think about type checking more generally and about type checking a simple language like Klein. This bonus reading lets you do that on your own.
If you have any questions, be sure to ask.
Type Checking a Small Language
Consider this simple language, which generates programs that consist of any number of declarations followed by a single expression:
P → D ; E
D → D { ; D }
| id : T
T → char
| integer
| array [num] of T
| ↑T
E → literal
| num
| id
| E mod E
| E [ E ]
| E↑
This language has two basic types, char and
integer, and two constructed types, arrays and
pointers.
Each array has the index set [0..num-1], where
num is the declared size of the array. Expressions
include the integer operation mod, array
dereferencing, and pointer dereferencing.
In this language, all identifiers are declared prior to being used. Here are two programs generated by the grammar:
year: integer;
year mod 1970

a: integer;
b: char;
c: array [10] of integer;
d: ↑integer;
c[d↑] mod c[a]
A type checker for this language can first build type expressions for each declared identifier and then compute the type of the program's expression. We can implement the semantic actions needed to type-check a program in separate arms of the type checker code.
Let's consider declarations first. These actions require the program to record basic types and assemble constructed types from their components:
P → D ; E
D → D ; D
  | id : T                 addType(id.value, T.type)
T → char                   T.type ← char
  | integer                T.type ← integer
  | array [num] of T1      T.type ← array([0..num.value-1], T1.type)
  | ↑T1                    T.type ← pointer(T1.type)
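The declaration actions above can be sketched in Python. This is only one possible encoding, not the course's implementation: the names `Array`, `Pointer`, `symbol_table`, and `add_type` are hypothetical, and the array's index set [0..num-1] is represented by its size alone.

```python
from dataclasses import dataclass

# Basic types are plain strings. Constructed types are small immutable
# records, so two structurally identical type expressions compare equal.
CHAR = "char"
INTEGER = "integer"

@dataclass(frozen=True)
class Array:
    size: int       # stands in for the index set [0..size-1]
    elem: object    # type expression of the elements

@dataclass(frozen=True)
class Pointer:
    target: object  # type expression being pointed to

symbol_table = {}   # hypothetical symbol table: identifier -> type expression

def add_type(name, type_expr):
    """Counterpart of addType(id.value, T.type)."""
    symbol_table[name] = type_expr

# Declarations of the second example program.
add_type("a", INTEGER)
add_type("b", CHAR)
add_type("c", Array(10, INTEGER))   # c: array [10] of integer
add_type("d", Pointer(INTEGER))     # d: ↑integer
```

Using frozen dataclasses means that comparing two type expressions with `==` compares their structure, which is what the `=` tests in the expression rules need.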
Now let's consider the types of expressions. For literal values, the types are basic types:
E → literal      E.type ← char
E → num          E.type ← integer
Identifiers have types associated with them in the symbol table, recorded when their declarations were processed:
E → id E.type ← lookupType(id.value)
Types for the three remaining kinds of expressions must be
computed. Because they must be computed from parts that have
specific types, there is the possibility of a type error. For
example, the integer operation mod requires
integer arguments.
E → E1 mod E2 E.type ←
if E1.type = integer and
E2.type = integer
then
integer
else
type error
Similarly, array indices must be integers. At this point, we don't care about the value of the index (is it in the index set?), only that the types match up.
E → E1 [ E2 ] E.type ←
if E1.type = array(s, t) and
E2.type = integer
then
t
else
type error
Finally, pointer dereferencing works only for pointer types, and returns the pointed-to type:
E → E1↑ E.type ←
if E1.type = pointer(t)
then
t
else
type error
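Taken together, the six expression rules amount to one recursive function with a branch per production. The sketch below is one way to write it in Python, under several assumptions: expressions arrive as nested tuples such as `("mod", e1, e2)`, the symbol table is a plain dictionary, and a type error is signalled with a hypothetical `TypeCheckError` exception. None of these choices come from the reading itself.

```python
from dataclasses import dataclass

CHAR, INTEGER = "char", "integer"

@dataclass(frozen=True)
class Array:
    size: int       # stands in for the index set [0..size-1]
    elem: object

@dataclass(frozen=True)
class Pointer:
    target: object

class TypeCheckError(Exception):
    """Hypothetical exception used here to signal 'type error'."""

def type_of(expr, symbols):
    """Compute E.type for a nested-tuple expression."""
    kind = expr[0]
    if kind == "literal":                      # E → literal
        return CHAR
    if kind == "num":                          # E → num
        return INTEGER
    if kind == "id":                           # E → id
        if expr[1] not in symbols:
            raise TypeCheckError(f"undeclared identifier {expr[1]}")
        return symbols[expr[1]]
    if kind == "mod":                          # E → E1 mod E2
        left, right = type_of(expr[1], symbols), type_of(expr[2], symbols)
        if left == INTEGER and right == INTEGER:
            return INTEGER
        raise TypeCheckError("mod requires integer operands")
    if kind == "index":                        # E → E1 [ E2 ]
        array, index = type_of(expr[1], symbols), type_of(expr[2], symbols)
        if isinstance(array, Array) and index == INTEGER:
            return array.elem
        raise TypeCheckError("indexing requires an array and an integer index")
    if kind == "deref":                        # E → E1↑
        pointer = type_of(expr[1], symbols)
        if isinstance(pointer, Pointer):
            return pointer.target
        raise TypeCheckError("dereferencing requires a pointer")
    raise TypeCheckError(f"unknown expression kind {kind}")

# Second example program: c[d↑] mod c[a]
symbols = {"a": INTEGER, "b": CHAR,
           "c": Array(10, INTEGER), "d": Pointer(INTEGER)}
program = ("mod",
           ("index", ("id", "c"), ("deref", ("id", "d"))),
           ("index", ("id", "c"), ("id", "a")))
result = type_of(program, symbols)
```

For the example program, `result` is the integer type. An expression such as `c[b]` raises `TypeCheckError`, because `b` is a char and array indices must be integers.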
That's pretty much how it works. Not too bad!
A Quick Exercise
Extend the language and its type-checking actions with:
- a boolean data type
- a comparison operation "E < E"
- logical operations "! E" and "E && E"
Check out the solution below.
Type checking expressions really is quite simple, a matter of verifying argument types and setting result types. At the top level, it is straightforward structural recursion. At the bottom level, it is straightforward selection.
Type Checking Statements
Consider this change and addition to our simple language, which introduces statements:
P → D ; S
S → id := E
| if E then S
| while E do S
| S { ; S }
Not much new is required. The same techniques that type-check expressions also work for type-checking statements. In most languages, statements do not really have types, or need them. We could assign meaningful types to statements if we wished, but the more common approach in procedural languages is to give every well-typed statement the same surrogate type: void.
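Here is a rough Python sketch of statement checking in that style. The reading does not spell out the rules for statements, so the ones used here are assumptions: an assignment requires the identifier and the expression to have the same type, the condition of an if or while must be boolean (borrowing the boolean type and `<` operation from the exercise), and every well-typed statement gets the type void. The tuple encoding, `check_stmt`, and the tiny stand-in `type_of_expr` are hypothetical, and identifiers are assumed to be declared.

```python
VOID, INTEGER, BOOLEAN = "void", "integer", "boolean"

class TypeCheckError(Exception):
    """Hypothetical exception used here to signal 'type error'."""

def type_of_expr(expr, symbols):
    """Tiny stand-in for the expression checker: numbers, identifiers, <."""
    kind = expr[0]
    if kind == "num":
        return INTEGER
    if kind == "id":
        return symbols[expr[1]]
    if kind == "<":
        left = type_of_expr(expr[1], symbols)
        right = type_of_expr(expr[2], symbols)
        if left == INTEGER and right == INTEGER:
            return BOOLEAN
        raise TypeCheckError("< requires integer operands")
    raise TypeCheckError(f"unsupported expression {kind}")

def check_stmt(stmt, symbols):
    """Return VOID if stmt is well typed; raise TypeCheckError otherwise."""
    kind = stmt[0]
    if kind == "assign":                       # S → id := E
        _, name, expr = stmt
        if symbols[name] != type_of_expr(expr, symbols):
            raise TypeCheckError(f"cannot assign to {name}: types differ")
        return VOID
    if kind in ("if", "while"):                # S → if E then S | while E do S
        _, cond, body = stmt
        if type_of_expr(cond, symbols) != BOOLEAN:
            raise TypeCheckError(f"{kind} condition must be boolean")
        check_stmt(body, symbols)
        return VOID
    if kind == "seq":                          # S → S { ; S }
        for part in stmt[1:]:
            check_stmt(part, symbols)
        return VOID
    raise TypeCheckError(f"unknown statement kind {kind}")

# while i < 10 do i := 0
symbols = {"i": INTEGER, "done": BOOLEAN}
loop = ("while", ("<", ("id", "i"), ("num", 10)), ("assign", "i", ("num", 0)))
result = check_stmt(loop, symbols)
```

The structure mirrors the expression checker: recursion over the statement's parts, then a simple test at each node.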
Exercise Solution
Here are the changes we should make, along with the corresponding type-checking actions:
T → ...
| boolean T.type ← boolean
E → ...
| E1 < E2 E.type ←
if E1.type = integer and
E2.type = integer
then
boolean
else
type error
| ! E1 E.type ←
if E1.type = boolean
then
boolean
else
type error
| E1 && E2 E.type ←
if E1.type = boolean and
E2.type = boolean
then
boolean
else
type error
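The three new expression rules translate directly into code. The Python sketch below covers only numbers, identifiers, and the new operations, to stay short. The tuple encoding and the names `type_of` and `TypeCheckError` are illustrative assumptions, and identifiers are assumed to be declared.

```python
INTEGER, BOOLEAN = "integer", "boolean"

class TypeCheckError(Exception):
    """Hypothetical exception used here to signal 'type error'."""

def type_of(expr, symbols):
    """Compute E.type for numbers, identifiers, <, ! and &&."""
    kind = expr[0]
    if kind == "num":
        return INTEGER
    if kind == "id":
        return symbols[expr[1]]
    if kind == "<":                            # E → E1 < E2
        left, right = type_of(expr[1], symbols), type_of(expr[2], symbols)
        if left == INTEGER and right == INTEGER:
            return BOOLEAN
        raise TypeCheckError("< requires integer operands")
    if kind == "!":                            # E → ! E1
        if type_of(expr[1], symbols) == BOOLEAN:
            return BOOLEAN
        raise TypeCheckError("! requires a boolean operand")
    if kind == "&&":                           # E → E1 && E2
        left, right = type_of(expr[1], symbols), type_of(expr[2], symbols)
        if left == BOOLEAN and right == BOOLEAN:
            return BOOLEAN
        raise TypeCheckError("&& requires boolean operands")
    raise TypeCheckError(f"unknown expression kind {kind}")

# !(x < 3) && flag
symbols = {"x": INTEGER, "flag": BOOLEAN}
expr = ("&&", ("!", ("<", ("id", "x"), ("num", 3))), ("id", "flag"))
result = type_of(expr, symbols)
```

Note that `<` is the only rule here whose result type differs from its argument types: it takes integers and produces a boolean.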