CS 3540 Session 29

Session 29
The Next Big Thing?

Where Are We?

You are implementing the third version of a small language for programming numbers, Boom. For Homework 9 and Homework 10, you produced tools for a language consisting of numbers, operators for manipulating them, and local variables. For Homework 11, you are adding sequences of statements and mutable data.

We are closing the semester by considering other programming language issues you hear about in modern software development. Last time, we learned about a technique for optimizing an interpreter, using an odd little language for our laboratory. Today we consider another odd little language, the likes of which we may encounter in the future.

A Quick Puzzle: An Odd Little Language

There's this language that operates on a stack...

Demo two simple programs and their stacks.
Show how the stack operators work.

Read the top of this handout, which describes a Turing-complete stack-based language.

Then trace the three programs at the bottom of the page.


302 1000 500.0 / -.



4.95 5.0 0.1 rollup - abs >=.



10 6 swap dup * swap dup * - sqrt.

Some of these may look familiar... if you squint hard enough.

The Future, Then and Now

The year was 1996.
Bill Clinton was elected to a second term as president.
The Macarena dance craze swept the country, perhaps afflicting your parents.
Most of you were not yet born. Perhaps you have older siblings or cousins who were toddlers.
That year, the CS faculty re-designed this course to achieve two goals. The first was to teach a set of principles about programming languages. These principles had to lay the foundation for students who would be working in industry until 2040. The second was to introduce students to functional programming, which had no other home in our curriculum.
Many CS faculty thought the idea of functional programming was cool, though they never used it themselves. Good mental hygiene, they said. Exercise for the mind. But no one uses this stuff in industry, right?

For some of us, though, the functional programming part of the course had practical goals as well as theoretical. We believed that it might well come to industry. Students learned a lot in the course, but they sometimes wondered why they were learning it...
Jump forward to 2002. Java programmers are building big OO systems for "the enterprise". People begin to talk about the Value Object pattern and immutable Money objects. We begin to see the rise of functional style within Java, an OO language.
Jump forward more than a decade, to 2015. Clojure, a cousin of Racket, is being used in production code across the country. Serious stuff: big data processing, web services, banking.
Even in Cedar Falls, Iowa. Scala and Scalaz become a major part of one local company's programming stack.
Out in the world, many companies build their entire tech stack on functional programming.
Now it is 2025. What does our future hold? It's hard to know, because it is hard to predict the future.
In the White House, we have Donald Trump, who replaced Joe Biden, who replaced... Donald Trump, who replaced Barack Obama. All three are proof that politics is hard to predict, even in the short run.
We survived Gangnam Style and a UNI dance craze all its own. Something else has replaced them by now. Both are proof that no generation escapes the mocking of its children.
Now, our dance crazes all break out on TikTok. We can predict with some confidence that a new dance will catch fire in the future, but it's impossible to know what it will be or who will trigger it.
If you are like me when I was in college, you feel like a strange blend of the little kids you were and the young professionals you are becoming.
Many of you in this class will be working in the computing industry deep into the 2060s, maybe even to 2070.

Functional programming was a story for the future in 1996. The course has evolved quite a bit since then, and it still focuses on laying a solid foundation. But now FP isn't the "new" future. It is the present: in recent languages such as Swift, Scala, and Clojure, in updates to mainstream languages such as Java and C++, and in libraries for JavaScript such as React and Angular.

What's the story of 2035? Or 2060? I can make only an educated guess.
It may look something like this:
[swap dip dup dip pop] dip dup dip pop

Strangely, the future of programming may look and sound like 1940s jazz scat. Hey to the great Ella Fitzgerald.

A Different Road

We have stretched your mind this semester with Racket's prefix notation. Postfix notation isn't much different. Consider this little postfix interpreter that supports binary operators:

> (postfix 2)
'(2)

> (postfix 2 3 +)
'(5)

The expressions passed to postfix are programs in what is sometimes called a stack-based language. Postfix notation, also called Reverse Polish notation, is the first thing that many conventional programmers notice when working in a stack-based language. It corresponds to a postorder traversal of a program tree.

We can write longer programs, too:

> (postfix 2 3 + 5 *)
'(25)

This program is equivalent to 2 + 3 * 5 . Er, make that (2 + 3) * 5. As long as we know the arity of each procedure, postfix notation requires no rules for the precedence of operators.

My little interpreter is returning the state of its data stack. The parentheses expose that I've implemented my stack using a Racket list. Our stack could finish with more than one value on it:

> (postfix 2 3 + 5 * 6 3 /)
'(2 25)

Adding an operator to the end of our program can return us to a single-value stack:

> (postfix 2 3 + 5 * 6 3 / -)
'(23)

This points out a more general feature of stack programs: we can compose programs simply by concatenating their source code:

> (postfix 2 3 +)
'(5)

> (postfix 2 3 + 2 *)
'(10)

2 * is like a new program — "double my argument" — though, in this style of programming, there are no arguments. All programs read from the stack and leave their results there. The * operator requires two arguments, but the program pushes only one. So this program requires the stack to contain at least one value.

We can place this program immediately after any other program, such as 2 3 +, and create a new program. For this reason, another name for this programming style is concatenative programming. It turns out that the stack is really just an implementation detail, though an especially convenient one.
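To make the idea concrete, here is a minimal sketch of a postfix evaluator in Python. It is not the Racket code from today's file: the name `BINARY_OPS`, the string-based input, and the optional starting `stack` parameter are all assumptions made for illustration. Like the interpreter in the notes, it returns its data stack as a list with the top of the stack first.

```python
import operator

# Binary operators this sketch understands (an illustrative set, not the course's).
BINARY_OPS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}

def postfix(source, stack=None):
    """Evaluate a postfix program and return the stack, top of stack first."""
    stack = list(stack or [])
    for token in source.split():
        if token in BINARY_OPS:
            right = stack.pop(0)      # most recently pushed value
            left = stack.pop(0)
            stack.insert(0, BINARY_OPS[token](left, right))
        else:
            stack.insert(0, float(token) if "." in token else int(token))
    return stack

print(postfix("2 3 + 5 *"))            # [25]
# Composing two programs is just concatenating their source code.
print(postfix("2 3 +" + " " + "2 *"))  # [10]
# "2 *" on its own needs at least one value already on the stack.
print(postfix("2 *", stack=[5]))       # [10]
```

One difference from the Racket transcript: Python's `/` always produces a float, so `6 3 /` leaves `2.0` on the stack rather than `2`.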

The Joy Language

The language you played with in our opening exercise is Joy. It is a Turing-complete stack-based language in which programs are written postfix.

Joy style is concatenative. As we saw above, if we have two sub-programs that compute partial results, we concatenate two sub-programs to compute a compound result. For example, two plus (two times the square root of 16) is:

16 sqrt 2 *   2 +
------- ---   ---
        ^     ^
        |     |-- second program
        |
        |-- first program

If we look at a program written in postfix notation, it seems that data flows from right to left, and the application of operators flows from left to right.

Look at the source file.

You traced three programs on the opening exercise. All were problems on Homework 2.

Solutions to these problems are included in a file of Joy programs in today's code. Things to note:

Joy allows multi-line comments using Pascal's (* ... *) markers.
We can create and name new operators using the DEFINE directive. Semicolons separate consecutive definitions.

What Makes Joy Different?

The same file of Joy programs also includes implementations for two modular arithmetic functions, times and minus, which I sometimes assign as homework problems (in Racket, of course). These problems are complicated enough that we start to see some of the things that make languages like Joy different.

Different operators take different numbers of arguments off the stack. Each operator knows its own arity.
Programmers use comments to show the state of the stack after a meaningful set of operations. Without them, most readers would get lost quickly. For example:
```
x y z -> z z x y
```
Joy has many stack operators. They don't compute new values; they rearrange the stack in order to enable a computation.
- dup makes a copy of the value on top of the stack and pushes it.
```
x -> x x
```
- swap reverses the two values on top of the stack.
```
x y -> y x
```
- rollup moves items two and three on the stack up to the top of the stack.
```
x y z -> z x y
```
- rolldown moves the top two values down on the stack beneath the third item.
```
x y z -> y z x
```
- rollupd behaves like rollup, but it dips under the top of the stack to do it.
```
x y z w -> z x y w
```
Joy has its own patterns and idioms. For example:
- dup rolldown moves two copies of the top of the stack under the next item:
```
x y -> y y x
```
- dup rollupd rollup moves two copies of the top of the stack under the next two items:
```
x y z -> z z x y
```
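The stack comments above can be checked mechanically. The following is a small sketch in Python, assuming a stack is modeled as a list whose last element is the top, so that it reads the same way as the comments. The function names mirror the Joy operators, but the helper `run` is hypothetical.

```python
def dup(s):      return s[:-1] + [s[-1], s[-1]]          # x -> x x
def swap(s):     return s[:-2] + [s[-1], s[-2]]          # x y -> y x
def rollup(s):   return s[:-3] + [s[-1], s[-3], s[-2]]   # x y z -> z x y
def rolldown(s): return s[:-3] + [s[-2], s[-1], s[-3]]   # x y z -> y z x
def rollupd(s):  return rollup(s[:-1]) + [s[-1]]         # x y z w -> z x y w

def run(stack, *ops):
    """Apply the operators left to right, as a Joy program would."""
    for op in ops:
        stack = op(stack)
    return stack

# The two idioms from the notes:
print(run(["x", "y"], dup, rolldown))              # ['y', 'y', 'x']
print(run(["x", "y", "z"], dup, rollupd, rollup))  # ['z', 'z', 'x', 'y']
```

Using symbolic values such as "x" and "y" instead of numbers makes it easy to compare each result against its stack comment.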

This style is very different. Learning it is a much bigger change than learning Racket, or even BF.

What are those bracketed bits of code? Brackets denote a list. A list can contain operators. There are higher-order functions that treat lists as programs!

[1 2 3] [4 5 6 7] concat.

[1 2 3 4] [dup *] map.

DEFINE square ==  dup  *.
[1 2 3 4] [square] map.

5  [1]  [*]  primrec.

The last example shows us that:

DEFINE factorial == [1]  [*]  primrec.
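If the bracketed programs feel mysterious, a rough Python model may help. In this sketch a quoted program is represented as a function from stack to stack, and the stack is a list with its top at the end. The names `quote`, `push`, `joy_map`, and `primrec` are illustrative, and both combinators are simplified: this `joy_map` gives the quoted program a fresh one-item stack for each element, and this `primrec` handles only non-negative integers.

```python
def quote(*ops):
    """Bundle operators into a quoted program: a function from stack to stack."""
    def program(stack):
        for op in ops:
            stack = op(stack)
        return stack
    return program

def push(value):
    return lambda stack: stack + [value]

def dup(stack):
    return stack + [stack[-1]]

def mul(stack):
    return stack[:-2] + [stack[-2] * stack[-1]]

def joy_map(items, program):
    """[items] [program] map: run program once per item, collect the results."""
    return [program([item])[-1] for item in items]

def primrec(n, initial, combine):
    """n [initial] [combine] primrec, for a non-negative integer n."""
    stack = list(range(n, 0, -1))   # push n, n-1, ..., 1
    stack = initial(stack)          # the result for the base case
    for _ in range(n):              # fold each pushed number into the result
        stack = combine(stack)
    return stack[-1]

square = quote(dup, mul)
print(joy_map([1, 2, 3, 4], square))            # [1, 4, 9, 16]
print(primrec(5, quote(push(1)), quote(mul)))   # 120
```

The second call unfolds to the program 5 4 3 2 1 1 * * * * *, which is why [1] [*] primrec behaves as factorial.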

This language may look impossible to you after only a few minutes. Even so, we can write amazing programs in Joy:

This file contains a function that solves the Towers of Hanoi.
```
"jp-nestrec.joy" include.
[]  3  r-hamilhyp.
```
This file contains a Joy interpreter written in Joy.

Yes, those programs look weird — really weird. They may even look hard or scary. You probably felt the same way about Racket fourteen weeks ago. Take a look at the code you are writing for Homework 11 now. Python and Java programs look weird and scary to many beginners — until they get used to them!

A Joy Interpreter

Now that you have begun to write your own interpreter for a small language, you can appreciate what it takes to implement Joy. Knowing how interpreters work can also help you understand the language better! Take a look at this Joy interpreter, written in Racket.

I wrote a simple string parser, so that I don't have to rely on Racket to read Racket-friendly expressions. This means I don't need to wrap parentheses around my sentences.

In the interpreter itself, we see the same sort of recursive evaluator we saw in the postfix interpreter earlier, with a dose of your own language interpreter style...

What Really Makes Joy Different?

Joy really looks different now. What are the big ideas?

Here's one: No names for data.

Functional programming gave up state and assignments, but it still used names: parameters and local variables. Joy uses the stack for all of its data values, so we don't need to name arguments or parameters.

We don't apply functions. We compose them.

We say that Racket and languages like it (every language you use?) are applicative. Think about a Racket expression: in (+ 2 3), the evaluator applies the + procedure to its arguments 2 and 3. The + is treated differently from the 2 and the 3. We can pass functions as arguments, but they are treated differently when used.

To find the square root of the square root of a number in Racket, we would say (sqrt (sqrt num)). In Python and Java, we would say sqrt(sqrt(num)). These apply the square root function twice.

Concatenative programming uses function composition rather than function application. This is the defining difference between languages such as Joy and the languages most of us use on a daily basis.

Here is the square root of the square root in Joy:

sqrt sqrt               # computes (sqrt (sqrt arg))

In practical terms, writing code in a concatenative language is not all that different from programming in a functional language, except that there is less nesting of functions. Rather than writing:

(f0 (f1 (f2  ... (fn x) ...)))

we could write:

x fn ... f2 f1 f0

But under the hood, there is a big difference.

Racket is based on the lambda calculus, as are most functional programming languages. The lambda calculus is simple, yet it requires three kinds of term: variables, lambdas, and applications. It also requires several rules for replacing variable names with their values, as well as the concepts of bindings, closures, and scope. This is quite a bit of complexity.

Concatenative languages have a much simpler core. They require only functions and compositions. We don't even need an evaluation rule, because evaluation is just the composition of functions. It never has to deal with named state, so there are no variables. Without variables, there is no mutation. This means that concatenative languages are in a certain sense more functional than the languages we usually call "functional"!

I just said, "There are only functions and compositions". But wait. Recall this program, a reordering of the one from above:

2   16 sqrt 2 *   +
-   -----------   -

The + is easy enough to fit into the system. But what about the 2? Or the 16 sqrt 2 * part?

All three parts are programs. Everything is a function, being composed by concatenation. Even 2 is a function: a constant function that takes no arguments from the stack:

          2 === (lambda () 2)                  # I'm mixing Racket
16 sqrt 2 * === (lambda () (* (sqrt 16) 2))    # and Joy syntax...

This approach has a different sort of unity of representation that leads to a need for a new kind of data type for functions.
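The claim that everything is a function, composed by concatenation, can be sketched in Python. Here every token, including a number, becomes a function from stack to stack, and a program is the left-to-right composition of its tokens. The names `literal`, `unary`, `binary`, `WORDS`, `compose`, and `program` are all assumptions made for this illustration, and numbers are read as floats for simplicity.

```python
import math
from functools import reduce

def literal(value):
    """A number is a function too: it pushes itself onto the stack."""
    return lambda stack: stack + [value]

def unary(fn):
    return lambda stack: stack[:-1] + [fn(stack[-1])]

def binary(fn):
    return lambda stack: stack[:-2] + [fn(stack[-2], stack[-1])]

WORDS = {"sqrt": unary(math.sqrt),
         "+": binary(lambda a, b: a + b),
         "*": binary(lambda a, b: a * b)}

def compose(*functions):
    """Compose stack -> stack functions, applying them left to right."""
    return lambda stack: reduce(lambda s, f: f(s), functions, stack)

def program(source):
    """Turn source text into one function by composing its tokens."""
    return compose(*(WORDS[t] if t in WORDS else literal(float(t))
                     for t in source.split()))

# The three parts of the program are themselves programs...
two, middle, plus = program("2"), program("16 sqrt 2 *"), program("+")
# ...and composing them gives the same function as concatenating the source.
print(compose(two, middle, plus)([]))    # [10.0]
print(program("2 16 sqrt 2 * +")([]))    # [10.0]
print(program("sqrt sqrt")([16.0]))      # [2.0]
```

Notice that nothing here names an argument. The only plumbing is the stack that each function receives and returns.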

Why Might This Be The Future?

New section — still just an outline....

From the machine's perspective

Many computer architectures are stack-based. The virtual machines for Python, Java, and JavaScript all run an assembly-like bytecode that operates on a stack.

For example, the JavaScript function function(x, y) { return (x + y) == 1; } compiles to:

LocalGet 0
LocalGet 1
Add
Const 1
Eq
Return
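To see how a machine might execute a listing like this one, here is a toy stack machine in Python. It is only a sketch: the tuple encoding of instructions and the name `run_stack_code` are assumptions, and no real virtual machine is implemented this simply.

```python
def run_stack_code(code, local_vars):
    """Execute a list of instruction tuples against an operand stack."""
    stack = []
    for op, *args in code:
        if op == "LocalGet":          # push the value of a local variable
            stack.append(local_vars[args[0]])
        elif op == "Const":           # push a constant
            stack.append(args[0])
        elif op == "Add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "Eq":
            right, left = stack.pop(), stack.pop()
            stack.append(left == right)
        elif op == "Return":          # the result is whatever is on top
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise ValueError("program ended without Return")

# The listing above, encoded as tuples (x is local 0, y is local 1).
CODE = [("LocalGet", 0), ("LocalGet", 1), ("Add",),
        ("Const", 1), ("Eq",), ("Return",)]

print(run_stack_code(CODE, [0, 1]))   # True
print(run_stack_code(CODE, [2, 3]))   # False
```

Read as a postfix program, the listing is just x y + 1 ==, which is why stack code and concatenative source code look so much alike.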

From the programmer's perspective

Stack-based languages operate at a higher level of abstraction. We can manipulate functions without worrying about lower-level details. There are no names for args or parameters. All types are compositions of other types.

Programs are sequences

It is easy to manipulate Joy programs. They are simply sequences of tokens. This supports refactoring, combining, and packaging code.

As a result, they are amenable to a particular sort of machine learning: genetic programming.

The idea behind genetic programming is enticing: Programs are "encoded" as genes that can be modified using an evolutionary algorithm. A population of programs can evolve toward a program most fit for solving a problem.

Joy programs already look like genes!

This file contains two operators for genetic programming. Try:

(crossover '(16 sqrt 2 * 2 +)
           '(2 3 + 5 * 6 3 / -))

(mutate '(16 sqrt 2 * 2 +))
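For readers who want to see what such operators might do, here is one possible sketch in Python. It is not the code in today's file, which may work differently. It assumes one-point crossover and single-token mutation, and the parameters `vocabulary` and `rng` are hypothetical additions for illustration.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """One-point crossover: a prefix of parent_a followed by a suffix of parent_b."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]

def mutate(program, vocabulary, rng=random):
    """Replace one randomly chosen token with a random token from vocabulary."""
    if not program:
        return list(program)
    position = rng.randrange(len(program))
    return program[:position] + [rng.choice(vocabulary)] + program[position + 1:]

# Programs are just token sequences, so slicing and splicing always
# produces another syntactically valid program.
a = "16 sqrt 2 * 2 +".split()
b = "2 3 + 5 * 6 3 / -".split()
rng = random.Random(29)   # a seeded generator makes runs repeatable
print(crossover(a, b, rng))
print(mutate(a, ["dup", "swap", "+", "*", "2"], rng))
```

A child program is always well-formed, but it may not run: a randomly spliced program can ask for more values than the stack holds. A fitness function would have to penalize such programs.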

Stack-based languages offer a simple test bed for machine learning. We can find new solutions to problems, optimal solutions, etc.

Demo Alex's program.

Concatenative Programming in the World

Unix

Unix operating systems such as Linux have many connections to concatenative programming. They come with many built-in tools, including dc:

dc is the oldest language on Unix; it was written on the PDP-7 and ported to the PDP-11 before Unix [itself] was ported.
— Ken Thompson

dc is probably in your Unix:

4 7 + p

(Many high school students learned to program by hacking on their Hewlett-Packard HP-28 and HP-48 calculators back in the '80s and '90s, using their stack-based language, "RPL".)

But the most common example of concatenative programming in the wild is Unix pipes. They operate on string values, but their mode of operation matches what Joy does. Instead of stack operators like dup and swap, they have various operators for manipulating the data stream:

the pipe, |, which sends the output of one program as input to the next program in line
operators for sending, receiving, and redirecting multiple streams, such as n< and 2>&1

Programming Languages

As noted above, concatenative languages lie at the heart of several systems used by traditional programmers every day, including the Java virtual machine, the CPython engine, and JavaScript running in the browser.

The Forth programming language is used to program embedded microcontrollers, especially in the medical industry.

The language that drives many computer printers, PostScript, is a dynamically typed, concatenative programming language. Check out the examples on the Wikipedia page. After having seen some Joy, PostScript will look familiar. It's actually a very readable language.

Mainstream Programming?

Will Joy be the story of 2035 or 2060? I doubt it, but concatenative programming may be. For a more likely candidate language, check out Factor. It is a concatenative language that aspires to be a more complete tool for systems programming and has some interesting features, including static type checking and modules. Though it is still young, Factor has good cross-platform support, an IDE with a modern feel, and a growing open-source community.

Links

Today's code file contains three interpreters:

a simple postfix evaluator written in Racket, in the v1-postfix.rkt file
a simple string-based Joy interpreter, v2-simple-joy.rkt, written in Racket. This implementation is only a start, and is missing a few key language features, such as DEFINE. However, I wrote it in the style of our other CS 3540 interpreters, which means you should be able to follow the code, or even modify or extend it. Don't forget to run a double-quoted string:
```
> (run "5 2 swap dup * swap dup * -")
'(21)
```
a mirror of the C and Joy source code for the Standard Joy interpreter in the standard-joy/ directory. I copied these directly from the main Joy website (see below). To compile an executable on your machine, go into the src/ subdirectory and type make. Move the new joy file up to the standard-joy/ directory in place of the file that is there.

When you run the executable, you won't see a prompt, but it is a REPL. Don't forget the period:
```
5 2 swap dup * swap dup * -.
21
```

And now a few optional links, for the interested:

image credits for the slides I used in my historical review
The main Joy website, which includes source code for the canonical implementation of the language as well as documentation and tutorials, has disappeared from the web. In the meantime, here is the tutorial from which I got some of my examples and a glossary of the operator types included in the language.
For this session, I used this blog entry by a Joy newbie for several examples.
If you would like to read more about Forth, check out the language homepage. If you really want to dive in, check out this issue of Read-Eval-Print-Love about Forth and a cool implementation of it in Ruby.
There are several good tutorials on concatenative languages out on the web, most of which are compatible with my discussion of Joy here. When I first wrote this lecture, I used Why Concatenative Programming Matters for several examples. The author has links to a programming language of his own design, too. This tutorial goes deep quickly, so don't worry if you can't follow the whole discussion.

Wrap Up

Reading
- Review these lecture notes and code. Pay special attention to the simple string-based Joy interpreter.
- [optional] — If you want to read more about how this style of programming affects how we think about types, read this reading on types in Joy.
Homework
- Homework 11 is available and due on Friday.
