[ accompanying slides ]
The year was 1996. Bill Clinton was elected to a second term as president. The Macarena dance craze swept the country. Most of you were... toddlers.
That year, we redesigned this course to achieve two goals. The first was to teach a set of principles about programming languages. These principles had to lay the foundation for students who would be working in industry until 2040. The second was to introduce students to functional programming, which had no required spot in our curriculum.
Many CS faculty thought functional programming was cool, though they never used it themselves. Good hygiene, they said. Exercise for the mind. But it's not practical, right? For some of us, though, the functional programming part of the course served purposes both theoretical and practical: we knew that it was coming to industry. So students learned a lot in the course, but they sometimes wondered why...
Jump forward to 2003. Java programmers are building big OO systems for "the enterprise". People begin to talk about the Value Object pattern and immutable objects. We begin to see the rise of functional style in an OO language.
Jump forward a decade, to 2013. Scala, Clojure, and even a little Haskell are being used in production code across the country. Serious stuff: data processing, web services, banks. Cedar Falls, Iowa, not (just) Silicon Valley.
Now it is 2015. Barack Obama is serving his second term as president. We have all survived Gangnam Style. Most of you are... grown-ups.
What's the story of 2025? Or 2035? I can make only an educated guess. It may sound something like this:
[swap dip dup dip pop] dip dup dip pop
(Who knew the future of programming would sound like a 1940s jazz scat?)
Let's look at what is sometimes called a stack-based language.
: 2 .
2
: 2 3 + .
5
Postfix notation, also called Reverse Polish notation, is the first thing that conventional programmers notice when working in a stack-based language. It corresponds to a post-order traversal of a program tree.
We can write longer programs, too:
: 2 3 + 5 * .
25
This program is equivalent to 2 + 3 * 5. Er, make that (2 + 3) * 5. As long as we know the arity of each procedure, postfix notation requires no rules for the precedence of operators. (The same is true for prefix languages, as you have seen in Racket...)
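To see why arity alone is enough, here is a minimal Python sketch of a postfix evaluator. It is not how Joy is implemented; the function name eval_postfix and the small operator table are my own, and it assumes integer literals and three binary operators only.

```python
import operator

# Binary operators this sketch understands; Joy itself has many more.
BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_postfix(source):
    """Evaluate a whitespace-separated postfix program; return the final stack."""
    stack = []
    for token in source.split():
        if token in BINARY_OPS:
            right = stack.pop()   # the top of the stack is the right operand
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right))
        else:
            stack.append(int(token))  # assumption: anything else is an integer literal
    return stack

print(eval_postfix("2 3 + 5 *"))   # (2 + 3) * 5
print(eval_postfix("2 3 5 * +"))   # 2 + (3 * 5)
```

The two programs differ only in the order of their tokens. No parentheses and no precedence table are needed to tell them apart.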
As a result, we can compose programs simply by concatenating their source code:
: 16 sqrt 2 * .
8
sqrt 2 * is like a new program: two times the square root of its argument.
For this reason, the formal name for this programming style is concatenative programming. It turns out that the stack is really just an implementation detail -- and an especially convenient one.
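One way to picture "concatenation is composition" is to model every word as a function from a stack to a stack. The Python sketch below does that; the helper names (push, unary, binary, concat) are hypothetical, and Python lists stand in for the stack.

```python
import math
from functools import reduce

def push(value):
    """A literal is a function that pushes itself onto any stack."""
    return lambda stack: stack + [value]

def unary(fn):
    """Lift a one-argument Python function to a stack function."""
    return lambda stack: stack[:-1] + [fn(stack[-1])]

def binary(fn):
    """Lift a two-argument Python function to a stack function."""
    return lambda stack: stack[:-2] + [fn(stack[-2], stack[-1])]

def concat(*words):
    """Concatenating words composes their functions, left to right."""
    return lambda stack: reduce(lambda s, word: word(s), words, stack)

sqrt = unary(math.sqrt)
times = binary(lambda a, b: a * b)

twice_root = concat(sqrt, push(2), times)   # the program: sqrt 2 *
program = concat(push(16), twice_root)      # the program: 16 sqrt 2 *
print(program([]))
```

Because composition is associative, it does not matter how we group the pieces: concat(concat(push(16), sqrt), concat(push(2), times)) denotes the same function.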
To use sub-programs to compute partial results of a more complex operation, we concatenate the sub-programs and add the new operator. For example, two plus two times the square root of 16 is:
2 16 sqrt 2 * +
- ----------- -
^      ^      ^
|      |      |-- new operator
|      |
|      |-- second program
|
|-- first program
If we look at a program written in postfix notation, it seems that data flows from right to left, and the application of operators flows from left to right.
The programs above are legal programs in Joy, the canonical concatenative programming language. Joy is simple, as are most tools for it, but it has all the ideas we need to understand concatenative programming. Let's look at a few more Joy programs, to see how concatenative programmers think.
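In addition to number literals and arithmetic operators, Joy also has the booleans true and false, character literals written with a leading single quote (such as 'b), and string literals written in double quotes (such as "abcd").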
(Notice that Joy uses the single quote differently than Racket uses it.)
So, this program evaluates to true:
: "abcd" 1 at 'b equal.
true
So where is the stack? Let's examine this program step by step.
at is a two-argument function. It expects the top of the stack to be a zero-based index and the item below it to be a collection. It finds the item at the indexed position in the collection -- here, the character 'b in the string "abcd" -- and pushes it onto the stack.
Joy has list literals. Lists have zero-based indices and are written in square brackets. This program yields 3:
: [1 2 3 4 5] 2 at.
3
Because Joy is dynamically typed, its lists are heterogeneous. [1 'c 2 "abc" [1 2 3] 5] is a legal list.
: [1 'c 2 "abc" [1 2 3] 5] 4 at.
[1 2 3]
Joy has our old friend cons for adding things to lists. Of course, it works on two stack arguments:
: 5 [2] cons.
[5 2]
: 3 5 [2] cons.          # A Joy program returns whatever
[5 2]                    # is on top of the stack.
: 3 5 [2] cons cons.
[3 5 2]
If a value we want to put in a list gets pushed onto the stack on top of the list, we need to swap before consing:
: [2] 4 3 - swap cons.
[1 2]
: 3 5 [2] cons cons 1 swap cons.    # 'swap cons' is so common...
[1 3 5 2]
: 3 5 [2] cons cons 1 swons.        # ... there's a primitive!
[1 3 5 2]
The hash mark is a legal part of this Joy program. It indicates a comment that runs to the end of the line. The pair (* ... *) delimits a multi-line comment. (* I've always liked this syntax, from my early days writing Pascal programs. *)
The stack lies at the heart of Joy programming, so the language provides a wide array of primitive functions for manipulating the stack. In addition to swap, we can duplicate the item on top of the stack. The following yields the square of 4:
: 4 dup *.
16
In Joy, dup and swap are huge.
: 2 3 - .
-1
: 2 3 swap - .
1
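If it helps to watch the stack change, here is a small Python sketch of dup and swap. The functions are my own stand-ins for the Joy primitives, with a Python list as the stack (top of the stack at the end of the list); run prints the stack after every word.

```python
def dup(stack):
    """Copy the top item:  ... x  ->  ... x x"""
    return stack + [stack[-1]]

def swap(stack):
    """Exchange the top two items:  ... x y  ->  ... y x"""
    return stack[:-2] + [stack[-1], stack[-2]]

def mul(stack):
    return stack[:-2] + [stack[-2] * stack[-1]]

def sub(stack):
    return stack[:-2] + [stack[-2] - stack[-1]]

def run(words, stack):
    """Apply each word in turn, printing the stack after every step."""
    for word in words:
        stack = word(stack)
        print(stack)
    return stack

run([dup, mul], [4])        # 4 dup *
run([swap, sub], [2, 3])    # 2 3 swap -
```

Each word sees only the top of the stack. That is why the order of the items matters so much, and why swap earns its keep.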
Suppose you are given this stack:
Write a program to compute 5 squared minus 2 squared.
5 2 fill in the blank
Sadly, "square" is not a primitive function in Joy, so you'll need to dup and swap.
How about this:
: 5 2 swap dup * swap dup * -.
21
Yes, that looks weird. So do Python and Java and Racket -- until you get used to them!
We say that Racket and languages like it (every language you use?) are applicative. Think about a Racket expression: given (+ 2 3), the evaluator applies the + procedure to its arguments 2 and 3. The + is treated differently from the 2 and the 3. We can pass procedures as arguments, but they are treated differently then.
Concatenative programming uses function composition rather than function application. This is the defining difference between languages such as Joy and the languages most of us use on a daily basis.
sqrt sqrt # computes (sqrt (sqrt arg))
To compose sqrt with sqrt in Racket, we need a higher-order procedure:
(define compose (lambda (f g) (lambda (x) (f (g x)))))
In Joy, to compute f(g(2)), we write 2 g f. To compose f and g more generally, f·g, that is, f(g(x)), we write simply g f. (By now, you should understand immediately why this is written backwards from the way we usually write it in our infix/prefix world!)
In practical terms, writing code in a concatenative language is not all that different from programming in a functional language, except that there is less nesting of functions. Rather than writing:
(f0 (f1 (f2 ... (fn x) ...)))
we could write:
x fn ... f2 f1 f0
But under the hood, there is a big difference.
Racket is based on the lambda calculus, as are most functional programming languages. The lambda calculus is simple, yet it requires three kinds of term: variables, lambdas, and applications. It also requires several rules for replacing variable names with their values, as well as the concepts of bindings, closures, and scope. This is quite a bit of complexity.
Concatenative languages have a much simpler core. They require only functions and compositions. We don't even need an evaluation rule, because evaluation is just the composition of functions. It never has to deal with named state, so there are no variables. Without variables, there is no mutation. This means that concatenative languages are in a certain sense more functional than the languages we usually call "functional"!
I just said, "There are only functions and compositions". But wait. Recall this program from above:
2 16 sqrt 2 * +
- ----------- -
The + is easy enough to fit into the system. But what about the 2? Or the 16 sqrt 2 * part?
All three parts are programs. Everything is a function, being composed by concatenation. 2 is actually a constant function that takes no arguments from the stack:
2           === (lambda () 2)                  # I'm mixing Racket
16 sqrt 2 * === (lambda () (* (sqrt 16) 2))    # and Joy syntax...
This approach has a different sort of unity of representation that leads to a need for a new kind of data type for functions.
It turns out that the [ ] we saw in our lists above is more than just list syntax. We now know that 1, 2, and 3 are programs (functions) themselves. The list [1 2 3] is a quoted program. (Think about how lists and quotes and programs relate in Racket.)
We saw earlier that 4 dup * evaluates to 16.
The list [4 dup *] is a program. Actually, we say that it is a value representing a program. It can be pushed onto the stack and popped from the stack by another function and used as a value. This corresponds to a function being a "first-class value" in Racket, where we can pass around procedures as data and evaluate them whenever we want.
A quoted program is evaluated by another function, which in Joy is called a combinator. In Racket, we called such things higher-order procedures (and 'combinator' meant something else). A Joy combinator takes one or more quoted programs as input and uses them to create new functions. This is how most programming in Joy works.
The combinator i unquotes a quoted program, like this:
: [4 dup *] i.
16
Notice that unquoting a quoted program is equivalent to evaluating it!
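To make the idea of "a value representing a program" concrete, here is a rough Python sketch of a tiny evaluator that understands quotations and i. It is not one of the Joy interpreters in today's code; the names parse, run, and joy are hypothetical, and it supports only integers, dup, *, and i.

```python
def parse(tokens):
    """Turn a token list into nested Python lists, one per [ ... ] quotation."""
    program = []
    while tokens:
        token = tokens.pop(0)
        if token == "[":
            program.append(parse(tokens))
        elif token == "]":
            return program
        else:
            program.append(token)
    return program

def run(program, stack):
    for term in program:
        if isinstance(term, list):
            stack.append(term)          # a quotation is pushed as a value, unevaluated
        elif term == "dup":
            stack.append(stack[-1])
        elif term == "*":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif term == "i":
            run(stack.pop(), stack)     # unquote: run the program on top of the stack
        else:
            stack.append(int(term))     # assumption: anything else is an integer
    return stack

def joy(source):
    tokens = source.replace("[", " [ ").replace("]", " ] ").split()
    return run(parse(tokens), [])

print(joy("[4 dup *]"))     # the quotation sits on the stack as data
print(joy("[4 dup *] i"))   # i evaluates it
```

Notice that run treats a quotation and a top-level program in exactly the same way. The i branch simply calls run again.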
Joy has other combinators, many of which are higher-order functions that you will recognize from Racket, such as map, filter, and fold. For example, map takes a quoted program and applies it to each element of a list:
: [1 2 3 4] [dup *] map.
[1 4 9 16]
What about control structures? They, too, are implemented as combinators. Consider the standard if-then-else statement, which in Joy is the ifte operator. It requires that the stack contain three quoted programs: the test, the then block, and the else block.
In this example, the test operates on a number, so the item on the stack just below the test is expected to be a number. The ifte function applies the test program to the number next on the stack. If it is greater than 1000, it applies the then block, thus halving the number; otherwise, it applies the else block, thus tripling it.
: 1500 [1000 >] [2 /] [3 *] ifte.
750
: 150 [1000 >] [2 /] [3 *] ifte.
450
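Here is one way to sketch ifte in Python. To keep it short, the three quoted programs are passed to a Python function as lists of stack functions rather than being popped from the stack, and I assume that the test's working stack is thrown away once its boolean result has been read. All of the helper names are my own.

```python
def push(value):
    return lambda stack: stack + [value]

def binary(fn):
    return lambda stack: stack[:-2] + [fn(stack[-2], stack[-1])]

def run(quotation, stack):
    """Run a quoted program (a list of words) against a stack."""
    for word in quotation:
        stack = word(stack)
    return stack

def ifte(test, then_branch, else_branch):
    """Build the word for: [test] [then] [else] ifte."""
    def word(stack):
        # Only the test's top value is kept; the branch runs on the original stack.
        outcome = run(test, stack)[-1]
        return run(then_branch if outcome else else_branch, stack)
    return word

greater = binary(lambda a, b: a > b)
divide = binary(lambda a, b: a // b)   # assumption: integer division on integers
times = binary(lambda a, b: a * b)

halve_or_triple = ifte([push(1000), greater], [push(2), divide], [push(3), times])
print(halve_or_triple([1500]))
print(halve_or_triple([150]))
```

The control structure is just another function from stack to stack, so it composes with everything else in the usual way.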
It takes a while to get used to the postfix syntax, especially for quoted programs that look like curried infix expressions. But, as we saw with Racket and its prefix notation, once you become familiar, you find that the notation offers a lot of flexibility.
Though Joy is dynamically typed, its functions do have expectations of the arguments. Most combinators require that the stack contain values of specific types.
Consider filter, another familiar higher-order function. It expects the stack to contain an aggregate data value (such as a list) and a quoted program. The program must return true or false. filter produces a new aggregate of the same type (list, set, string,...) that contains only the elements of the original that "pass the test".
In this example, the quoted program ['Z >] returns true for characters whose ASCII values are greater than 'Z's ASCII value. So we can use it and filter to remove uppercase characters, blanks, and many other special characters from a string:
: "John Smith" ['Z >] filter.
"ohnmith"
fold enables us to reduce a list by combining its members using an operator such as addition or multiplication. It requires three arguments on the stack: the list to be reduced, an initial value (which is also the result when the list is empty), and a quoted program that operates on two values.
This program computes the sum of the numbers in a list:
: [2 5 3] 0 [+] fold.
10
This program passes a more complex binary operator. It computes the sum of the squares:
: [2 5 3] 0 [dup * +] fold.
38
See the power of concatenation: To square the value before folding, we simply insert dup * into the quoted program!
What about a computation that requires doing two things with each member of the list, such as computing the average? One way to do this is to do the summing and counting in "two passes": duplicate the list, sum the top version, swap, take the size of the second, and then divide:
dup  0 [+] fold   swap size     /
     ------------ -------------
copy sum the 1st  count the 2nd divide

# find the average of a list
: [7 17 12] dup 0 [+] fold swap size /.
12
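The fold examples above, including the two-pass average, can be mimicked in a few lines of Python. This is only a sketch of the behavior as these notes describe it: the step argument is a Python function from stack to stack that stands in for a quoted program, and the names fold, add, square_then_add, and average are hypothetical.

```python
def fold(items, initial, step):
    """[items] initial [step] fold: thread an accumulator through the list."""
    stack = [initial]
    for item in items:
        stack = step(stack + [item])   # push the next element, then run the quotation
    return stack[-1]

def add(stack):                        # the quotation [+]
    return stack[:-2] + [stack[-2] + stack[-1]]

def square_then_add(stack):            # the quotation [dup * +]
    top = stack[-1]
    return add(stack[:-1] + [top * top])

def average(items):
    """dup 0 [+] fold swap size /  (assumption: integer division on integers)"""
    return fold(items, 0, add) // len(items)

print(fold([2, 5, 3], 0, add))
print(fold([2, 5, 3], 0, square_then_add))
print(average([7, 17, 12]))
```

With an empty list, fold never runs the step function and simply returns the initial value.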
Some combinators make sense only in Joy. For example, there is a step combinator that accesses all the elements of an aggregate in sequence. Joy also has a set of recursive combinators that produce recursive functions. For example, primrec does counted recursion on a number:
: 5 [1] [*] primrec
  -----------------    # factorial
120
As crazy as this looks, there is theory to support how datatypes work in a concatenative language like Joy. If you are interested, take a look at the discussion of row polymorphism at the end of these notes.
Time is short, so let's cut our dive into this part of Joy short here as well.
As we wrote programs above, we thought about composition from left to right: 2 3 +. Can we think about it from right to left? Yes! The result is curried functions. 3 + is a one-argument function that adds 3 to a number.
We can pull out any piece of code from a program, name it, and substitute the name back in. As a result, refactoring is straightforward. That's one of the great advantages of concatenative languages. Something called "row polymorphism" gives us a way to compute the type of any program quote, but keep in mind that argument types matter. Functions care about what's on top of the stack!
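Without any restrictions on the associativity of function application and composition, many possibilities arise. For example, compiling programs in a concatenative language can be as simple as dividing the program into arbitrary segments, compiling every segment in parallel, and composing the compiled segments at the end.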
Thus a parallel compiler for a concatenative language is plain old map-reduce! As far as I know, this is impossible to do with any other kind of language.
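A toy version of the map-reduce idea, sketched in Python under some loud assumptions: "compiling" a segment just means turning its tokens into one stack function, the program is flat (no quotations, so any split point is legal), and the map step runs sequentially here even though each segment could be handled independently. The word table and function names are my own.

```python
from functools import reduce

WORDS = {
    "dup": lambda s: s + [s[-1]],
    "swap": lambda s: s[:-2] + [s[-1], s[-2]],
    "*": lambda s: s[:-2] + [s[-2] * s[-1]],
    "-": lambda s: s[:-2] + [s[-2] - s[-1]],
}

def compile_word(token):
    """Compile one token to a stack function; unknown tokens are assumed to be integers."""
    if token in WORDS:
        return WORDS[token]
    value = int(token)
    return lambda s: s + [value]

def compose(first, second):
    return lambda s: second(first(s))

def compile_segment(tokens):
    """Compile a run of tokens into a single stack function."""
    return reduce(compose, map(compile_word, tokens), lambda s: s)

def compile_program(source, segment_size):
    tokens = source.split()
    segments = [tokens[i:i + segment_size] for i in range(0, len(tokens), segment_size)]
    compiled = map(compile_segment, segments)       # "map": segments are independent
    return reduce(compose, compiled, lambda s: s)   # "reduce": compose the pieces

source = "5 2 swap dup * swap dup * -"
for size in (1, 2, 3, 4, 10):
    print(size, compile_program(source, size)([]))
```

Every segment size produces the same compiled function, because composition is associative. Where we cut the program does not matter.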
In Joy, as at the Unix command line, the program is really about the flow of data from the beginning of the program to the end. That's why the postfix notation we use to compose concatenated functions makes sense. The 'reverse' order is actually data flowing forward:
+---+
| 2 |
+---+
  |
  |   +---+
  |   | 3 |
  |   +---+
  |     |
  V     V
+---------+
|    *    |
+---------+
     |
     V
Translating a standard Java infix expression to postfix is as simple as doing a post-order traversal of the abstract syntax tree. The JVM then executes the program on a stack machine! Most language VMs these days are essentially concatenative. Even the x86 architecture relies heavily on a stack for managing local state (recall our discussion of block structure and our use of a stack for variable declarations and bindings?), so even C compilers make use of a dose of concatenative style.
Will Joy be the story of 2025 or 2035? I doubt it, but concatenative programming may be. For a more likely candidate, check out Factor. It aspires to be a more complete tool for systems programming and has some interesting features, including static type checking and modules. Though it is still young, Factor has good cross-platform support, an IDE with a modern feel, and a growing open-source community.
Whether Joy, Factor, or a language that has not been invented yet, it will probably sound something like this:
[swap dip dup dip pop] dip dup dip pop
Let's hope the language of the future reads better than it sings.
The main Joy website includes source code for the canonical implementation of the language as well as documentation and tutorials.
For this session, I used this blog entry by a Joy newbie for several examples.
Today's code file contains two Joy interpreters.
When you run the executable, you won't see a prompt, but it is a REPL. Don't forget the period:
5 2 swap dup * swap dup * -.
21
This implementation is only a start, and is missing a few key language features, such as DEFINE. However, I wrote it in the style of our other CS 3540 interpreters, which means you should be able to follow the code, or even modify or extend it. Don't forget to parenthesize the input to the REPL:
joy: (5 2 swap dup * swap dup * -)
21
There are several good tutorials on concatenative languages out on the web, most of which are compatible with my discussion of Joy here. For this lecture, I used Why Concatenative Programming Matters for several examples. The author has links to a programming language of his own design, too. This tutorial goes deep quickly, so don't worry if you can't follow the whole discussion.
The images in my opening slideshow come from the following sources:
I almost used this XKCD and this slide about FP. If only we had more time...™.
As crazy as this looks, there is theory to support how concatenative languages work. How?
Recall: 2 is a function that takes no inputs and returns one integer, itself. The more common * is a function that takes two integer inputs and returns one integer:
2 :: () → (int)
3 :: () → (int)
* :: (int, int) → (int)
But, as written, we cannot compose these functions. The range of 2 does not match the domain of 3. The range of 2 · 3, whatever that is, does not match the domain of *.
The basic idea is to give every function a generic data type that accounts for the full stack made available to it. Each function can be thought of as taking any input stack, as long as it is "topped" with the values it actually needs. The function then returns the arguments it doesn't use back to the stack, followed by its actual return values, which are pushed onto the top of the stack.
2 :: (A) → (A, int)
3 :: (B) → (B, int)
* :: (C, int, int) → (C, int)
Now we can match B = (A, int) and compose 2 with 3:
2 3 :: (A) → (A, int, int)
This is quite nice! The meaning of 2 3 is clear. It is a function that takes no input and returns both 2 and 3. Joy thus has functions that return multiple values!
Then we match C = A in the result and compose 2 3 with *:
2 3   :: (A) → (A, int, int)
*     :: (C, int, int) → (C, int)
---------------------------------
2 3 * :: (A) → (A, int)
It works! The program 2 3 * takes no inputs from the stack and produces one integer. Note, too, that this is the same as the type of 6, which is the value computed by 2 3 *.
6 :: (A) → (A, int)
This new approach to typing, called row polymorphism, does just what we need. Thanks to row polymorphism, we have a uniform way to compose functions of different types, in a way that makes the flow of data through a program clear. As a bonus, concatenative languages are able to give us something that applicative functional languages usually don't: functions that truly return multiple values.
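The matching of stack types can be approximated by simple bookkeeping. The Python sketch below is a deliberate simplification: it ignores the element types entirely and tracks only how many items a program consumes from the stack and how many it leaves behind. The function compose_effects and the constants are hypothetical names, not part of any Joy tool.

```python
def compose_effects(first, second):
    """Combine two stack effects, each written as (items consumed, items produced)."""
    first_in, first_out = first
    second_in, second_out = second
    if first_out >= second_in:
        # The second word finds everything it needs among the first word's results.
        return (first_in, first_out - second_in + second_out)
    # Otherwise the shortfall must already be on the stack before the first word runs.
    return (first_in + second_in - first_out, second_out)

LITERAL = (0, 1)   # 2 :: (A) → (A, int)
TIMES = (2, 1)     # * :: (C, int, int) → (C, int)

two_three = compose_effects(LITERAL, LITERAL)
print(two_three)                            # effect of: 2 3
print(compose_effects(two_three, TIMES))    # effect of: 2 3 *
print(compose_effects(LITERAL, TIMES))      # effect of: 3 *
```

The last line shows the curried function from earlier in these notes: 3 * consumes one item and produces one. Grouping the program as 2 (3 *) instead of (2 3) * gives the same overall effect.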