Session 28

The Next Big Thing?

CS 3540
Programming Languages and Paradigms

The Future, Then and Now

[ accompanying slides ]

The year was 1996. Bill Clinton was elected to a second term as president. The Macarena dance craze swept the country. Most of you were... toddlers.

That year, we redesigned this course to achieve two goals. The first was to teach a set of principles about programming languages. These principles had to lay the foundation for students who would be working in industry until 2040. The second was to introduce students to functional programming, which had no required spot in our curriculum.

Many CS faculty thought the idea of functional programming was cool, though they never used it themselves. Good hygiene, they say. Exercise for the mind. But no one uses this stuff in industry, right?

For some of us, though, the functional programming part of the course had practical goals as well as theoretical. We believed that it would come to industry. Students learned a lot in the course, but they sometimes wondered why they were learning it...

Jump forward to 2003. Java programmers are building big OO systems for "the enterprise". People begin to talk about the Value Object pattern and immutable objects. We begin to see the rise of functional style in an OO language.

Jump forward a decade, to 2013. Scala, Clojure, and even a little Haskell are being used in production code across the country. Serious stuff: data processing, web services, banks. Cedar Falls, Iowa, not (just) Silicon Valley.

Now it is 2016. Barack Obama is serving his second term as president. We have all survived Gangnam Style. Most of you are... grown-ups.

Functional programming was a story for the future in 1996. The course has evolved a bit since then, but it still focuses on laying a solid foundation. Those of you in this class may be working in industry until 2060. But now FP isn't the future; it is the present, in new languages such as Swift, in recent languages such as Scala and Clojure, and in updates to mainstream languages such as Java, C++, and JavaScript.

What's the story of 2026? Or 2036? I can make only an educated guess. It may sound something like this:

[swap dip dup dip pop] dip dup dip pop

(Who knew the future of programming would sound like a 1940s jazz scat?)

A Different Road

In addition to stretching your mind with Racket's prefix notation, we've occasionally talked about postfix notation this semester. Consider this little postfix interpreter:

    > (postfix 2)
    '(2)

    > (postfix 2 3 +)
    '(5)

These expressions are programs in what is sometimes called a stack-based language. Postfix notation, also called Reverse Polish notation, is the first thing that conventional programmers notice when working in a stack-based language. It corresponds to a postorder traversal of a program tree.

We can write longer programs, too:

    > (postfix 2 3 + 5 *)
    '(25)

This program is equivalent to 2 + 3 * 5. Er, make that (2 + 3) * 5. As long as we know the arity of each procedure, postfix notation requires no rules for the precedence of operators.
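That evaluation rule is easy to mechanize. Here is a minimal sketch of such an interpreter in Python rather than Racket (the `postfix` name and the list-as-stack representation mirror my interpreter above; the rest is illustrative, not the actual implementation):

```python
def postfix(*tokens):
    """Evaluate a postfix program and return the final data stack.

    Numbers are pushed; an operator pops its two arguments and
    pushes the result. The stack is returned top-first, the way
    the interpreter above prints it.
    """
    ops = {'+': lambda x, y: x + y,
           '-': lambda x, y: x - y,
           '*': lambda x, y: x * y,
           '/': lambda x, y: x / y}
    stack = []
    for token in tokens:
        if token in ops:
            y = stack.pop()          # top of stack is the second argument
            x = stack.pop()
            stack.append(ops[token](x, y))
        else:
            stack.append(token)
    return list(reversed(stack))     # top of stack first

print(postfix(2, 3, '+', 5, '*'))             # → [25]
print(postfix(2, 3, '+', 5, '*', 6, 3, '/'))  # → [2.0, 25] (Python / yields a float)
```

Notice that the loop never consults a precedence table: each operator simply pops exactly the arguments it needs.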

My little interpreter is returning the state of its data stack. The parentheses expose that I've implemented my stack using a Racket list. Our stack could finish with more than one value on it:

    > (postfix 2 3 + 5 * 6 3 /)
    '(2 25)

Adding an operator to the end of our program can return us to a single-value stack:

    > (postfix 2 3 + 5 * 6 3 / -)
    '(23)

This points out a more general feature of stack programs: we can compose programs simply by concatenating their source code:

    > (postfix 2 3 +)
    '(5)

    > (postfix 2 3 + 2 *)
    '(10)

2 * acts like a new program: "double my argument" -- though there are no explicit arguments. All programs read their inputs from the stack and leave their results there.

For this reason, another name for this programming style is concatenative programming. It turns out that the stack is really just an implementation detail -- and an especially convenient one.
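One way to see that the stack really is just an implementation detail: treat every word as a function from stacks to stacks. Then concatenating programs is literally composing functions. A sketch in Python (the names `push`, `add`, `times`, and `run` are mine, chosen to echo the examples above):

```python
# Each word is a function from stack to stack; a program is the
# left-to-right composition of its words.

def push(n):                      # even a literal is a function
    return lambda stack: stack + [n]

def add(stack):
    return stack[:-2] + [stack[-2] + stack[-1]]

def times(stack):
    return stack[:-2] + [stack[-2] * stack[-1]]

def run(program, stack=None):
    """Running a program = composing its words in order."""
    stack = [] if stack is None else stack
    for word in program:
        stack = word(stack)
    return stack

prog1 = [push(2), push(3), add]   # 2 3 +
prog2 = [push(2), times]          # 2 *

# Composing the two programs is just concatenating the lists:
print(run(prog1 + prog2))         # → [10]
```

`prog1 + prog2` behaves exactly like the program 2 3 + 2 *: composition by concatenation.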

An Early Concatenative Language: Forth

The earliest deep exploration into the idea of stack-based languages with concatenative programs was Forth. Chuck Moore designed the first version of Forth in 1968 and soon came to use it for programming embedded microcontrollers, which is where the language earned its initial acclaim.

I have implemented a small Forth interpreter for us to use in exploring more features of concatenative programming. It's implemented in the style you have used on Homeworks 10 and 11, so let's take a look:

As we saw above, if we have two sub-programs that compute partial results, we can concatenate them to compute a compound result. For example, two plus (two times the square root of 16) is:

    16 sqrt 2 *   2 +
    -----------   ---
    ^             ^
    |             |-- second program
    |-- first program

If we look at a program written in postfix notation, it seems that data flows from right to left, and the application of operators flows from left to right.

With the stack playing such a central role as the source and store of values during program execution, you may not be surprised that concatenative languages provide a number of stack-manipulating operators. Let's take a look at a few...
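To make the stack discipline concrete before we do, here is a sketch of the three most common shufflers in Python, with the end of a list playing the role of the top of the stack (the names mirror the usual Forth/Joy words; the implementations are mine, for illustration only):

```python
# The classic stack-shuffling words, sketched on a Python list
# whose last element is the top of the stack.

def swap(stack):                 # ... x y  ->  ... y x
    return stack[:-2] + [stack[-1], stack[-2]]

def dup(stack):                  # ... x    ->  ... x x
    return stack + [stack[-1]]

def pop(stack):                  # ... x    ->  ...
    return stack[:-1]

print(swap([5, 2]))   # → [2, 5]
print(dup([4]))       # → [4, 4]
print(pop([1, 2]))    # → [1]
```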

A Demo of Concatenative Programming: Joy

The programs above are legal programs in Joy, a concatenative language developed and used to explore these ideas. Joy is simple, as are most tools for it, but it has all the ideas we need to understand concatenative programming. Let's look at a few more Joy programs, to see how concatenative programmers think.

In class today, we'll use the standard interpreter written by Manfred von Thun, Joy's creator. See below for a few details on how you can build and run the interpreter on your own. (Note: The : in the examples below is there in lieu of a prompt. von Thun's interpreter doesn't actually give us a prompt!)

In addition to number literals and arithmetic operators, Joy also has string literals, written in double quotes, and character literals, written with a leading single quote.

(Notice that Joy uses the single quote differently than Racket uses it.)

So, this program evaluates to true:

    : "abcd" 1 at 'b equal.

So where is the stack? Let's examine this program step by step. "abcd" pushes the string onto the stack, and 1 pushes the index 1. at pops both and pushes the character at index 1, which is 'b. Then 'b pushes a second copy of that character, and equal pops the two characters and pushes true.

Joy has list literals. Lists have zero-based indices and are written in square brackets. This program yields 3:

    : [1 2 3 4 5] 2 at.

Because Joy is dynamically typed, its lists are heterogeneous. [1 'c 2 "abc" [1 2 3] 5] is a legal list.

    : [1 'c 2 "abc" [1 2 3] 5] 4 at.
    [1 2 3]

Joy has our old friend cons for adding things to lists. Of course, it works on two stack arguments:

    : 5 [2] cons.
    [5 2]

    : 3 5 [2] cons.             # A Joy program returns whatever
    [5 2]                       # is on top of the stack.

    : 3 5 [2] cons cons.
    [3 5 2]

If a value we want to put in a list gets pushed onto the stack on top of the list, we need to swap before consing:

    : [2] 4 3 - swap cons.
    [1 2]

    : 3 5 [2] cons cons 1 swap cons.   # 'swap cons' is so common...
    [1 3 5 2]

    : 3 5 [2] cons cons 1 swons.       # ... there's a primitive!
    [1 3 5 2]

The hash mark is a legal part of this Joy program. It indicates a comment that runs to the end of the line. The pair (* ... *) indicates a multi-line comment. (* I've always liked this syntax, from my early days writing Pascal programs. *)

The stack lies at the heart of Joy programming, so the language provides a wide array of primitive functions for manipulating the stack. In addition to swap, we can duplicate the item on top of the stack. The following yields the square of 4:

    : 4 dup *.
    16

In Joy, dup and swap are huge.

    : 2 3 - .
    -1

    : 2 3 swap - .
    1

Quick Exercise

Suppose you are given this stack:

    5 2

Write a program to compute 5 squared, minus 2.

    5 2 fill in the blank

Sadly, "square" is not a primitive function in Joy, so you'll need to dup and swap.


How about this:

    : 5 2 swap dup * swap -.
    23

Yes, that looks weird. So do Python and Java and Racket -- until you get used to them!

Joy's DEFINE lets us extend the language by defining new "words":

    : DEFINE inc == 1 +; dec == 1 -.
    : 2 inc 9 dec dup * +.

    : DEFINE square   ==   dup  *.
    : 2 square.

    : 5 2 swap square swap -.

Programs, Big and Small

We say that Racket and languages like it (every language you use?) are applicative. Think about a Racket expression: in (+ 2 3), the evaluator applies the + procedure to its arguments 2 and 3. The + is treated differently from the 2 and the 3. We can pass procedures as arguments, but even then they are treated differently from the procedure being applied.

Concatenative programming uses function composition rather than function application. This is the defining difference between languages such as Joy and the languages most of us use on a daily basis.

    sqrt sqrt     # computes (sqrt (sqrt arg))

To compose sqrt with sqrt in Racket, we need a higher-order procedure:

    (define compose
      (lambda (f g)
        (lambda (x)
          (f (g x)))))

In Joy, to compute f(g(2)), we write 2 g f. To compose f and g more generally, f·g, that is, f(g(x)), we write simply g f. (By now, you should understand immediately why this is written backwards from the way we usually write it in our infix/prefix world!)

In practical terms, writing code in a concatenative language is not all that different from programming in a functional language, except that there is less nesting of functions. Rather than writing:

     (f0 (f1 (f2 ... (fn x) ...)))
we could write:
     x fn ... f2 f1 f0
But under the hood, there is a big difference.
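The same contrast shows up in Python as nested calls versus a flat, left-to-right pipeline folded up with functools.reduce (f0, f1, and f2 here are placeholder functions of my own, not anything from the text):

```python
from functools import reduce

def f2(x): return x + 1
def f1(x): return x * 2
def f0(x): return x - 3

# Applicative style: nested calls, read inside-out.
nested = f0(f1(f2(10)))

# Concatenative style: a flat pipeline, read left to right.
def pipeline(x, *fs):
    return reduce(lambda acc, f: f(acc), fs, x)

flat = pipeline(10, f2, f1, f0)

print(nested, flat)   # → 19 19
```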

Racket is based on the lambda calculus, as are most functional programming languages. The lambda calculus is simple, yet it requires three kinds of term: variables, lambdas, and applications. It also requires several rules for replacing variable names with their values, as well as the concepts of bindings, closures, and scope. This is quite a bit of complexity.

Concatenative languages have a much simpler core. They require only functions and compositions. We don't even need an evaluation rule, because evaluation is just the composition of functions. It never has to deal with named state, so there are no variables. Without variables, there is no mutation. This means that concatenative languages are in a certain sense more functional than the languages we usually call "functional"!

I just said, "There are only functions and compositions". But wait. Recall this program from above:

    2   16 sqrt 2 *   +
    -   -----------   -

The + is easy enough to fit into the system. But what about the 2? Or the 16 sqrt 2 * part?

All three parts are programs. Everything is a function, being composed by concatenation. 2 is actually a constant function that takes no arguments from the stack:

               2 === (lambda () 2)                  # I'm mixing Racket
     16 sqrt 2 * === (lambda () (* (sqrt 16) 2))    # and Joy syntax...

This approach gives us a different sort of unity of representation, and it leads to the need for a new kind of data value for functions.

A Little More Demo

It turns out that the [] we saw in our lists above is more than just list syntax. We now know that 1, 2, and 3 are programs (functions) themselves. The list [1 2 3] is a quoted program. (Think about how lists and quotes and programs relate in Racket.)

We saw earlier that 4 dup * computes the square of 4.

The list [4 dup *] is a program. Actually, we say that it is a value representing a program. It can be pushed onto the stack and popped from the stack by another function and used as a value. This corresponds to a function being a "first-class value" in Racket, where we can pass around procedures as data and evaluate them whenever we want.

A quoted program is evaluated by another function, which in Joy is called a combinator. In Racket, we called such things higher-order procedures (and 'combinator' meant something else). A Joy combinator takes one or more quoted programs as input and uses them to create new functions. This is how most programming in Joy works.

The combinator i unquotes a quoted program, like this:

    : [4 dup *] i.
    16

Notice that unquoting a quoted program is equivalent to evaluating it!
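If we again model words as stack-to-stack functions in Python, a quoted program is just a list of such functions sitting on the stack as data, and i takes only a few lines (a sketch; the names are mine):

```python
# A quoted program is just data: a list of stack -> stack functions.
# The combinator i pops a quotation off the stack and executes it.

def push(v):
    return lambda stack: stack + [v]

def dup(stack):
    return stack + [stack[-1]]

def times(stack):
    return stack[:-2] + [stack[-2] * stack[-1]]

def i(stack):
    quotation = stack[-1]          # a value representing a program
    stack = stack[:-1]
    for word in quotation:         # unquoting = evaluating
        stack = word(stack)
    return stack

def run(program, stack=None):
    stack = [] if stack is None else stack
    for word in program:
        stack = word(stack)
    return stack

# [4 dup *] i  ==>  16
print(run([push([push(4), dup, times]), i]))   # → [16]
```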

Joy has other combinators, many of them higher-order functions you will recognize from Racket, such as map, filter, and fold. For example, map takes a quoted program and applies it to each element of a list:

    : [1 2 3 4] [dup *] map.
    [1 4 9 16]

What about control structures? They, too, are implemented as combinators. Consider the standard if-then-else statement, which in Joy is the ifte operator. It requires that the stack contain three quoted programs: the test, the then block, and the else block.

In this example, the test operates on a number, so the item on the stack just below the test is expected to be a number. The ifte function applies the test program to the number next on the stack. If it is greater than 1000, it applies the then block, thus halving the number; otherwise, it applies the else block, thus tripling it.

    : 1500 [1000 >]  [2 /]  [3 *]  ifte.
    750

    : 150 [1000 >]  [2 /]  [3 *]  ifte.
    450
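In the same stack-functions-in-Python sketch, ifte pops the three quotations, runs the test on a copy of the stack, and then runs whichever branch the test selects. Here half and triple stand in for the quoted programs 2 / and 3 * (all names are mine, for illustration):

```python
# ifte as a combinator: the stack holds a value and three quotations
# (test, then-branch, else-branch), each a list of stack functions.

def run(quotation, stack):
    for word in quotation:
        stack = word(stack)
    return stack

def push(v):
    return lambda stack: stack + [v]

def gt(stack):                    # x y >  pushes (x > y)
    return stack[:-2] + [stack[-2] > stack[-1]]

def half(stack):                  # stands in for the quotation 2 /
    return stack[:-1] + [stack[-1] // 2]

def triple(stack):                # stands in for the quotation 3 *
    return stack[:-1] + [stack[-1] * 3]

def ifte(stack):
    test, then_q, else_q = stack[-3], stack[-2], stack[-1]
    stack = stack[:-3]
    ok = run(test, list(stack))[-1]    # run the test on a copy
    return run(then_q if ok else else_q, stack)

# 1500 [1000 >] [2 /] [3 *] ifte  ==>  750
prog = [push(1500), push([push(1000), gt]), push([half]), push([triple]), ifte]
stack = []
for word in prog:
    stack = word(stack)
print(stack)    # → [750]
```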

It takes a while to get used to the postfix syntax, especially for quoted programs that look like curried infix expressions. But, as we saw with Racket and its prefix notation, once you become familiar with it, you see that the notation offers a lot of flexibility.

Though Joy is dynamically typed, its functions do have expectations about their arguments. Most combinators require that the stack contain values of specific types.

Consider filter, another familiar higher-order function. It expects the stack to contain an aggregate data value (such as a list) and a quoted program. The program must return true or false. filter produces a new aggregate of the same type (list, set, string,...) that contains only the elements of the original that "pass the test".

In this example, the quoted program ['Z >] returns true for characters whose ASCII values are greater than 'Z's. So we can use it with filter to remove uppercase characters, blanks, and many other special characters from a string:

    : "John Smith" ['Z >] filter.
    "ohnmith"

fold enables us to reduce a list by combining its members using an operator such as addition or multiplication. It requires three arguments on the stack: the list to be reduced, an initial value for the accumulation (and the result when the list is empty), and a quoted program that combines two values.

This program computes the sum of the numbers in a list:

    : [2 5 3]  0  [+]  fold.
    10

This program passes a more complex binary operator. It computes the sum of the squares:

    : [2 5 3]  0  [dup * +]  fold.
    38

See the power of concatenation: to square each value before adding it, we simply insert dup * into the quoted program!

What about a computation that requires doing two things with each member of the list, such as computing the average? One way to do this is to do the summing and counting in "two passes": duplicate the list, sum the top copy, swap, take the size of the second, and then divide:

        dup  0  [+]  fold  swap  size            /
             ------------        -------------
       copy  sum the 1st         count the 2nd   divide

    # find the average of a list

    : [7 17 12]   dup  0  [+]  fold  swap  size  /.
    12
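The fold examples, including the two-pass average, translate almost word for word into Python's functools.reduce, with len standing in for Joy's size (a sketch for comparison only):

```python
from functools import reduce

data = [7, 17, 12]

# [2 5 3] 0 [+] fold        ==>  the sum
total = reduce(lambda acc, x: acc + x, [2, 5, 3], 0)          # → 10

# [2 5 3] 0 [dup * +] fold  ==>  the sum of squares
squares = reduce(lambda acc, x: acc + x * x, [2, 5, 3], 0)    # → 38

# dup 0 [+] fold swap size /  ==>  the average
average = reduce(lambda acc, x: acc + x, data, 0) / len(data)

print(total, squares, average)   # → 10 38 12.0
```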

Some combinators make sense only in Joy. For example, there is a step combinator that accesses all the elements of an aggregate in sequence. Joy also has a set of recursive combinators that produce recursive functions. For example, primrec does counted recursion on a number:

    : 5  [1]  [*]  primrec.
         ------------------  # factorial
    120

    : DEFINE factorial == [1] [*] primrec.
    : 5 factorial.

As crazy as this looks, there is theory to support how datatypes work in a concatenative language like Joy. If you are interested, take a look.

Time is short, so let's end our dive into this part of Joy here.

Some Cool Stuff

As we wrote programs above, we thought about composition from left to right: 2 3 +. Can we think about it from right to left? Yes! The result is curried functions. 3 + is a one-argument function that adds 3 to a number.

We can pull out any piece of code from a program, name it, and substitute the name back in. As a result, refactoring is straightforward. That's one of the great advantages of concatenative languages. Something called "row polymorphism" gives us a way to compute the type of any program quote, but keep in mind that argument types matter. Functions care about what's on top of the stack!

Because function composition is associative, we can split a program anywhere we like, and many possibilities arise. For example, compiling programs in a concatenative language can be as simple as:

  1. Divide the program into segments of any size.
  2. Compile each segment in parallel.
  3. Compose the resulting segments into the final result.

Thus a parallel compiler for a concatenative language is plain old map-reduce! As far as I know, this is impossible to do with any other kind of language.
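A toy version of this recipe in Python: because composition is associative, each segment of a token list can be "compiled" independently into a single stack function, and the compiled pieces then compose to the same result no matter where we cut the program. (This sketch runs the segments sequentially, but each compile_segment call could run in parallel; all names here are mine.)

```python
from functools import reduce

def compile_segment(tokens):
    """'Compile' a token segment into one stack -> stack function."""
    ops = {'+': lambda s: s[:-2] + [s[-2] + s[-1]],
           '*': lambda s: s[:-2] + [s[-2] * s[-1]]}
    words = [ops[t] if t in ops else (lambda s, v=t: s + [v]) for t in tokens]
    return lambda stack: reduce(lambda s, w: w(s), words, stack)

program = [16, 4, '+', 2, '*']        # (16 + 4) * 2

# 1. Divide the program into segments of any size.
segments = [program[:2], program[2:4], program[4:]]
# 2. Compile each segment (this step could be done in parallel).
compiled = [compile_segment(seg) for seg in segments]
# 3. Compose the resulting segments into the final result.
whole = lambda stack: reduce(lambda s, f: f(s), compiled, stack)

print(whole([]))                       # → [40]
print(compile_segment(program)([]))    # same answer, compiled in one piece
```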

Concatenative Programming in the World

Will Joy be the story of 2026 or 2036? I doubt it, but concatenative programming may be. For a more likely candidate, check out Factor. It aspires to be a more complete tool for systems programming and has some interesting features, including static type checking and modules. Though it is still young, Factor has good cross-platform support, an IDE with a modern feel, and a growing open-source community.

Whether Joy, Factor, or a language that has not been invented yet, it will probably sound something like this:

[swap dip dup dip pop] dip dup dip pop

Let's hope the language of the future reads better than it sings.


The main Joy website includes source code for the canonical implementation of the language as well as documentation and tutorials.

For this session, I used this blog entry by a Joy newbie for several examples.

Today's code file contains three interpreters:

There are several good tutorials on concatenative languages out on the web, most of which are compatible with my discussion of Joy here. For this lecture, I used Why Concatenative Programming Matters for several examples. The author has links to a programming language of his own design, too. This tutorial goes deep quickly, so don't worry if you can't follow the whole discussion.

Wrap Up

Image Credits

The images in my opening slideshow come from the following sources:

I almost used this XKCD and this slide about FP. If only we had more time...™.


As crazy as this looks, there is theory to support how concatenative languages work. How?

Recall: 2 is a function that takes no inputs and returns one integer, itself. The more common * is a function that takes two integer inputs and returns one integer:

    2 :: () → (int)
    3 :: () → (int)
    * :: (int, int) → (int)

But, as written, we cannot compose these functions. The range of 2 does not match the domain of 3. The range of 2 · 3, whatever that is, does not match the domain of *.

The basic idea is to give every function a generic data type that accounts for the full stack made available to it. Each function can be thought of as taking any input stack, as long as it is "topped" with the values it actually needs. The function passes the rest of the stack through unchanged, with its actual return values pushed on top.

    2 :: (A) → (A, int)
    3 :: (B) → (B, int)
    * :: (C, int, int) → (C, int)

Now we can match B = (A, int) and compose 2 with 3:

    2 3 :: (A) → (A, int, int)

This is quite nice! The meaning of 2 3 is clear. It is a function that takes no input and returns both 2 and 3. Joy thus has functions that return multiple values!

Then we match C = A in the result and compose 2 3 with *:

    2 3   :: (A) → (A, int, int)
    *     :: (C, int, int) → (C, int)
    2 3 * :: (A) → (A, int)

It works! The program 2 3 * takes no inputs from the stack and produces one integer. Note, too, that this is the same as the type of 6, which is the value computed by 2 3 *.

    6 :: (A) → (A, int)

This new approach to typing, called row polymorphism, does just what we need. Thanks to row polymorphism, we have a uniform way to compose functions of different types, in a way that makes the flow of data through a program clear. As a bonus, concatenative languages are able to give us something that applicative functional languages usually don't: functions that return true multiple values.

Eugene Wallingford ..... ..... April 26, 2016