Session 9
Writing Recursive Programs
Opening Exercise
Suppose that I have a list of course enrollment numbers and need to know if any of the sections is too big for the classrooms available to us.
After I map a function over this list, I have a list containing just the current section enrollments:

Write a function (any-bigger-than? n lon), where n is a number and lon is a list of numbers. any-bigger-than? returns true if any number in lon is bigger than n, and false otherwise.
For example:
> (any-bigger-than? 2 '(2 1 3))
#t
> (any-bigger-than? 40 '(26 37 41 25 12))
#t
> (any-bigger-than? 40 '(26 37 14 25 12))
#f
If you feel stumped: How would you solve this problem in Python?
Thinking About a Solution
How would you approach this task in a loopy language? You might write a loop that compares n to each successive element in lon. As soon as you find a larger list item, you can return true. How do you know there are no numbers in lon larger than n? Only by examining every item in lon and never finding one.
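def any_bigger_than(n, lon):
    for i in lon:
        if i > n:
            return True    # found a larger item, so stop early
    return False           # examined every item without finding one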
This is the same kind of thinking you need to write a recursive function in Racket. We don't need a loop, because we walk down the list recursively.
Could we use MapReduce to solve this problem? Would we want to?
We can, but we may not want to. MapReduce processes every item in the list. As soon as we find one value that meets the criterion, our function can return its answer. Continuing down the rest of the list is unnecessary.
Don't feel bad if this problem seems like a big challenge at this point. Most things are difficult when we lack the knowledge we need to solve them. Sometimes, we have the knowledge but don't have a clear plan for which knowledge to use when, or why.
Over the next few sessions, we will learn some techniques that help us think about problems and recursive solutions in a new way. These techniques will be quite useful when we move on to processing languages.
A Quick Tour of Recursive Programs

To recur is to go back. Recursion is a technique for writing programs that goes back to the same function for help.
Loops go back, too. What is different about functions?
As you saw in the "99 Bottles of Beer on the Wall" example, iteration and recursion are pretty similar. We make the same decisions and solve the same sub-problems no matter which one we are using. The footnote on that example went even further: recursion is more powerful than a loop, because it gives us more control over the order in which we do things!
Be not afraid. If you can write a loop, then you can write recursive functions, too.
Recursive programs pair up nicely with inductive specifications:
- A recursive program returns an answer for one or more simple members of the set.
- It computes answers for more complex problems in terms of the answers to simpler problems. It can call itself to solve some of the simpler problems.
More formally, we will say that every recursive program consists of:
- one or more base cases that terminate computation in a pre-defined answer, and
- one or more recursive cases that compute solutions in terms of simpler problems.
The "limit" of the smaller problems is one of the base cases.
The base cases are often pretty straightforward, though we still have to think. They encode our understanding of the problem by recording the answer to the smallest problem(s) in the domain.
Each recursive case consists of three steps:
- Split the data into smaller pieces.
- Solve the pieces.
- Combine the solutions for the parts into a single solution for the original argument.
This is usually where the descriptions of recursion end in our textbooks. "Okay," you might say, "great. But how do I do that?" The goal of the next few weeks is to help you feel this in your bones: Recursion doesn't have to be scary. Sometimes, it's all about the data.
The driver for this process is the type of the arguments that our function processes. This is where inductive specifications can help.
Writing Recursive Programs for Inductively-Specified Data
In our last session, we saw that we can use inductive definitions to specify data types. An inductive definition is one that:
- lists one or more specific members of the type, and
- describes how to construct more complex members from simpler ones.
Inductive specifications have essentially the same structure as recursive programs. For this reason, inductive data specs — especially ones formalized in a BNF description — can serve as a powerful guide for writing recursive programs that operate on the data.
In fact, this guidance is so useful that I offer you a Little Schemer-style commandment based on it:
When defining a program to process
an inductively-defined data type,
the structure of the program
should follow
the structure of the data.
This rule tells us that if the inductive definition of the input data consists of a choice among subtypes, then the recursive definition of the function should consist of a choice among those same subtypes.
From there, we rely on the data types of the function's input and output to guide the three steps of a recursive function:
- We split the data into sub-problems based on the type of the argument our function receives.
- When we deal with the sub-problem that is topologically similar to the original problem, we solve that sub-problem with a recursive call.
- We combine the solutions for the sub-parts based on the type of value returned by the function.
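Here is a minimal sketch of how those three steps can show up in code. The function sum-list is a hypothetical example, not one of the functions developed in these notes, and it assumes its argument is a proper list of numbers.

```racket
#lang racket

;; sum-list is a hypothetical example used only to label the three steps.
;; Assumption: lon is a proper list of numbers.
(define (sum-list lon)
  (if (null? lon)                    ; split: the argument is either empty or a pair
      0                              ; base case: the sum of no numbers
      (+ (first lon)                 ; combine: the result is a number, so we add
         (sum-list (rest lon)))))    ; solve: recursive call on the smaller list

(sum-list '(26 37 41))   ; => 104
```

The choice of + as the combining step comes from the return type: the function produces a number, so the pieces are joined arithmetically.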
An Example: list-length
To see how this works, let's write a function list-length that takes a list of numbers as an argument and returns the length of the list, just like the primitive length function.
You may recall the definition for the data type called <list-of-numbers> from last session:
<list-of-numbers> ::= ()
                    | (<number> . <list-of-numbers>)
This BNF definition can serve as a pattern for defining any program that operates on lists of numbers. A function that operates on a <list-of-numbers> will receive one of two things as an argument:
- the empty list, (), or
- a pair whose car is a <number> and whose cdr is a <list-of-numbers>.
According to the data definition, these are the only possibilities! There are no other cases to worry about.
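One way to see that the BNF leaves only two possibilities is to turn it into a predicate. The sketch below is an illustration only: list-of-numbers? is a hypothetical helper name, not something defined in these notes, and its two main clauses mirror the two alternatives of the definition.

```racket
#lang racket

;; list-of-numbers? is a hypothetical helper, named here for illustration.
(define (list-of-numbers? v)
  (cond
    [(null? v) #t]                              ; alternative 1: ()
    [(pair? v) (and (number? (car v))           ; alternative 2: the car is a <number> ...
                    (list-of-numbers? (cdr v)))] ; ... and the cdr is a <list-of-numbers>
    [else #f]))                                 ; anything else is outside the type

(list-of-numbers? '())                 ; => #t
(list-of-numbers? '(26 37 41 25 12))   ; => #t
(list-of-numbers? '(26 a 41))          ; => #f
```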
Consider the example list, (26 37 41 25 12). From the perspective of our function, this list is the pair (26 . (37 41 25 12)). But our function can't see any of the numbers in the 'rest' until it follows the link, so in reality, the list appears to be (26 . <list-of-numbers>).
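A short check at the REPL can make the "pair" view concrete. This sketch uses only the built-in first, rest, cons, and equal?; the name lon is just a local definition for the example.

```racket
#lang racket

(define lon '(26 37 41 25 12))

(first lon)                              ; => 26
(rest lon)                               ; => '(37 41 25 12)
(equal? lon (cons 26 '(37 41 25 12)))    ; => #t
(equal? lon '(26 . (37 41 25 12)))       ; => #t, dotted notation reads as the same list
```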
You might think that "hiding" the rest of the numbers will somehow make our job harder, but in fact it is the source of our power! We can now focus on solving a single task: combining the results.
The definition of a <list-of-numbers> consists of a choice. A function that operates on a <list-of-numbers> will have to make the same choice: is the argument an empty list or a pair? For lists, we use null? to make this choice. This boolean condition serves as the selector in an if or cond expression that defines actions to take for each arm.
We can start writing (list-length lon) with the pattern for a function that we have used many times before:
(define (list-length lon) ... )
Following the rule above, our program's structure should mimic the structure of the BNF specification for the data type. A list of numbers is either an empty list or a pair. So, we start with the code for a choice:
(define (list-length lon)
  (if (null? lon)
      ;; then: handle an empty list
      ;; else: handle a pair
      ))
Now we can write code to handle the two cases in either order. Often, the base case has a simple answer, so we usually write this case first. How many numbers are there in an empty list? Zero, so return 0:
(define (list-length lon)
  (if (null? lon)
      0
      ;; else: handle a pair
      ))
Now, we handle the second part of the specification. What if lon is not empty? The BNF for this element states that such a list of numbers consists of a number followed by a list of numbers:
<list-of-numbers> ::= (<number> . <list-of-numbers>)
This tells us that we can decompose our problem into two sub-problems:
- the length of the first part of the pair, and
- the length of the second part of the pair.
How many numbers are in the first? How many numbers are in the rest? How do we combine these answers?
The first of the list is a number, not a list. It contributes one number to the length of the overall list.
The rest of the list is the rest of the list. (Ha!) It, too, is a <list-of-numbers>, the same data type as the argument to list-length. How can we find its length? Call list-length!
How do we combine two numbers that are counting the parts of a whole list? This function returns an integer, the total number of numbers in the list. So we add the two numbers together.
So, the pair has a length of 1, for the number in the cons cell we are processing, plus the result of (list-length (rest lon)):
(define (list-length lon)
  (if (null? lon)
      0
      (+ 1 (list-length (rest lon)))))   ; can use add1
Let's try it out:
> (list-length '())
0
> (list-length '(42))
1
> (list-length '(1 10 100 1000 10000 100000 2 4 6 8))
10
And our definition is complete!
As we take successive rests of the list, we will eventually encounter the empty list, which is our base case. But we don't have to think about that now. We received either an empty list or a pair.
In fact, you almost never want to think like that, walking down the list in your head. Taking that first step lands you in quicksand, which only draws you further into complications that won't help.
Notice: We do not add an explicit guard to keep our code from taking the rest of a non-list. Our code cannot make this error! The function takes the rest of its argument only after it knows the argument is not the empty list. But then the only alternative is a pair, which always has a cdr that is a list.
This requires that we assume the original argument received by list-length is, in fact, a <list-of-numbers>. The specification for the function states as much. This precondition makes it the responsibility of the caller of the function to provide a suitable argument. If the caller doesn't, then our function is not responsible for the error. The same is true in a statically-typed language, though in that case we usually have the compiler to catch the error for us.
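If a caller does want the precondition checked at run time, one option is to wrap the function rather than clutter the recursion. The sketch below is only an illustration of that idea: checked-list-length is a hypothetical name, and the notes themselves leave the check to the caller.

```racket
#lang racket

(define (list-length lon)
  (if (null? lon)
      0
      (+ 1 (list-length (rest lon)))))

;; checked-list-length is a hypothetical wrapper, not part of the notes.
;; It verifies the precondition once, then delegates to list-length.
(define (checked-list-length v)
  (unless (and (list? v) (andmap number? v))
    (raise-argument-error 'checked-list-length "list of numbers" v))
  (list-length v))

(checked-list-length '(26 37 41))   ; => 3
```

The recursive function stays exactly as the data definition shaped it; the check happens once, at the boundary.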
A Solution for any-bigger-than?
With our new knowledge of how to write recursive functions for inductively-specified data, let's return to any-bigger-than? from our warm-up exercise:
Write a function (any-bigger-than? n lon), where n is a number and lon is a list of numbers. any-bigger-than? returns true if any number in lon is bigger than n, and false otherwise.
For example:
> (any-bigger-than? 2 '(2 1 3))
#t
> (any-bigger-than? 40 '(26 37 41 25 12))
#t
> (any-bigger-than? 40 '(26 37 14 25 12))
#f
Notice: any-bigger-than? takes a <list-of-numbers> as its main argument and returns a boolean value.
We now know to pattern our solution on the BNF definition of <list-of-numbers>. So:
(define (any-bigger-than? n lon)
  (if (null? lon)
      ;; then: handle an empty list
      ;; else: handle a pair
      ))
There are no numbers in an empty list, so we know that there are no numbers bigger than n. In the base case, then, we return false.
(define (any-bigger-than? n lon)
  (if (null? lon)
      #f
      ;; else: handle a pair
      ))
In the recursive case, n is smaller than a member of lon when:
- either it is smaller than the first
- or it is smaller than a member of the rest
The first of the list is a number, so we can compare it directly to n. The rest of the list is a list of numbers, so we let our function solve that case for us.
Racket gives us built-in ways to express both the comparison, <, and the disjunction, or. Because or stops evaluating as soon as one of its operands is true, the function stops walking the list at the first larger number. So:
(define (any-bigger-than? n lon)
  (if (null? lon)
      #f
      (or (< n (first lon))
          (any-bigger-than? n (rest lon)))))
If you wrote a complete solution to the exercise, it may have been different from this: you may have used another if in the recursive case. The version here is more faithful to the BNF for our data type specification and to how we define the answer, so most functional programmers prefer it. Also, the if expression requires an else clause, which is an opportunity to make an error in the code. However, the two solutions compute the same value. The most important thing is that you develop a habit for writing recursive functions by thinking in this way.
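For comparison, here is one way the nested-if variant might look. This is a sketch of the alternative described above, not a solution taken from the notes; the name any-bigger-than?/if is made up so that both versions can coexist.

```racket
#lang racket

;; any-bigger-than?/if is a hypothetical name for the nested-if variant.
(define (any-bigger-than?/if n lon)
  (if (null? lon)
      #f
      (if (< n (first lon))
          #t                                     ; the first is bigger: answer found
          (any-bigger-than?/if n (rest lon)))))  ; otherwise ask about the rest

(any-bigger-than?/if 2 '(2 1 3))              ; => #t
(any-bigger-than?/if 40 '(26 37 14 25 12))    ; => #f
```

The inner if returns #t in one arm and the recursive answer in the other, which is exactly what or expresses more directly.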
When you are first writing functions of this type, you may well feel uncomfortable trusting that your solution works in the recursive case, because that means relying on the function that you are writing. The only way to overcome this discomfort is to get lots of practice writing recursive functions. With practice comes confidence that the techniques you are using really do work. Of course, we will also test our functions thoroughly. Trust only goes so far.
Could we use higher-order functions to solve these problems?
Can we implement functions like any-bigger-than? and list-length without recursion, using only higher-order functions? We can! If you'd like, give it a try, and then check out this code file to see some solutions, as well as a discussion of a similar function, list-member?.
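As one possible answer, here is a sketch that uses Racket's built-in ormap and foldl. These are not necessarily the solutions in the linked code file; the /hof names are hypothetical, chosen only to keep them apart from the recursive versions.

```racket
#lang racket

;; Hypothetical names; the linked code file may take a different approach.

;; ormap stops at the first element for which the test is true.
(define (any-bigger-than?/hof n lon)
  (ormap (lambda (x) (< n x)) lon))

;; foldl visits every element, adding 1 to the running count each time.
(define (list-length/hof lon)
  (foldl (lambda (item count) (+ 1 count)) 0 lon))

(any-bigger-than?/hof 40 '(26 37 41 25 12))   ; => #t
(list-length/hof '(26 37 41 25 12))           ; => 5
```

Note the difference in behavior: ormap can quit early, just like our recursive solution, while a fold always processes the whole list.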
Manipulating Lists of Symbols
In order for us to gain strength as recursive programmers, let's practice on some less intuitive problems. That will force us to think more about both the problem and the solution.
These problems are important for two reasons. First, we will use the functions we write later in the course and in future homework assignments. But if that were the only reason they were important, we would need to understand only what they do, but not how they do it.
The second reason that they are important, though, is that they illustrate several common patterns in recursive programs and how to implement them. So it will be worth our effort to study in detail how they do what they do.
The rest of our examples today operate on values of the <list-of-symbols> data type. As its name suggests, <list-of-symbols> is quite similar to <list-of-numbers>. We can specify this data type inductively as:
<list-of-symbols> ::= ()
                    | (<symbol> . <list-of-symbols>)
The remove-upto Function
remove-upto takes two arguments, a symbol s and a list of symbols los. It returns a list just like los, minus all the symbols before the first occurrence of s. For example:
> (remove-upto 'b '(a b c))
'(b c)
> (remove-upto 'a '(b d f g a a a a a a))
'(a a a a a a)
Note that remove-upto does not modify the original los. In functional programming, our functions almost never modify their arguments; instead, they compute a new value for us.
We start with the familiar pattern for handling list recursion.
(define (remove-upto s los)
  (if (null? los)
      ; then: handle an empty list
      ; else: handle a pair
      ))
In the base case, los is empty, so there are no symbols to remove. So, we return ().
(define (remove-upto s los)
  (if (null? los)
      '()
      ; else: handle a pair
      ))
What if los is not empty? Then it is a pair of the form (<symbol> . <list-of-symbols>).
Our answer depends on (first los). If it is s, then the answer is los itself. If not, then we drop it and all of the symbols up to s in (rest los).
We can write that choice with another if expression:
(define (remove-upto s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          los
          ; else remove up to s from the rest of los
          )))
How can we remove all of the symbols up to the first s in (rest los)? The rest of the list is a list of symbols, so remove-upto can do that for us!
(define (remove-upto s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          los
          (remove-upto s (rest los)))))
We are done! Let's test our function:
> (remove-upto 'a '(a b c))
'(a b c)
> (remove-upto 'b '(a b c))
'(b c)
> (remove-upto 'd '(a b c))
'()
> (remove-upto 'a '())
'()
> (remove-upto 'a '(b d f g a a a a a a))
'(a a a a a a)
Our understanding of the <list-of-symbols> data structure, and especially of its BNF description, guided us well in writing this function. We still have to think, of course. But the structure helps us know what to think about.
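It can help to compare remove-upto with Racket's built-in memq, which also returns the tail of a list starting at the first match. The sketch below repeats the definition from above so it runs on its own; the comparison with memq is my illustration, not part of the notes. The two differ only when the symbol is absent.

```racket
#lang racket

(define (remove-upto s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          los
          (remove-upto s (rest los)))))

(define original '(a b c))

(remove-upto 'b original)   ; => '(b c)
(memq 'b original)          ; => '(b c)
(remove-upto 'd original)   ; => '()
(memq 'd original)          ; => #f, memq signals "not found" with #f
original                    ; => '(a b c), the argument is unchanged
```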
The remove-first Function
Now let's try a similar but slightly trickier problem.
remove-first takes two arguments, a symbol s and a list of symbols los. It returns a list just like los minus only the first occurrence of s. For example:
> (remove-first 'b '(a b c))
'(a c)
Note again that remove-first does not modify the original los. It will build a new list.
We start with the familiar pattern for handling list recursion.
(define (remove-first s los)
  (if (null? los)
      ; then: handle an empty list
      ; else: handle a pair
      ))
In the base case, los is empty, so there is no first occurrence of s to remove. So, return ().
(define (remove-first s los)
  (if (null? los)
      '()
      ; else: handle a pair
      ))
What if los is not empty? Then we need to remove the first occurrence of s from a pair, if there is one. There are two cases. Either the first element in los is the symbol we want to remove, or it is not.
(define (remove-first s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          ; then remove s from the first of los
          ; else remove s from the rest of los
          )))
If s is the first element in los, then it is the first occurrence, and we want to remove it. remove-first can return the rest of the list:
(define (remove-first s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          (rest los)
          ; else remove s from the rest of los
          )))
Now comes the tougher case... If the first element of los is not the symbol we want to remove, then we want to include it in our answer, and we need to remove the first occurrence of that symbol from the rest of the list.
What is the answer to be returned by remove-first in this case? We need a list
- whose first item is the first of los, and
- whose rest is the list we get by removing s from the rest of los.
We reassemble a list from a first and a rest using cons. We get the rest of the new list by removing s from the rest of los.
(define (remove-first s los)
  (if (null? los)
      '()
      (if (eq? (first los) s)
          (rest los)
          (cons (first los)
                (remove-first s (rest los))))))
And we are done! Let's test our function:
> (remove-first 'a '(a b c))
'(b c)
> (remove-first 'b '(a b c))
'(a c)
> (remove-first 'd '(a b c))
'(a b c)
> (remove-first 'a '())
'()
> (remove-first 'a '(a a a a a a a a a a))     ; count 'em up!
'(a a a a a a a a a)
Once again, our understanding of the structure of a list of symbols guided us well. Following the BNF description helps us know what to think about as we consider the specific details of the task.
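Since the nested if is really a three-way choice, the same function can also be written with cond. This is a sketch of an equivalent formulation, not the version developed in the notes; remove-first/cond is a hypothetical name used to keep the two apart.

```racket
#lang racket

;; remove-first/cond is a hypothetical name for a cond-based rewrite.
(define (remove-first/cond s los)
  (cond
    [(null? los) '()]                               ; empty list: nothing to remove
    [(eq? (first los) s) (rest los)]                ; found it: drop the first
    [else (cons (first los)                         ; keep the first ...
                (remove-first/cond s (rest los)))])) ; ... and remove s from the rest

(remove-first/cond 'b '(a b c))   ; => '(a c)
(remove-first/cond 'd '(a b c))   ; => '(a b c)
```

The first clause still corresponds to the empty-list alternative in the BNF; the other two clauses together handle the pair alternative.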
Wrap Up
- Reading
  - Read today's notes. Make sure to study today's examples of recursion carefully. Then, begin to use the structural recursion technique as you work on...
  - For a little more practice, check out this short reading on the remove function.
- Homework
  - Homework 4, which will be available soon and is due in a week.