Programming Languages and Paradigms

*"O, thou hast damnable iteration and
art indeed able to corrupt a saint."
-- Falstaff*,
in Shakespeare's *Henry IV, Part 1*

Write a Scheme procedure `(list-member? n lon)`, where
`n` is a number and `lon` is a list of numbers.
`list-member?` returns true if `n` occurs in
`lon`, and false otherwise. For example:

    > (list-member? 1 '(1 2 3))
    #t
    > (list-member? 1 '(4 3 2 1 26))
    #t
    > (list-member? 1 '(5 4 3 2 1))
    #t
    > (list-member? 1 '(2 3 4 5))
    #f

(Of course, Scheme provides a primitive `member` procedure
that *almost* does the job, but do not use it... We need
the practice!)

How do you approach the task at all? You'll want to compare
`n` to every element in `lon`, and as soon as you
find a match you can return true. How do you know that `n`
is *not* a member of `lon`? Only by examining every
item in `lon` and never finding a match. If your curiosity
gets the best of you,
peek ahead
to a solution...

Quick Exercise: Could we solve this problem without using recursion at all, and still not use `member`? Hint: Think about what we learned in Session 5. (If only Scheme's primitive `or` were a procedure...)

Would you like to know how to write such procedures, and similar but more complex ones? If yes, then you have come to the right place. If not, then you may need a curiosity transfusion! In any case, I hope that you learn something of interest over the next few sessions.

[Image: the Philippine Islands; see the note near the end of these notes]

**Recursion** is a technique for writing programs. Even for
words we think we know, checking out a dictionary definition can
help us to solidify our understanding. You can check out the
definition of
recursion
at
Merriam-Webster's Collegiate Dictionary.
Here is another:

recursion n. (1616)

- the act of returning
- (Math) the repeated application of a procedure to a preceding result to generate a sequence of values
- (Computing) a programming technique involving the use of a procedure ... or algorithm that calls itself ...

To *recurse* is to return to the same place. A function
or procedure can do that.

In computer science, a **recursive program** is one that:

- immediately returns an answer for one or more simple problems, and
- computes answers for more complex problems in terms of the answers to simpler problems, for which it calls itself.

As a result of the second part of this definition, we can see that a recursive program is defined, in part, in terms of itself. In practice, we create a procedure that calls itself from within its body.

We sometimes use recursive relationships to understand mathematical
properties. For example, suppose that we have a series of functions
for finding the power of a number `x`:

    to-the-power-of-0(x) = 1
    to-the-power-of-1(x) = 1 * x         = x * to-the-power-of-0(x)
    to-the-power-of-2(x) = 1 * x * x     = x * to-the-power-of-1(x)
    to-the-power-of-3(x) = 1 * x * x * x = x * to-the-power-of-2(x)
    ...

We can turn this into something more usable by turning the power itself into a variable and using an inductive definition to make the pattern explicit:

    to-the-power-of( x, 0 ) = 1
    to-the-power-of( x, n ) = x * to-the-power-of( x, n-1 )

In fact, there are programming languages -- such as Prolog and Haskell -- in which you write the recursive equations just like that!

In Scheme, we would write:

    (define to-the-power-of     ; behaves like the built-in procedure expt
      (lambda (base expt)
        (if (zero? expt)
            1
            (* base (to-the-power-of base (- expt 1))))))
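To see how the recursive case leans on the base case, it can help to unfold one small call by hand. The trace in the comments below is only a sketch of the substitutions involved, not a description of how any particular Scheme implementation evaluates the call. The definition is repeated so that the block runs on its own.

```scheme
;; Repeated from above so this block is self-contained.
(define to-the-power-of
  (lambda (base expt)
    (if (zero? expt)
        1
        (* base (to-the-power-of base (- expt 1))))))

;; Unfolding (to-the-power-of 2 3) by hand:
;;   (to-the-power-of 2 3)
;;   (* 2 (to-the-power-of 2 2))
;;   (* 2 (* 2 (to-the-power-of 2 1)))
;;   (* 2 (* 2 (* 2 (to-the-power-of 2 0))))
;;   (* 2 (* 2 (* 2 1)))
;;   8
(to-the-power-of 2 3)    ; => 8
(to-the-power-of 10 0)   ; => 1
```

Each step replaces a problem with a simpler one of the same kind, until the exponent reaches 0 and the base case answers immediately.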

The fundamental idea behind recursion is this: If a problem can
be defined in terms of a similar, yet *simpler* problem,
recursion may be a useful tool for expressing a solution.

Quick Exercise: Write a curried version of `power`. Could we use your curried version of `power` to create procedures for squaring and cubing an arbitrary number? Could we use your procedure to create procedures that compute arbitrary powers of, say, 2 or 10?

More formally, we will say that every recursive program consists of:

- one or more *base* cases that terminate computation in a pre-defined answer, and
- one or more *recursive* cases that compute solutions in terms of simpler problems. The "limit" of these smaller problems is one of the base cases.

Each recursive case consists of:

- Splitting the data into smaller pieces. For example:
  - breaking a list into parts using `car` and `cdr`
  - breaking a positive integer `n` into 1 and `n - 1`
- Solving the pieces, perhaps with recursive calls. A recursive call is, in effect, a way of assuming that one of the pieces is already solved!
- Combining the solutions for the parts into a single solution for the original data. For example:
  - reassembling a list from its parts using `cons`
  - reassembling a positive integer using `+`

This is usually where the descriptions of recursion end in our
textbooks. *"Okay,"* you might say, *"great. But how
do I do that??"*

In our last session, we saw that we can use inductive definitions to specify data types. An inductive definition is one that:

- lists one or more specific examples of the type, and
- describes how to construct more complex examples from simpler ones.

Inductive specifications have essentially the same structure as recursive programs. For this reason, inductive data specs -- especially ones formalized in a BNF description -- can serve as a powerful guide for writing recursive programs that operate on the data.

In fact, this guidance is so useful that I offer you a *Little
Schemer*-style commandment based on it:

When defining a program to process
an inductively-defined data type,
the structure of the program should follow the structure of the data.

To see how this works, let's create a procedure that operates on
a list of numbers. You may recall the definition for a data type
called
`<list-of-numbers>`
from last session:

    <list-of-numbers> ::= ()
                        | (<number> . <list-of-numbers>)

This BNF definition can serve as a pattern for defining programs
that operate on lists of numbers. A procedure that operates on
a `<list-of-numbers>` will receive one of two things
as an argument:

- the empty list, `()`, or
- a pair whose `car` is a `<number>` and whose `cdr` is a `<list-of-numbers>`.

According to the data definition, **those are the only
possibilities**!

Such a procedure can examine its argument to determine whether the
object fits the first "arm" of the specification (the empty list)
or the second (a pair). For lists, we commonly use a
`null?` test. This boolean condition serves as the selector
in an `if` or `cond` expression that defines actions
to take for each arm.

For example, suppose that we wanted to define a program for determining the length of a list of numbers. The pattern for a simple procedure is:

    (define list-length     ; so as not to clobber the primitive procedure
      (lambda (lon)
        ...
        ))

Our first task is to determine the structure of the procedure. Following the rule above, our program's structure should mimic the structure of the BNF specification for the data type. The definition says that a list-of-numbers is either an empty list or a pair. So, we start with the following code:

    (define list-length
      (lambda (lon)
        (if (null? lon)
            ;; then handle an empty list
            ;; else handle a pair
            )))

Now we can write code to handle the two cases in either order. The base case usually has a simple answer, so we often write it first. How should our procedure act when the list is empty? Of course, the length of the empty list is 0, so we add the following code:

    (define list-length
      (lambda (lon)
        (if (null? lon)
            0
            ;; else handle a pair
            )))

Now, we handle the second part of the specification. What if
`lon` is not empty? The BNF for this element states that
such a list of numbers consists of a number followed by a list of
numbers. This tells us that we can decompose our problem into two
subproblems:

- the length of the first part of the pair, and
- the length of the second part of the pair.

What is the length of the `car`? What is the length of
the `cdr`? How do we combine these answers?

The `car` of the list is not itself a list, but it does
contribute one item to the length of the overall list.

The `cdr` of the list is the rest of the list. It, too, is
a `<list-of-numbers>` -- the same data type as the
argument to `list-length`. How can we find its length?
Call `list-length`!

So, the pair has a length of 1 (for the cell that holds the number)
plus the result of `(list-length (cdr lon))`:

    (define list-length
      (lambda (lon)
        (if (null? lon)
            0
            (+ 1 (list-length (cdr lon))))))

    > (list-length '())
    0
    > (list-length '(a))
    1
    > (list-length '(q w e r t y u i o p))
    10

And our definition is complete!
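The same two-arm template carries over to other procedures on a `<list-of-numbers>`. As a minimal sketch, here is a procedure that adds up the numbers in a list. The name `list-sum` is our own invention for this illustration, not a Scheme primitive. Only the base value and the way we combine the `car` with the recursive answer change.

```scheme
;; list-sum is a hypothetical name chosen for this sketch.
(define list-sum
  (lambda (lon)
    (if (null? lon)
        0                               ; the sum of no numbers is 0
        (+ (car lon)                    ; the car contributes its value
           (list-sum (cdr lon))))))     ; the cdr is a smaller <list-of-numbers>

(list-sum '())          ; => 0
(list-sum '(1 2 3 4))   ; => 10
```

Compare this with `list-length`: there, the `car` contributes 1 no matter what number it holds; here, it contributes the number itself.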

Another way to think about the recursive case is this: Split the list
into its `car` and its `cdr`, which is also a
`<list-of-numbers>`. *Suppose that we already know the
answer for the cdr.* How can we solve the problem for the whole list?

Notice: We do *not* guard our code against the possibility of
trying to take the `cdr` of a non-list. Written as it is, it
*cannot* make this error! The procedure takes the `cdr`
of its argument only after it knows the argument is not the empty
list. But then the only alternative is a pair, which has a
`cdr`.

This assumes that the argument received by `list-length` is,
in fact, a `<list-of-numbers>`. The specification for
the procedure states as much. This *precondition* makes it the
responsibility of the caller of the procedure to provide a suitable
argument. If the caller doesn't, then our procedure is not responsible
for the error. The same is true in a statically-typed language,
though in that case we usually have the compiler to catch the error
for us.

Indeed, a check inside `list-length` would ordinarily be
redundant, since the caller will be guarding the call on its
end of the computation!
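If a caller does want to guard the call on its end, one possible guard is a predicate that follows the same BNF. The sketch below is only an illustration: `list-of-numbers?` is a hypothetical helper that we are making up here. It is not a Scheme primitive and not part of `list-length`.

```scheme
;; list-of-numbers? is a hypothetical helper for callers.
(define list-of-numbers?
  (lambda (obj)
    (if (null? obj)
        #t                                    ; first arm: the empty list
        (and (pair? obj)                      ; second arm: a pair ...
             (number? (car obj))              ; ... whose car is a number
             (list-of-numbers? (cdr obj)))))) ; ... and whose cdr is a list of numbers

(list-of-numbers? '(1 2 3))   ; => #t
(list-of-numbers? '(1 a 3))   ; => #f
(list-of-numbers? 42)         ; => #f
```

Because the argument here can be *any* object, this procedure does need the `pair?` test that `list-length` can safely omit.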

We could use the same technique to implement `list-member?`,
from our
warm-up exercise.
`list-member?` returns true if `n` occurs in `lon`,
and false otherwise.

We now know to pattern our solution on the BNF definition of
`<list-of-numbers>`. So:

    (define list-member?
      (lambda (n lon)
        (if (null? lon)
            ;; then handle an empty list
            ;; else handle a pair
            )))

In the base case, we know that `n` cannot be a member of an empty
list, so we return false.

    (define list-member?
      (lambda (n lon)
        (if (null? lon)
            #f
            ;; else handle pair
            )))

In the recursive case, `n` is a member of `lon`
**either** if it is "a member of the `car`"
**or** if it is a member of the `cdr`, the rest of
`lon`. The `car` of the list is a number, so we can
check to see if `n` is equal to it. Scheme provides us with
built-in ways to express both the equality, the procedure `=`, and
the disjunction, the special form `or`, so:

    (define list-member?
      (lambda (n lon)
        (if (null? lon)
            #f
            (or (= n (car lon))
                (list-member? n (cdr lon))))))

If you wrote a complete solution to the exercise, it probably differed
slightly from this one by using another `if` in the recursive
case. The version here is more faithful to the BNF for our data type
specification and to how we think about the question, so many
functional programmers prefer it. However, either solution is fine.
The most important thing is that you develop a habit of writing
recursive procedures by thinking in this way.
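For comparison, a version that uses a second `if` in the recursive case might look like the sketch below. The name `list-member-if?` is hypothetical; we use it only to keep this variant apart from `list-member?`.

```scheme
;; list-member-if? is a hypothetical name for the two-if variant.
(define list-member-if?
  (lambda (n lon)
    (if (null? lon)
        #f
        (if (= n (car lon))
            #t                                  ; found a match in the car
            (list-member-if? n (cdr lon))))))   ; otherwise, look in the cdr

(list-member-if? 1 '(4 3 2 1 26))   ; => #t
(list-member-if? 1 '(2 3 4 5))      ; => #f
```

Both versions stop as soon as they find a match, because `or` does not evaluate its second argument when the first is true.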

Quick Exercise: Can we eliminate the first `if` expression, too?

When you are first writing procedures of this type, you may well feel uncomfortable "trusting" that your solution works in the recursive case, since it relies on the procedure that you are writing. The only way to overcome this discomfort is to do thorough testing of the procedure -- and to get lots of experience writing recursive procedures!

In order for us to gain strength as recursive programmers, let's
practice on some less intuitive problems. I borrow these examples
from other textbooks, most notably Section 1.2.2 of
Essentials of Programming Languages.
I have used *EOPL* for this course in the past.

These problems are important for two reasons. First, we will use
the procedures we write later in the course and in future homework
assignments. But if that were the only reason they were important,
we would need to understand only **what** they do, but
not how they do it.

The second reason that they are important, though, is that they
illustrate several common **patterns in recursive programs** and
how to implement them. So it will be worth our effort to study in
detail **how** they do what they do.

Our examples today operate on values of a `<list-of-symbols>`
data type. As its name suggests, `<list-of-symbols>` is
quite similar to `<list-of-numbers>`. We can specify this
data type inductively as:

    <list-of-symbols> ::= ()
                        | (<symbol> . <list-of-symbols>)

`remove-first` takes two arguments, a symbol `s` and a list
of symbols `los`. It returns a list just like `los` minus
the first occurrence of `s`. For example:

    > (remove-first 'b '(a b c))
    (a c)

Note that `remove-first` does *not* modify the original
`los`. In functional programming, our procedures almost never
modify their arguments; instead, they compute a new value for us.

We start with the familiar pattern for handling list recursion.

    (define remove-first
      (lambda (s los)
        (if (null? los)
            ; then handle an empty list
            ; else handle a pair
            )))

In the base case, `los` is empty, so the result of removing
the first occurrence of `s` is `()`.

    (define remove-first
      (lambda (s los)
        (if (null? los)
            '()
            ; else handle a pair
            )))

What if `los` is not empty? There are two cases. Either the
first element in `los` is the symbol we want to remove, or it
is not.

    (define remove-first
      (lambda (s los)
        (if (null? los)
            '()
            (if (eq? (car los) s)
                ; then remove s from the car of los
                ; else remove s from the cdr of los
                ))))

If `s` is the first element in `los`, what is the
answer returned by `remove-first`? The rest of the list:

    (define remove-first
      (lambda (s los)
        (if (null? los)
            '()
            (if (eq? (car los) s)
                (cdr los)
                ; else remove s from the cdr of los
                ))))

Now comes the tough case... If the first element of `los` is
*not* the symbol we want to remove, then we need to remove the
first occurrence of that symbol from the rest of the list. What is
the answer to be returned by `remove-first` in this case? We
need a list whose `car` is the `car` of `los`
and whose `cdr` is the list we get by removing `s` from
the *rest* of `los`:

... Show examples of removing `b` from `(a b c d)` and `(e d c b)` and `(c d e)` ...

... Draw pictures of lists that show the result is *making a list* from a head element and a tail list ... `cons`!
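Working the three suggested examples by hand might look like the sketch below. Since `remove-first` is not finished yet, we *assume* the answer for the `cdr` and write it out as a literal list. Only the final `cons` step is actual code.

```scheme
;; Removing b from (a b c d): a is not b, so keep a in front of
;; the result of removing b from (b c d), which is (c d).
(cons 'a '(c d))      ; => (a c d)

;; Removing b from (e d c b): e is not b, so keep e in front of
;; the result of removing b from (d c b), which is (d c).
(cons 'e '(d c))      ; => (e d c)

;; Removing b from (c d e): c is not b, so keep c in front of
;; the result of removing b from (d e), which is (d e), unchanged.
(cons 'c '(d e))      ; => (c d e)
```

In every case the answer is made the same way: a head element is put in front of a tail list that someone else has already computed.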

We reassemble a list from a `car` and a `cdr` using
`cons`. Into which list do we `cons` the `car`
of `los`? The result of removing the first occurrence of
`s` from the `cdr` of `los` -- which
`remove-first` can compute for us!

    (define remove-first
      (lambda (s los)
        (if (null? los)
            '()
            (if (eq? (car los) s)
                (cdr los)
                (cons (car los)
                      (remove-first s (cdr los)))))))

And we are done! Let's test our procedure:

    > (remove-first 'a '(a b c))
    (b c)
    > (remove-first 'b '(a b c))
    (a c)
    > (remove-first 'd '(a b c))
    (a b c)
    > (remove-first 'a '())
    ()
    > (remove-first 'a '(a a a a a a a a a a))    ; count 'em up!
    (a a a a a a a a a)

Quick Exercise: Suppose that, instead of `(cons (car los) (remove-first s (cdr los)))` as the 'else' clause of the second `if`, we had just `(remove-first s (cdr los))`. What function would `remove-first` then compute?

Our understanding of the list-of-symbols data structure -- and
especially of its BNF description -- guided us well in writing this
procedure. We still have to think, of course. The task presented a
couple of challenges. But **the structure helps us know what to think
about**.

The function `remove` behaves like `remove-first`, but
it removes all occurrences of the symbol, not just the first. The
structures of `remove-first` and `remove` are so similar
that we can focus on how to modify `remove-first` to convert it
into `remove`.

The only change we need to make is in the case that we find the symbol in the list. This means:

- In the base case, our answer is still the empty list.
- If the first item in the list does not match the item to remove, then we still need to `cons` into our recursive solution.

    (define remove
      (lambda (s los)
        (if (null? los)                        ; on an empty list, the
            '()                                ; answer is still empty
            (if (eq? (car los) s)
                ;; WHAT DO WE DO HERE?
                (cons (car los)                ; we still have to preserve
                      (remove s (cdr los)))    ; non-s symbols in los
                ))))

In `remove-first`, as soon as we find `s` we return
the rest of the `los`, into which are `cons`ed any
non-`s` symbols that preceded `s` in `los`.
But in `remove`, we need to be sure to remove not just the
first `s` (by returning the `cdr` of `los`)
but **all** the `s`'s, including any that may be lurking
in `(cdr los)`. So:

    (define remove
      (lambda (s los)
        (if (null? los)
            '()
            (if (eq? (car los) s)
                (remove s (cdr los))        ;; *** HERE IS THE CHANGE! ***
                (cons (car los)
                      (remove s (cdr los)))))))

    > (remove 'a '(a b c))
    (b c)
    > (remove 'a '(a a a a a a a a a a))
    ()

Notice the relationship between the structure of the data and the
structure of our code. The structure of the data did not change
from `remove-first` to `remove`, so neither did the
structure of the procedure. *A small change in spec resulted in
a small change in code*.
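To make the difference concrete, we can run both procedures on a list in which the symbol occurs more than once. The block below repeats the two definitions from above so that it is self-contained. The test inputs are our own.

```scheme
(define remove-first
  (lambda (s los)
    (if (null? los)
        '()
        (if (eq? (car los) s)
            (cdr los)                     ; stop at the first match
            (cons (car los)
                  (remove-first s (cdr los)))))))

(define remove
  (lambda (s los)
    (if (null? los)
        '()
        (if (eq? (car los) s)
            (remove s (cdr los))          ; keep looking for more matches
            (cons (car los)
                  (remove s (cdr los)))))))

(remove-first 'a '(a b a c a))   ; => (b a c a)
(remove       'a '(a b a c a))   ; => (b c)
(remove-first 'z '(a b a c a))   ; => (a b a c a)
(remove       'z '(a b a c a))   ; => (a b a c a)
```

The two procedures agree whenever the symbol occurs at most once in the list. They differ only in what they do with the `cdr` after a match.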

`remove-first` and `remove` demonstrate the basic technique
for writing recursive programs based on inductive data specifications.
This is a pattern you will find in many programs, both functional and
object-oriented. We call this pattern
**structural recursion**.

Structural recursion is the basis for nearly every procedure we
write. Occasionally, we will encounter bumps along the way to
a solution. Rather than pitching structural recursion and flailing
at our code without guidance, we will look for ways to get over, or
around, the bump. Over the next few sessions, we will learn several
techniques that we can use when we encounter difficulties using
structural recursion. The first of these is the
**interface procedure**.

Unless you are omniscient, writing a recursive procedure will
occasionally require "fixing" the procedure along the way instead
of writing it straight through from beginning to end. Consider
the procedure `annotate`, which takes as its only argument
a `<list-of-symbols>`. For example, if we pass to
`annotate`

    (jerry george elaine kramer)

it returns a list with each symbol *annotate*d by its
position in the list:

    ((jerry 1) (george 2) (elaine 3) (kramer 4))

We can use structural recursion to build the framework of our answer:

    (define annotate
      (lambda (los)
        (if (null? los)
            ; then handle an empty list
            ; else handle a pair
            )))

The base case of the data spec is the empty list. In this case, an empty list can be returned, since there are no items to annotate:

    (define annotate
      (lambda (los)
        (if (null? los)
            '()
            ; else handle a pair
            )))

Quick Interlude: Be careful with that base case... We've been using `()` as our result a lot -- but why? Under what conditions might the value of the base case be different?

The inductive case is a symbol followed by a list of symbols. We can
combine the annotated symbol with the rest of the list annotated using
`cons`. The result is:

    (define annotate
      (lambda (los)
        (if (null? los)
            '()
            (cons <something computed from (car los)>
                  (annotate (cdr los))))))

When we write a procedure that computes a list of answers, one for each item in the original list, we will often use a piece of code that looks just like this. It will constitute a common mechanism for "putting our answer back together".

How can we annotate a symbol? By creating a list consisting of the symbol and its position in the list:

    (define annotate
      (lambda (los)
        (if (null? los)
            '()
            (cons (list (car los) position)
                  (annotate (cdr los))))))

Oops! We've run into a slight problem. We need the position of the symbol in the list, but we haven't supplied it anywhere. We could pass the current position down to each recursive call:

    (define annotate
      (lambda (los position)
        (if (null? los)
            '()
            (cons (list (car los) position)
                  (annotate (cdr los) (+ position 1))))))

This does the work we need, but we have two related problems:

- What is the initial value of `position`, the one used on the *first* call to `annotate`? The caller will have to tell `annotate` to start at position 1!
- `annotate` is defined to take one argument, but we have produced a procedure that requires two. Often, we don't want to change the spec of a program, even when we have the power to do so, because other parts of our code may rely on the specified interface.

  In the case of `annotate`, changing the interface requires that *all* calls pass two arguments. Yet we always start annotating with position 1, and now we will have to repeat the 1 in every "first call" to `annotate`.

  Finally, we will no longer be able to `map` this procedure over a list of lists, because it takes two arguments. When given a single list, `map` requires a one-argument procedure. By taking `map` and similar higher-order procedures out of our toolbox, we give up much of the power and productivity in the functional style.

These reasons should persuade us to look for a different solution. Programmers face this problem all of the time and have developed a common "patch". First, rename this version of the solution as a helper procedure:

    (define annotate-with-position
      (lambda (los position)
        (if (null? los)
            '()
            (cons (list (car los) position)
                  (annotate-with-position (cdr los) (+ position 1))))))

Second, implement `annotate` as
a procedure that calls the renamed procedure:

    (define annotate-with-position
      (lambda (los position)
        (if (null? los)
            '()
            (cons (list (car los) position)
                  (annotate-with-position (cdr los) (+ position 1))))))

    (define annotate                       ;; now write annotate ...
      (lambda (los)
        (annotate-with-position los 1)))   ;; ... to jump-start the helper

We call the new `annotate` an **interface procedure**. It
serves as an interface to the procedure that does the real work.

Creating an interface procedure is a common practice in many kinds of
programming, including functional programming. It allows us to write
our code *naturally* -- in the way that follows our understanding
of the problem -- even when the task becomes complicated, without
disturbing the tranquility of the world in which the procedure resides.
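As a quick check that the one-argument interface has been preserved, we can call `annotate` directly and also `map` it over a list of lists. The definitions are repeated from above so that the block runs on its own. The sample inputs to `map` are our own.

```scheme
(define annotate-with-position
  (lambda (los position)
    (if (null? los)
        '()
        (cons (list (car los) position)
              (annotate-with-position (cdr los) (+ position 1))))))

(define annotate
  (lambda (los)
    (annotate-with-position los 1)))   ; callers never see the position argument

(annotate '(jerry george elaine kramer))
; => ((jerry 1) (george 2) (elaine 3) (kramer 4))

(map annotate '((a b c) (d e) ()))
; => (((a 1) (b 2) (c 3)) ((d 1) (e 2)) ())
```

Each inner list is numbered from 1 independently, because every call to `annotate` starts the helper over at position 1.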

The interface procedure pattern illustrates a valuable wisdom: When
you encounter a difficulty implementing structural recursion, *don't
give up on the technique*. We are following the structure of our
data for many good reasons. Instead of giving up, solve the new
difficulty. The problem we encountered while implementing
`annotate` is so common that other programmers have developed
a standard solution. This wisdom generalizes beyond structural
recursion to any well-justified technique, including most every
design pattern we use.

Take a close look at this image. The big island is Luzon, one of the Philippine Islands. In the middle of Luzon is Lake Taal. Inside Lake Taal is Vulcano Island. Notice Crater Lake, the dot of water in the middle of Vulcano Island. Crater Lake holds a unique distinction. It is the largest lake on an island in a lake on an island in the world.

How's that for recursion?

You can see this image along with a few other fun lake/island combinations at The Island and Lake Combination.

- Reading -- Read Chapters 1-3 of *The Little Schemer*. Make sure to **study today's examples** of recursion **carefully**. Then, begin to use the techniques you learn as you work on...
- Homework 4, which is available now and due a week from today.