Session 12

Recursive Programs and Functional Style


CS 3540
Programming Languages and Paradigms


[Figure: an array with max and second max marked]

Opening Exercise

This is an easy problem to visualize. Suppose you have a list of at least two numbers:

    (6 1 2 3 9 4 1 2 8 1 2 4)

Your task:

Write a function named (2nd-max lon) that returns the value of the second-largest item in lon.

You may solve this problem with or without recursion. Without, you will probably need higher-order functions. If you would like to use a function but aren't sure whether it's a Racket primitive, ask.

Bonus points to the shortest solution and to the most efficient solution!

~~~~~

How many of you wrote recursive solutions? Functional ones? How short? How efficient? What ideas did you try?



Functional Solutions

If run-time efficiency is no concern, we can solve this in almost no code: sort the list from largest to smallest, then return the second element.

    (define 2nd-max
      (lambda (lon)
        (second (sort lon >))))

Note that Racket has a primitive sort function that takes as arguments a list of values and a function for comparing the items in the list. sort can handle a list of any type as long as the comparator works on the values.
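
A quick sketch of that flexibility, using sample lists that are made up for illustration: the same call to sort works on numbers and on strings, as long as the comparator matches the type of the items.

```racket
#lang racket

; sort takes a list and a comparator that says when one item
; should come before another.  The sample data is hypothetical.
(sort '(6 1 2 3 9 4) >)                   ; numbers, largest first
(sort '("pear" "apple" "fig") string<?)   ; strings, alphabetical
```

The first call evaluates to (9 6 4 3 2 1) and the second to ("apple" "fig" "pear").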

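If the thought of sorting the entire list to find two items makes you uneasy, then we might try another approach:

    1. Find the largest item in the list.
    2. Remove that item from the list.
    3. Find the largest item in what remains.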

That may sound like it requires looping, but think back to the functional style of thinking we first saw in Session 6 when we computed BMIs. We can use the Racket primitives apply and max to do Steps 1 and 3. For Step 2, we need a function that can remove an item from a list. We wrote such a function for lists of symbols in Session 9, called remove-first. It turns out that Racket has a primitive named remove that does the same thing, but for any type of list!

All we need to do is invoke this sequence functionally:

    (define 2nd-max
      (lambda (lon)
        (apply max                   ; step 3
               (remove               ; step 2
                   (apply max lon)   ; step 1
                   lon))))
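
To see what each step in the comments above produces, here is a small sketch that evaluates them one at a time on the list from the opening exercise. The names step-1, step-2, and step-3 are hypothetical, introduced only for this illustration.

```racket
#lang racket

(define lon '(6 1 2 3 9 4 1 2 8 1 2 4))

(define step-1 (apply max lon))       ; largest item: 9
(define step-2 (remove step-1 lon))   ; (6 1 2 3 4 1 2 8 1 2 4)
(define step-3 (apply max step-2))    ; largest of what remains: 8

step-3
```

Note that remove deletes only the first matching item, so a list with two copies of the maximum, such as (9 5 9), still has 9 as its second-largest value.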

... how do these compare to other solutions the class offered?

The first solution is O(n log n). The latter is O(n), but makes three passes down the list. Can we do better?



Recursive Solutions

A list of numbers such as:

    (6 1 2 3 9 4 1 2 8 1 2 4)

has an inductive definition:

    <list-of-numbers> ::= ()
                        | (<number> . <list-of-numbers>)

We know that the lists we are processing have at least two items, but after we consider those two, we have a good old list of numbers.

How might we approach this problem if we could write a loop? We would probably create two local variables, largest and second, and initialize them using the first two items in the list. Then we would look at each remaining item to see if it is greater than either of these items and, if so, update the variables. Easy enough, though we have to think about several cases.
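
For comparison, the loop described above might look like this in Racket if we allowed ourselves mutation. This is only a sketch: the name 2nd-max-loop is hypothetical, and the second variable is called runner-up here so that it does not collide with Racket's second function.

```racket
#lang racket

; An imperative sketch (hypothetical name): two local variables,
; initialized from the first two items, updated inside a loop.
(define 2nd-max-loop
  (lambda (lon)
    (let ((largest   (max (first lon) (second lon)))
          (runner-up (min (first lon) (second lon))))
      (for ((n (rest (rest lon))))
        (cond ((> n largest)             ; new largest: old one moves down
               (set! runner-up largest)
               (set! largest n))
              ((> n runner-up)           ; only the second place changes
               (set! runner-up n))))
      runner-up)))

(2nd-max-loop '(6 1 2 3 9 4 1 2 8 1 2 4))
```

The cond has an implicit third case, in which neither variable changes.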

We can use structural recursion to build a function that works in this way:

    (define 2nd-max
      (lambda (lon)
        (if (null? lon)
            ; return the answer
            ; handle a pair
        )))

In both cases, though, we need access to the same two local variables. The then clause needs to return the value of second. The else clause needs to compare the next item in the list to largest and second and update them accordingly. If our function is to have access to these variables, it will have to take them as arguments.

So, our recursive function has to be a helper:

    (define 2nd-max-tr
      (lambda (largest second lon)
        (if (null? lon)
            second
            ; handle the pair with
            ; a recursive call
        )))

We can implement the recursive case using the same sort of if expressions we saw in our algorithm above:

    (define 2nd-max-tr
      (lambda (largest second lon)
        (if (null? lon)
            second
            (cond ((> (first lon) largest)
                   (2nd-max-tr (first lon) largest (rest lon)))
                  ((> (first lon) second)
                   (2nd-max-tr largest (first lon) (rest lon)))
                  (else
                   (2nd-max-tr largest second (rest lon)))))))

Now we can write 2nd-max as an interface procedure. It initializes the variables with the values of the first two items in the list:

    (define 2nd-max
      (lambda (lon)
        (2nd-max-tr
              (max (first lon) (second lon))
              (min (second lon) (first lon))
              (rest (rest lon)))))

The result is a function that examines each item in the list exactly once and finds the second-largest item. Our recursive solution is longer but more efficient: it makes only one pass over the list.

The helper function is a bit complicated, though. We handle all three possibilities for (first lon) -- largest, second largest, and neither -- with separate cond clauses, each of which repeats the recursive call. Perhaps we can do better. Let's use the interface procedure as inspiration for something simpler:

    (define 2nd-max-tr
      (lambda (largest second lon)
        (if (null? lon)
            second
            (2nd-max-tr
               ; new value of largest
               ; new value of second
               ; new value of lon
            ))))

In an imperative solution, we would assign new values to the two variables and branch to the top, which updates the position in the array for the next pass. Doing the same thing here, on the recursive call, simplifies the helper. After a little work, we arrive at:

    (define 2nd-max-tr
      (lambda (largest second lon)
        (if (null? lon)
            second
            (2nd-max-tr
               (max largest (first lon))
               (max second  (min largest (first lon)))
               (rest lon)))))

Notice that 2nd-max-tr is tail-recursive, an idea we learned about last time. It looks a bit like a loop!
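
To make the loop-like behavior concrete, here is the simplified helper together with the interface procedure, followed by a trace of the arguments on a short list. The sample list (3 9 4 8) is made up for illustration.

```racket
#lang racket

(define 2nd-max-tr
  (lambda (largest second lon)
    (if (null? lon)
        second
        (2nd-max-tr
           (max largest (first lon))
           (max second  (min largest (first lon)))
           (rest lon)))))

(define 2nd-max
  (lambda (lon)
    (2nd-max-tr
          (max (first lon) (second lon))
          (min (second lon) (first lon))
          (rest (rest lon)))))

; Trace of (2nd-max '(3 9 4 8)):
;   (2nd-max-tr 9 3 '(4 8))
;   (2nd-max-tr 9 4 '(8))
;   (2nd-max-tr 9 8 '())
;   8
(2nd-max '(3 9 4 8))
```

Each recursive call plays the role of one pass through a loop body: all three "variables" receive their new values at once.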



Comparing Our Solutions

Note: This section is just an outline of the things we talked about in class. I'll fill it in as time permits.

How might we compare these solutions?

    ... code length
    ... complexity for programmer
    ... run-time efficiency
        ...
    ... familiarity

What happens if we are not given the precondition that 2nd-max always receives a list with at least two items? How should our function act if it receives fewer than two items?

    ... we can guard against the "error" case in all solutions.
    ... when we have an interface procedure, we can guard once
    ... after that, we recurse on the full definition of a list of
        numbers.  No need to test (rest (rest lon))!
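
One way to guard once in the interface procedure is sketched below. Signaling an error is an assumption made for this sketch; the notes do not say how the function should behave on a short list, and returning a special value would be another option.

```racket
#lang racket

(define 2nd-max-tr
  (lambda (largest second lon)
    (if (null? lon)
        second
        (2nd-max-tr
           (max largest (first lon))
           (max second  (min largest (first lon)))
           (rest lon)))))

; The guard lives only in the interface procedure.  The helper
; still recurses on the plain definition of a list of numbers.
(define 2nd-max
  (lambda (lon)
    (if (or (null? lon) (null? (rest lon)))
        (error '2nd-max "expected at least two numbers, got ~a" lon)
        (2nd-max-tr
              (max (first lon) (second lon))
              (min (second lon) (first lon))
              (rest (rest lon))))))

(2nd-max '(6 1 2 3 9 4 1 2 8 1 2 4))
```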

... the difference between writing and reading code.
... refactoring.
... local variables.



Static Properties of Variables

Now, on to a new topic. Next session, and for most of the rest of the course, we will use what we have learned about functional and recursive programming to explore issues in programming languages and their paradigms. We begin with the static properties of variables.

A property is static when its value can be determined by looking at the text of a program. A property is dynamic when the program must be executed in order to determine the property. We can use static properties of a program to detect errors and to improve program performance at interpretation or compile time.

We have already discussed one property of variables this semester: their data type. In Ada, the data type of a variable is static. In Java, it is static, with a twist allowed by the substitutability of objects of different classes. In Racket, the data type of a variable is dynamic -- though we typically write our code with a specific set of values and operations in mind.
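
A small sketch of what "dynamic" means for types in Racket: the same variable can hold values of different types at different times, so its type can only be checked while the program runs. The names kind-of and x are hypothetical.

```racket
#lang racket

; Report the run-time type of a value (a hypothetical helper).
(define kind-of
  (lambda (v)
    (cond ((number? v) 'number)
          ((string? v) 'string)
          (else        'other))))

(define x 42)
(kind-of x)            ; x holds a number now

(set! x "forty-two")
(kind-of x)            ; the same variable now holds a string
```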

In this session, we will begin to explore some of the static properties that variables can have.



A Little Language

Much of our work the rest of the semester will use small languages as a way to study programming language features. Today we introduce such a language. We will use a family of similar languages to study different language features throughout the course.

This little language has very few features, but it has just enough for us to study the topic du jour, which is the static property of free and bound variables.

     <exp> ::= <varref>
             | (lambda (<var>) <exp>)
             | (<exp> <exp>)

This little language provides a bare minimum of features: an expression is either a variable reference, a function of one parameter, or an application of a function to one argument.
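
As a sketch, we can represent programs in this language as quoted Racket data and check them against the grammar with structural recursion. The name little-exp? is hypothetical, and the sketch assumes that lambda is never used as a variable name.

```racket
#lang racket

; Does exp match the grammar of the little language?
(define little-exp?
  (lambda (exp)
    (cond ((symbol? exp) #t)                          ; <varref>
          ((and (list? exp)                           ; (lambda (<var>) <exp>)
                (= (length exp) 3)
                (eq? (first exp) 'lambda))
           (and (list? (second exp))
                (= (length (second exp)) 1)
                (symbol? (first (second exp)))
                (little-exp? (third exp))))
          ((and (list? exp) (= (length exp) 2))       ; (<exp> <exp>)
           (and (little-exp? (first exp))
                (little-exp? (second exp))))
          (else #f))))

(little-exp? 'x)                      ; a variable reference
(little-exp? '(lambda (x) x))         ; a function of one parameter
(little-exp? '((lambda (x) x) y))     ; an application
(little-exp? '(lambda (x y) x))       ; two parameters: not in the language
```

The first three calls return #t and the last returns #f.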

Quick Exercise: Earlier, we discussed the three things that every programming language has. Which of these features does our little language have? Which features is it missing?

Quick Answer: This language does not have primitive values of the sort you are used to; its primitives are variable references. Its means of combination is a subset of Racket's parenthesized operations. The language has procedural abstraction via lambda. If its variables could name something, it would have naming abstraction, too.

This language is Racket-like in its use of parentheses and its use of lambda to define a nameless function, but it is universal in the features it provides. All languages you know and love have the same three features, plus many more. We will add more features to this language as time goes by, always with an eye to how our language processors would manipulate programs written in the language.



Free and Bound Variables

As you know, a variable can have a value or not have a value. Of course, in some languages, such as Java, some variables always have a value, even if the program does not initialize them. For example, a Java int field defaults to 0, and a field of object type defaults to null. (Java local variables get no default; the compiler rejects any read that comes before an assignment.)

Unfortunately, null isn't really a value at all. Many of you encounter run-time "null pointer exceptions" as you learn how to write Java programs. These exceptions point out that the variable in question doesn't really have a value, and as a result the program cannot proceed.

On top of this idea of a variable having a value is the idea of how the variable receives its value. If a variable has the same name as a formal parameter of a method, then we know that the variable is "bound" to that declaration, and that the value of the variable will be the value passed as the corresponding argument to the method.

    int sumOfSquares( int m, int n )
    {
        // m and n are bound to formal parameters
        return m*m + n*n;
    }

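Let's define "boundedness" and some related ideas that apply to all programming languages:

    A variable reference is bound if it refers to a formal parameter declared by an enclosing function.
    A variable reference is free if it is not bound by any enclosing function.
    A combinator is a function that contains no free variables.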

In Racket and our little language, above, lambda expressions define functions and so are the source of boundedness.

Examples

Now let's see some examples from our little language that illustrate these concepts:
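
Here is a sketch that checks a few sample expressions mechanically. It assumes the usual definitions: a variable occurs free in an expression if some reference to it is not enclosed by a lambda that declares it, and occurs bound if some reference to it is. The names occurs-free? and occurs-bound? are hypothetical here, and the code assumes lambda is never used as a variable name.

```racket
#lang racket

(define occurs-free?
  (lambda (var exp)
    (cond ((symbol? exp) (eq? var exp))                   ; <varref>
          ((eq? (first exp) 'lambda)                      ; (lambda (<var>) <exp>)
           (and (not (eq? var (first (second exp))))
                (occurs-free? var (third exp))))
          (else                                           ; (<exp> <exp>)
           (or (occurs-free? var (first exp))
               (occurs-free? var (second exp)))))))

(define occurs-bound?
  (lambda (var exp)
    (cond ((symbol? exp) #f)
          ((eq? (first exp) 'lambda)
           (or (occurs-bound? var (third exp))
               (and (eq? var (first (second exp)))
                    (occurs-free? var (third exp)))))
          (else
           (or (occurs-bound? var (first exp))
               (occurs-bound? var (second exp)))))))

(occurs-free?  'x '(lambda (y) (x y)))    ; x is free
(occurs-bound? 'y '(lambda (y) (x y)))    ; y is bound
(occurs-free?  'x '((lambda (x) x) x))    ; the last x is free ...
(occurs-bound? 'x '((lambda (x) x) x))    ; ... and the x in the body is bound
```

All four calls return #t. The last two show that the same variable name can occur both free and bound in one expression.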

We often write combinators in Racket. For example, the compose function, which implements the sort of function composition you learn about in algebra, is a combinator:

    (define compose
      (lambda (f g)
        (lambda (x)
          (f (g x)))))
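
A short usage sketch: every variable referenced in the body of compose (f, g, and x) is a parameter of an enclosing lambda, so the function depends on nothing outside itself. The helper square is hypothetical, introduced only for this example.

```racket
#lang racket

(define compose
  (lambda (f g)
    (lambda (x)
      (f (g x)))))

(define square (lambda (n) (* n n)))

((compose add1 square) 4)   ; square first, then add1: 17
((compose square add1) 4)   ; add1 first, then square: 25
```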

However, some of the functions you may think of as having only bound variables have free variables. This function is not a combinator:

    (define sum-of-applications
      (lambda (f x y)
        (+ (f x) (f y))))

Why? Because + is a free variable! Remember: Racket functions are bound to names in just the same way as any other value. You can verify this for yourself by evaluating:

    > (let ((+ *))
        ((lambda (f x y)
           (+ (f x) (f y)))
         add1 1 4))
    10

One note before we proceed. We will be writing programs to process programs in our little language. The data type we will be processing is defined by the grammar of the little language. That grammar uses Racket list notation, but that is an implementation detail. Calls to car, cdr, and cons will litter our code with implementation details. They will also obscure what our code means.

Soon we will write functions to access the parts of the expressions in our data type. These functions will allow us to determine which kind of expression we have and to extract the parts of a given expression.

The former are called type predicates, and the latter are called access procedures.
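
A minimal sketch of what such functions might look like for the little language. All of the names below are hypothetical; the versions we write next time may differ. The sketch assumes that lambda is never used as a variable name.

```racket
#lang racket

; Type predicates: which kind of expression is this?
(define varref? symbol?)
(define lambda?
  (lambda (exp)
    (and (pair? exp) (eq? (first exp) 'lambda))))
(define app?
  (lambda (exp)
    (and (pair? exp) (not (lambda? exp)))))

; Access procedures: pull out the parts of an expression.
(define lambda->param (lambda (exp) (first (second exp))))
(define lambda->body  third)
(define app->proc     first)
(define app->arg      second)

(define sample '((lambda (x) (f x)) y))

(app? sample)                       ; #t
(lambda->param (app->proc sample))  ; x
(lambda->body  (app->proc sample))  ; (f x)
(app->arg sample)                   ; y
```

Code written with these names says what it means, and only these few definitions need to know that expressions are stored as Racket lists.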

We will read more about these next time.



Wrap Up



Eugene Wallingford ..... wallingf@cs.uni.edu ..... February 15, 2018