CS 3540: Session 7

Session 7
Thinking Functionally

Programming with Higher-Order Functions

Opening Exercise: A More Efficient Map

Last time, we had a file listing the masses of any number of modules, one per line:

> (file->lines "modules.txt")
'("12" "14" "1969" "100756")

... and wrote a function to compute the total amount of fuel needed to send all of the modules into space:

(define total-fuel
  (lambda (filename)
    (apply +
      (map mass->fuel
        (map string->number
           (file->lines filename))))))

This solution walks the list of modules *four* times! If the database is large, this might be prohibitively expensive. We can't do anything about loading the file or adding up the answers just yet, but...

YOUR CHALLENGE: call map only once!

For example:

> (define modules (file->lines "modules.txt"))
> (map (lambda (module) ?????)
       modules)
'(2 2 654 33583)

Ask yourself:

What do the values passed to your lambda expression look like?
How can you convert one of those values into the value we need to add up?

A Solution

Starting with one call to map as our foundation, we can focus on the rest of the task. modules is a list of strings. map will pass our lambda expression one string at a time. Our lambda should return the amount of fuel needed for that module.

Given a module string, we want to convert it to a number (mass) and ask mass->fuel to convert it to a fuel total. We can do those tasks together with a nested call:

(map  (lambda (module)
        (mass->fuel
          (string->number module)))
      modules)

If we prefer, we can name the lambda:

(define module->fuel
  (lambda (module)
    (mass->fuel (string->number module))))

and pass the named function to map:

(map module->fuel modules)

That's not so bad, is it?
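
For comparison, the same idea can be sketched in Python. This is only an illustration: these notes do not show the definition of mass->fuel, so mass_to_fuel below assumes the rule "divide by three, round down, subtract two", chosen because it reproduces the sample output above. The Python names are hypothetical stand-ins for the Racket ones.

```python
# Assumed rule for mass->fuel (not shown in these notes): floor(mass / 3) - 2.
def mass_to_fuel(mass):
    return mass // 3 - 2

# Stand-in for (file->lines "modules.txt"): one mass per line, as strings.
modules = ["12", "14", "1969", "100756"]

# Two calls to map: convert every string, then convert every number.
two_maps = list(map(mass_to_fuel, map(int, modules)))

# One call to map: a single helper does both conversions for each module.
def module_to_fuel(module):
    return mass_to_fuel(int(module))

one_map = list(map(module_to_fuel, modules))

print(one_map)   # [2, 2, 654, 33583]
```

Both versions produce the same list; the second simply moves the composition of the two conversions into the helper.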

Now we can add them up to compute the total fuel needed:

(apply + (map module->fuel modules))

We can also use the result from mapping to compute the module requiring the least fuel:

(apply min (map module->fuel modules))

or the average fuel needed:

(apply average (map module->fuel modules))

or name it and use it later when we know what we want to do:

(define fuel-needs
  (map module->fuel modules))

We can also put all the parts together to solve the original problem:

(define total-fuel
  (lambda (filename)
    (apply +
      (map module->fuel
           (file->lines filename)))))

A sequence of maps can usually be collapsed into a single map with a more powerful helper function.
Consider Problem 5 on Homework 3... after a quick review of the programming patterns.

Recap: Programming with Higher-Order Functions

For the last couple of sessions, we have been trying out a new way to write programs: asking functions to do more work for us. We found that apply and map will do a lot of work for us, if only we supply them with a helper function.

The purpose of map is to process all the items in a list in the same way:

[Diagram: two lists of the same length; a function is applied to each item in the first list and produces the corresponding item in the second list.]

The purpose of apply is to combine all the items in a list into a single value:

[Diagram: a list of values and a single value; a function is applied to all of the items in the list and produces the value.]

Used together, they create a powerful design pattern for solving an entire class of problems:

(apply reducer
  (map item-function
       list-of-values))
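
One way to see the pattern as a single reusable piece is to write it as a function that takes the reducer and the item function as arguments. The Python sketch below is illustrative only: map_reduce is a hypothetical name, and the reducer is assumed to accept a list of values, which is the role apply plays for us in Racket.

```python
def map_reduce(reducer, item_function, values):
    """Apply item_function to each value, then combine the results with reducer."""
    return reducer(list(map(item_function, values)))

words = ["map", "apply", "filter"]

# Total length of the strings: map len, reduce with sum.
print(map_reduce(sum, len, words))   # 14

# Longest length: same item function, different reducer.
print(map_reduce(max, len, words))   # 6
```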

Why is programming this way such a challenge for you? You are used to thinking about problems in a different way. When you encounter a problem to solve, you start thinking about it — breaking it down into parts, solving the parts, putting the parts back together — in a particular way. These are habits you have learned and practiced for at least a couple of semesters.

Changing your mindset to a functional approach requires you to establish new habits and to break old ones. Creating new habits is a challenge, even when we want to change.

I know that many of you are not asking to change habits, or develop a new programming style. But it will make you a better programmer, and it will prepare you for something that is happening in industry right now. Give it a try, and you will be surprised.

One thing we can do when we are trying to break old habits and create new ones is to watch for the triggers that cause us to fall back into an old habit and have a plan for what to do instead. Let's see if we can identify triggers for some common procedural habits and match them up with alternative courses of action in functional programming.

An Exercise to Set the Stage

Suppose that we had the data for Homework 3's margin of victory problem in a Python list:

>>> games = [ [102, 51], [78, 67], [53, 94],     # first six games
              [ 64, 75], [54, 71], [64, 68] ]

Write Python code to solve the problem:

Write a function named total_margin that takes one argument, a list of this form. The function returns the total of all the differences in the list.

For example:

>>> total_margin(games)
135

Python has an abs function, too. But this is Python, so you can use assignment statements and a loop!

If you don't know Python, imagine that your data is in a language you do know (a Java ArrayList, a C array, ...) and write your code in that language.

A Solution

Unshackled from the chains of Racket and functional programming, we might produce code that looks like this:

total_margin = 0
for p in games:
    difference = abs(p[0]-p[1])
    total_margin += difference

That's a pretty typical way for a procedural programmer to tackle the problem, in many different languages. Notice that we never have to index into the list of lists directly: the loop treats the main list as a list of games, and within the loop we access only the two elements of a single game.

Now let's try to diagnose how we wrote the code and how we might think differently.

Growing a Solution in Python

Let's look closer at our loop:

total_margin = 0
for p in games:
    difference = abs(p[0]-p[1])    # 1. do something with p
    total_margin += difference     # 2. accumulate result from #1

This loop does two kinds of things: it processes each element of the list, and it processes the results. In functional programming, we usually try to separate different tasks into different pieces of code. Each function should have one responsibility.

Trigger: A loop does different kinds of action.
Action: Decompose the loop into separate loops, each with a single responsibility.

Let's start with the "do something with p" part of the loop. Instead of processing the results immediately, we can save them to be processed later.

results = []
for p in games:
    difference = abs(p[0]-p[1])    # 1. do something with p
    results.append(difference)     # 2. record result from #1

In functional programming, we like to let functions help us solve problems. Let's factor the "do something with p" action out into its own function:

def margin_for(two_list):
    return abs(two_list[0]-two_list[1])

... and then use the new function in our solution:

results = []
for p in games:
    difference = margin_for(p)     # 1. do something with p -- in a fxn
    results.append(difference)     # 2. record result from #1

Trigger: A loop that does something with every item in a collection.
Action: Map a function over the list.

map implements the entire loop. We just have to give it the margin_for() function to apply to each item:

results = map(margin_for, games)

Yes, that is Python! The Python map function produces a "map object" that we can loop over, not a list, but the idea is similar.
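
One practical consequence of getting a map object rather than a list: the object produces its values on demand and can be traversed only once. A short sketch, with the data and margin_for repeated so that it runs on its own:

```python
games = [[102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68]]

def margin_for(two_list):
    return abs(two_list[0] - two_list[1])

results = map(margin_for, games)

first_pass = list(results)    # consumes the map object
second_pass = list(results)   # nothing is left to produce

print(first_pass)    # [51, 11, 41, 11, 17, 4]
print(second_pass)   # []
```

If the values are needed more than once, convert the map object to a list and reuse the list.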

We have made progress toward our solution:

games   = [ [102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68] ]
results = map(margin_for, games)
# [51, 11, 41, 11, 17, 4]

Now, let's implement the second part of our original loop: add up the results:

total_margin = 0
for r in results:
    total_margin += r               # 1. accumulate sum from item r

Trigger: A loop that combines the value for every item into a single answer.
Action: Use a reducing function.

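Python keeps its general-purpose reducer, reduce, in the functools module rather than among the built-ins, but it does have a built-in sum function that adds up the items of a list. (sum accepts any iterable, including a map object, so the call to list is optional.) So we can replace the entire total_margin loop with: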

total_margin = sum(list(results))

We can even get rid of the temporary variable results by substituting the expression that computes it in place of the name:

total_margin = sum(list(map(margin_for, games)))

In Racket, we have been using apply when we reduce lists. apply implements the entire loop. We just have to give it a reducing function, such as + or average.

Note: That last line of code is not the way we would solve this problem in Python using functional style. The Pythonic way is to use a list comprehension.
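
As a sketch of what that note means, the call to map and the margin_for helper can be folded into a comprehension. The loop variable names first and second are illustrative.

```python
games = [[102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68]]

# List comprehension: build the list of margins, then add them up.
margins = [abs(first - second) for first, second in games]
total_margin = sum(margins)

print(margins)        # [51, 11, 41, 11, 17, 4]
print(total_margin)   # 135
```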

Growing a Solution in Racket

Now that we know the triggers, we can think about implementing our solution functionally in Racket. First, let's port our data back to a Racket list...

(define games
  '((102 51) (78 67) (53 94) (64 75) (54 71) (64 68)))

... and our margin_for() function to Racket, as margin-for...

(define margin-for
  (lambda (two-list)
    (abs (- (first two-list)
            (second two-list)))))

Now we can map margin-for over the list...

(define results (map margin-for games))

... use apply to total up our results...

(define total-margin (apply + results))

Of course, we can do this without a temporary variable in Racket, too:

(apply + (map margin-for games))

And that is the body of the function we need to write:

(define total-margin
  (lambda (games)
    (apply + (map margin-for games))))

The apply expression can use any reducing function. Operators such as + and min are built-in functions that apply can use to combine values. average is a custom function we wrote for apply to use when combining values.
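
To illustrate swapping reducers, here is a Python sketch that computes the margins once and combines them three ways. The average function is a hypothetical stand-in for the custom Racket average, whose definition is not shown in these notes. Unpacking a list with * is Python's rough counterpart to apply, and reduce from functools is its general-purpose reducer.

```python
from functools import reduce
from operator import add

games = [[102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68]]

def margin_for(two_list):
    return abs(two_list[0] - two_list[1])

# Hypothetical stand-in for the course's custom average function.
def average(*numbers):
    return sum(numbers) / len(numbers)

margins = list(map(margin_for, games))

total = reduce(add, margins, 0)   # like (apply + margins)
smallest = min(*margins)          # like (apply min margins)
mean = average(*margins)          # like (apply average margins)

print(total, smallest, mean)      # 135 4 22.5
```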

A Style of Programming

In the last few sessions and on Homework 3, we have been using a common programming pattern: map a function over a list, then apply a reducer to turn map's result into a single answer.

(apply reducer
  (map item-function
       list-of-values))

Our solution to Session 6's opening exercise does that:

(apply string
  (map first-char
       list-of-strings))

It processes a list of strings to create a list of characters and then reduces that list into a single string. Solutions to Problems 3 through 5 on the homework do something similar.

There are plenty of slight variations on this pattern. The two most common choices we face are:

- map a standard Racket function (say, Problem 3) or a custom function we write (say, Problem 4)
- apply a standard Racket function (say, Problem 4) or a custom function we write (say, Problem 3).

But there are others. Problem 5 required that we pre-process the list by dropping a header row with rest. On that problem, some of you found it convenient to do multiple map steps rather than write a more complex item function.

We are not limited by the pattern. It simply gives us a way to think about a problem and to structure our solution.

On first exposure, you might imagine that you'll never use functions such as map and apply after you finish this course, but you might be wrong... As noted last time, Google developed MapReduce to solve a common problem that arises when working with massive data sets. It is used widely in industry.

... O(n) and parallelism
... a visit to The Principal

Many of the functions we have been writing implement a simple form of MapReduce, using Racket's primitive functions. Next week, we will begin to learn techniques for writing other kinds of mappers and reducers.

Exercise: Status Check

Let's write another map-reduce function. Suppose that we have lists of strings of this sort:

(define names
  '("Johnny" "christine" "FRANK" "Juliette" "JOANNA" "eugene"))

Write a Racket function average-length that returns the average length of the strings in a list of strings.

For example:

> (average-length '("billie" "eilish"))
6

> (average-length names)
6 2/3

Racket has a primitive function named string-length that returns the length of a string.

We have defined a custom function named average.

Here's one possible solution.
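
The same map-then-reduce shape can also be sketched in Python. This is an illustration, not the course's own solution: average is a hypothetical stand-in for the custom function, and Fraction is used so that the result stays exact, like Racket's 6 2/3.

```python
from fractions import Fraction

# Hypothetical stand-in for the custom average function; exact, like Racket's.
def average(numbers):
    return Fraction(sum(numbers), len(numbers))

def average_length(strings):
    return average(list(map(len, strings)))

names = ["Johnny", "christine", "FRANK", "Juliette", "JOANNA", "eugene"]

print(average_length(["billie", "eilish"]))   # 6
print(average_length(names))                  # 20/3, that is, 6 2/3
```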

Another Pattern: Filtering a List

Here's our margin of victory data again:

(define games
  '((102 51) (78 67) (53 94) (64 75) (54 71) (64 68)))

Let's solve a different kind of problem:

Find the games that the home team won.

How might you solve this in Python? Here's an imperative solution:

results = []
for p in games:
    if p[0] > p[1]:                # 1. if p meets a condition
        results.append(p)          # 2. record it in our result

In a functional style, we would move the operation into a function:

def home_team_won(game):
    return game[0] > game[1]

And use the function:

results = []
for p in games:
    if home_team_won(p):           # 1. if p meets a condition, in a fxn
        results.append(p)          # 2. record it in our result

This looks a lot like the map steps in all of our previous solutions, but with an important twist: We don't compute an item to put in the list... We decide whether we want to put the original item in our result!

Trigger: A loop with an if statement that finds values meeting a condition.
Action: Use a filter.

filter is a function like map: it implements an entire loop. Instead of applying a function to every item and returning a list of results, it returns only the items that "pass the test" posed by its function argument.

To implement our solution, we can call filter and supply the test function, home_team_won():

results = filter(home_team_won, games)

As with map, Python's filter produces a "filter object" that we can loop over.
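
A short sketch of that behavior, with the data repeated so that the snippet runs on its own: wrapping the filter object in list shows which games pass the test.

```python
games = [[102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68]]

def home_team_won(game):
    return game[0] > game[1]

home_wins = list(filter(home_team_won, games))

print(home_wins)   # [[102, 51], [78, 67]]
```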

We can do all of this directly in Racket, too:

(define home-team-won
  (lambda (game)
    (> (first game) (second game))))

(filter home-team-won games)

As expected, Racket's filter returns a list.

We can also use the resulting list to compute other results. For example, we can compute the total margin of victory in games won by the home team by using it in the total-margin code.
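
For instance, here is a Python sketch of that combination: filter first, then map and reduce over what remains. It assumes, as home_team_won does, that the first score in each pair is the home team's.

```python
games = [[102, 51], [78, 67], [53, 94], [64, 75], [54, 71], [64, 68]]

def home_team_won(game):
    return game[0] > game[1]

def margin_for(two_list):
    return abs(two_list[0] - two_list[1])

def total_margin(games):
    return sum(map(margin_for, games))

# Only the games won by the home team reach total_margin.
home_win_margin = total_margin(filter(home_team_won, games))

print(home_win_margin)   # 62
```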

Putting It All Together

map, filter, and apply are useful separately, but their real power comes when we use them together.

Recall our list of strings:

(define names
  '("Johnny" "christine" "FRANK" "Juliette" "JOANNA" "eugene"))

Write a Racket function total-starting-with, which returns the total number of characters in the names that start with a given letter.

For example:

> (total-starting-with "j" names)
20
> (total-starting-with "e" names)
6
> (total-starting-with "a" names)
0

Use the function (string-downcase str) to convert each string to lowercase before processing.

Use the function (string-prefix? str prefix) to determine if str starts with prefix.

If you need any other primitive function, ask!

When we process strings in this way, we usually have to convert the strings to a canonical form before processing. string-downcase helps us do that here.

Evolving a Solution

; (map string-downcase names)
;
; (filter (lambda (s)
;           (string-prefix? s "j"))
;         (map string-downcase names))
;
; (map string-length
;      (filter (lambda (s)
;                (string-prefix? s "j"))
;              (map string-downcase names)))
;
; (apply + (map string-length
;               (filter (lambda (s)
;                         (string-prefix? s "j"))
;                       (map string-downcase names))))

; letter is a one-character string such as "j", not a Racket char
(define total-starting-with
  (lambda (letter list-of-strings)
    (apply + (map string-length
                  (filter (lambda (s)
                            (string-prefix? s letter))
                          (map string-downcase list-of-strings))))))
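
For comparison with the Python examples earlier in the session, the same pipeline might look like this in Python. It is a sketch rather than a prescribed translation: str.lower and str.startswith stand in for string-downcase and string-prefix?.

```python
def total_starting_with(letter, strings):
    lowered = map(str.lower, strings)                            # canonical form
    matching = filter(lambda s: s.startswith(letter), lowered)   # keep the matches
    return sum(map(len, matching))                               # lengths, then total

names = ["Johnny", "christine", "FRANK", "juliette", "Joanna", "eugene"]

print(total_starting_with("j", names))   # 20
print(total_starting_with("e", names))   # 6
print(total_starting_with("a", names))   # 0
```

Each line of the function body corresponds to one layer of the nested Racket expression, read from the inside out.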

Thinking Functionally

The patterns of data in our solutions look something like this:

MAP      from  (d1 d2 d3 d4 d5 d6 ...)
           to  (v1 v2 v3 v4 v5 v6 ...)

FILTER   from  (d1 d2 d3 d4 d5 d6 ...)
           to  (d1    d3       d6 ...)

APPLY    from  (d1 d2 d3  ...)
           to  n

You can create new habits, with attention and practice. Take baby steps. Use the REPL to help you build code you trust. Practice, practice, practice.

Wrap Up

Reading
- Nothing new. Review the notes for Sessions 1-7 along with any reading assignments given in them.
Homework
- Homework 3 was due last night.
Quiz 1
- The quiz comes at the end of Session 8, on Thursday. It will cover our readings so far and our in-class coverage of Racket and functional programming style. This includes:
  - Racket's built-in data types (primitives) and functions
  - Racket expressions (means of combination) and data structures
  - Racket definitions and functions (means of abstraction)
  We have paid special attention to Racket functions and how they differ from functions in other languages.
- I don't write study guides, but... Most sessions have an orientation section near the beginning, sometimes with a header like "Recap" or "Where Are We?". Today's is called Recap: Programming with Higher-Order Functions. These may help you see the terms and ideas we have been studying. And the items listed as the reading assignments each day are important, especially the short sections I wrote for you.
