Draw box and pointer diagrams for the following data objects:
Before drawing your diagrams:
How many boxes does each picture require?
After drawing your diagrams:
Is the first object a list? Is the second?
We are still discussing the three things that every language has -- primitive expressions, means of combination, and means of abstraction -- and what kinds of features Racket offers in each category. Last session, we discussed Racket's aggregate data types, in particular the pair and its abstraction, the list. This session, we discuss functions, a powerful means of behavioral abstraction. You will be able to take what you saw of Racket's data structures, apply it to the creation of functions, and begin to program in a new way.
Before we proceed...
We first learned about modules in Session 3 when we saw the built-in Racket module rackunit. We can also create our own modules and export definitions. Take a look at a simple example of creating and using a user-defined module.
Optional digression: Racket also provides a module special form for creating modules with many other features. We probably won't use this feature in class this semester, but you can read about it in the Racket Guide.
Now, on with the show.
We have identified several important things that appear in any programming language:
Until now, we have discussed abstraction only in terms of naming values, and the ability to refer to objects by their names. Last week, we discussed how we could name data values, and how we even do this in real life, with numbers such as π and e. We do have a few more things we can discuss about data, such as indirect access to objects. We will return to data abstraction in a few weeks.
Now we explore behavioral abstraction. Depending on the language we study, we might call these abstractions subprograms, procedures, or functions. At its core is a simple idea. We have some computation that we would like to do, such as computing the circumference of a circle:
(* 2 pi 1)     ; for a circle with radius = 1
(* 2 pi 10)    ; for a circle with radius = 10
(* 2 pi 14.1)  ; for a circle with radius = 14.1
We see an abstract operation here. We always multiply 2 and π by the radius of the circle. So we create an abstraction by keeping constant what is constant (the operation, the 2, and π) and turning what changes (the radius) into a parameter that can be specified each time we use the abstract operation.
In Python, we create and name our abstract operation using def:
def circumference(radius):
    return 2 * pi * radius
Most languages work in the same way: we create and name a function using a single language primitive. This combination limits how and when programmers can create and use functions. But in Racket and many other high-level languages, functions are objects that
In Racket, naming things and creating functions are both so important that they are different actions. Separating these two features turns out to be an extremely valuable tool for abstracting away the details of a computation. Racket's own built-in operators even work this way! We hinted at this important idea last week when we said that "* evaluates to a function that does multiplication". This means that * is really just a name for something that performs multiplication.
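To see that * is only a name, we can give the same function a second name. This is a small sketch; the name times is made up for illustration and is not part of Racket.

```racket
#lang racket

;; times is a hypothetical second name for the function that * names.
(define times *)

(times 3 4)      ; => 12, the same function under a different name
(eq? times *)    ; => #t, both names refer to the same function object
```

Nothing about the multiplication function changed. We only added a name for it, exactly as we would for a number.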
The idea that a process can exist independent of a name isn't as surprising as it sounds, because you have been dealing with this concept for many years -- even before you began your study of computer science.
Think about finding the square root of a number. There are many ways to perform the operation. In some situations, we may be referring to a particular method (say, use Newton's method or use a calculator). In others, we might not care and be free to use any process we know. Each of these processes exists independent of any particular name, but we can think of each of them as "take the square root". The same is true of sorting a pile of papers, or any other operation.
This leads us to several related ideas:
To create a function, we use the special form lambda:
(lambda (radius) (* 2 pi radius))
The term lambda comes from the lambda calculus of Alonzo Church. The lambda calculus is one of the foundations of computer science.
To name a function in Racket, we can use our old friend, define, to do what it has already been useful for: naming things. So:
(define circumference (lambda (radius) (* 2 pi radius)))
In your reading, you may have seen another way to define a Racket function that does not use lambda:
(define (circumference radius) (* 2 pi radius))
This is an example of syntactic sugar, an idea that we saw last time. Under the hood, Racket rewrites this shorthand expression into the lambda version! I'll continue to use the explicit lambda style in my code and encourage you to do so, too. This will help us become more comfortable with lambda, which will be essential to us in many cases.
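A quick way to convince ourselves that the two styles mean the same thing is to define the function both ways and compare the results. The two names below are invented for this sketch so that both definitions can live in one file; pi comes from Racket's math library, which #lang racket includes.

```racket
#lang racket

;; Explicit style: create the function with lambda, then name it with define.
(define circumference-explicit
  (lambda (radius)
    (* 2 pi radius)))

;; Shorthand style: the parameter list is folded into the define form.
(define (circumference-sugared radius)
  (* 2 pi radius))

;; Both definitions compute the same value for the same radius.
(= (circumference-explicit 10) (circumference-sugared 10))   ; => #t
```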
If we were studying many other languages, we would be nearly done with this topic. But because we are discussing functional programming in Racket, you will find that our consideration of procedural abstraction can go much longer and deeper. In fact, we have hardly begun to cover this important topic at all!
The need for procedural abstraction is pretty easy to come by. Think back to our opening exercise last time. In that problem, we want to determine the letter grade for a single person, but as your instructor, I need to assign letter grades for 29 students in the class. To do so, I would have to re-bind student-grade to a new value and then re-evaluate the cond or if expression that we wrote 28 more times. Copy and paste works wonders sometimes, but isn't there a more abstract idea at play here?
Important insight from computer science: Any process that repeats, whether in our programs or in our daily lives, is usually pointing us to an abstraction.
lambda is a special form that takes two arguments. (Do you remember what a special form is?) The first is an expression that specifies the function's formal parameters. The second is an expression that uses the parameters to specify an operation to compute. For example, we could write:
> (lambda (x) (* x x))
#<procedure>
When evaluated, a lambda expression returns a new procedure. The result you see, #<procedure>, is DrRacket's way of printing a compiled function. You will see something similar by evaluating the name of any of Racket's built-in functions:
> +
#<procedure:+>
The built-in function + is printed out in much the same way as the result of our lambda expression -- except that the print form indicates the function is the primitive named +. Racket treats all function objects, whether user-defined or built-in, in pretty much the same way.
Because the lambda expression evaluates to a function, we can use it as an operator to construct a compound expression:
> ( (lambda (x) (* x x)) 5 )
25
To find out what's going on here, let's return to the evaluation algorithm that we discussed last week. This is a compound expression, so we first evaluate each of the subexpressions. Racket evaluates the subexpressions of a function application from left to right, so the operator comes first. It has to: we must evaluate the operator in order to determine whether it is a special form or a function, and thus to know how to evaluate the rest of the arguments.
When we evaluate (lambda (x) (* x x)), we get a function. This particular function takes a single argument, x. When called, this function performs the operation defined by (* x x), using the value passed in for x.
Because the operator is a function, all of the remaining subexpressions are evaluated and passed to it.
The second subexpression, 5, is easy, because it is a number literal. Literals evaluate to themselves.
Next, we evaluate the expression as a whole. 5 is passed to the function as an argument, and replaces x wherever it appears in the body of the procedure. This produces the expression (* 5 5), which is then evaluated to produce a value of 25.
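The same steps apply when the argument is itself a compound expression. Here is a sketch that traces the evaluation in comments; the name result is introduced only so that we can inspect the value.

```racket
#lang racket

;; Step 1: the operator (lambda (x) (* x x)) evaluates to a function.
;; Step 2: the argument (+ 2 3) evaluates to 5.
;; Step 3: 5 replaces x in the body, giving (* 5 5).
;; Step 4: (* 5 5) evaluates to 25.
(define result
  ((lambda (x) (* x x)) (+ 2 3)))

result   ; => 25
```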
Quick Exercise: What happens if we pass two arguments to this lambda expression? Or none?
> ((lambda (x) (* x x)) 2 3)
#<procedure>: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 1
  given: 2
  arguments...:
> ((lambda (x) (* x x)))
#<procedure>: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 1
  given: 0
  arguments...:
These are type errors. We often think of type checking in terms of the types of values passed in as parameters, but the number of parameters is a part of the function's type. This function takes exactly one argument. If we pass it zero, or two, Racket's strong type checking catches it at run-time.
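Because the number of arguments is part of what a function is, we can ask a function about it. The sketch below uses Racket's procedure-arity and procedure-arity-includes?, and catches the arity error with with-handlers so that the program keeps running. The names square-one-argument and outcome are made up for this example.

```racket
#lang racket

(define square-one-argument
  (lambda (x) (* x x)))

;; Ask the function how many arguments it accepts.
(procedure-arity square-one-argument)               ; => 1
(procedure-arity-includes? square-one-argument 2)   ; => #f

;; Calling it with two arguments raises an arity exception at run time.
;; Here we catch that exception and return a symbol instead.
(define outcome
  (with-handlers ([exn:fail:contract:arity?
                   (lambda (e) 'arity-mismatch)])
    (square-one-argument 2 3)))

outcome   ; => 'arity-mismatch
```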
If we would like to apply this function to a different value, then we use it to construct another compound expression:
> ( (lambda (x) (* x x)) 7 )
49
That's not very convenient. We have to write (lambda (x) (* x x)) each time we want to apply the operation to a number. If we only need to use the function once or twice, we may be happy to do so. But if we'd like to use it more often, we can assign a name to the value returned by the lambda expression:
(define square (lambda (x) (* x x)))
This says that we want to associate the name square with a function value. Note that there is no difference between giving a name to a data object such as the number 5 and giving a name to a function object. Functions are data objects!
Having defined square, we use it just as we would any Racket function:
> (square 7)
49
> (square (+ 3 3))
36
> (+ 5 (square 3))
14
The second example shows that the argument to our function can be any expression, just like any other operator. The third example shows an expression returning a value that is used as the argument to a function. Notice that there is no difference in the way built-in operators and user-defined operators are used or applied. This simple syntax is one of the things that makes Racket such a powerful language.
We can also use square to define other functions. For example, x² + y² can be expressed as (+ (square x) (square y)). We can define a function, sum-of-squares, that takes any two numbers as arguments and produces the sum of their squares:
(define sum-of-squares (lambda (x y) (+ (square x) (square y))))
The function sum-of-squares calls the function square twice, once for each of its parameters, x and y, and returns the sum of the results. Having defined sum-of-squares, we can use it just like square or any other function.
Note that using the formal parameter name x in both procedures does not create a conflict, because each function evaluation takes place in its own context, where x is bound to a specific value. This is no different from using the same parameter name in two Java methods or two Python functions.
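We can watch the two bindings of x stay out of each other's way by following a single call. This sketch repeats the two definitions from above so that it runs on its own.

```racket
#lang racket

(define square
  (lambda (x) (* x x)))

(define sum-of-squares
  (lambda (x y)
    (+ (square x) (square y))))

;; In the call below, sum-of-squares binds x to 3 and y to 4.
;; While (square y) runs, square's own x is bound to 4,
;; but sum-of-squares's x is still 3.
(sum-of-squares 3 4)   ; => 25
```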
It should be clear to you why named functions are so powerful: they allow us to hide details and solve problems at a higher level of abstraction.
Just to make sure that the notion of functions is clear, take a look at another example:
> (define radius 5)
> (define pi 3.14159)
> (define circumference (* 2 pi radius))
> circumference
31.4159
> (define circumference (lambda (radius) (* 2 pi radius)))
> (circumference 6)
37.699079999999995
> (circumference 5)
31.4159
> (circumference radius)
31.4159
Do some practice on your own with these exercises that I designed just for you!
At this point, have we learned much that is new? You learned some Racket syntax, but you already have a decent understanding of functions from other languages. Perhaps the most surprising thing we've seen thus far is that we name functions in Racket the same way we name any other value. That feature hints that there is something more going on.
We say that a data object is first class when it is treated by the language in the same way that all other data objects are treated. We have now seen a couple of ways that functions are treated like other objects: we can use them as literal values, and we can name them in the same way we name any other data object.
In Racket, as in most languages you know, data values such as numbers and strings can be passed to functions as arguments and returned from functions as answers. In order for functions to be first class data values, we must be able to pass a function as an argument and be able to return a function as the value of applying a function. In Racket, we can. Furthermore, doing these things is a big part of the functional programming style.
First, let's consider passing one function as an argument to another function. Racket provides several primitive functions that expect function arguments. We can also write our own functions that take functions as arguments.
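As a first taste of writing our own, here is a sketch of a function that takes another function as an argument. The name apply-twice is invented for this example; it is not a Racket primitive.

```racket
#lang racket

;; apply-twice is a hypothetical helper: it calls f on x,
;; then calls f again on that result.
(define apply-twice
  (lambda (f x)
    (f (f x))))

(define square
  (lambda (x) (* x x)))

(apply-twice square 3)                 ; => 81, that is, (square (square 3))
(apply-twice (lambda (n) (+ n 1)) 5)   ; => 7, passing a function literal
```

Notice that apply-twice never needs to know which function it receives. It only needs to know that f can be called with one argument.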
Suppose that you would like to add up a list of values. Easy, right?
> (+ 1 2 3 4 5)
15
But what if you don't know what the list of values is? Perhaps another function defines the list for us, or computes the list as its result.
For example, consider a grading system that uses your quiz-percentage function from Homework 2. Rather than call it interactively, the system will want to use it to compute the quiz average for every student in the class. For each student, the system will query its database and get back a list of quiz grades, such as (60 50 40).
Or if we are adding up integers like above, we might ask the user for an upper bound and generate the list:
> (range 1 6)
'(1 2 3 4 5)
In this case, we have the list (1 2 3 4 5) as a data value. How do we add up its members? The standard call to + won't work:
> (+ (range 1 6))
+: contract violation
  expected: number?
  given: '(1 2 3 4 5)
This isn't the answer we are going for.
We encounter this sort of situation occasionally in functional programming: we call a function to compute a set of values, which we then want to pass as the arguments to another function. Racket provides a primitive function, apply, that helps us solve the problem:
> (apply + (range 1 6))
15
> (apply + '(1 2 3 4 5))
15
apply takes two arguments: a function and a list of values. It returns as its value the result of applying the function with all the values in the list as its individual arguments.
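The same idea works for the list of quiz grades mentioned earlier, and for any function that accepts a variable number of arguments. This is a sketch; the name quiz-grades is an assumption that stands in for whatever the grading system's database query returns.

```racket
#lang racket

;; A sample list of quiz grades, as a database query might return it.
(define quiz-grades '(60 50 40))

(apply + quiz-grades)     ; => 150, the same as (+ 60 50 40)
(apply max quiz-grades)   ; => 60,  the same as (max 60 50 40)
(apply min quiz-grades)   ; => 40,  the same as (min 60 50 40)
```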
range can produce many different sequences of numbers:
> (range 4 8)
'(4 5 6 7)
> (range 1 100)
'(1 2 3 4 5 [...] 96 97 98 99)
> (range 100)
'(0 1 2 3 4 [...] 96 97 98 99)
... and we can use apply to operate on these values as arguments, along with any operator that takes any number of arguments:
> (apply + (range 4 8))
22
> (apply * (range 4 8))
840
> (apply + (range 1 100))
4950
> (apply + (range 1 10000))
49995000
> (apply + (range 1 1000000))
499999500000
> (apply * (range 1 1001))     ; 1000!
40238726007709377354370243392300398571937486...
Notice that we pass a function to apply just as we pass any other argument: by giving its name in the argument list. We can, of course, also pass a lambda expression directly, as a function literal. For example:
> (apply (lambda (x y) (/ (+ x y) 2.0)) '(1 2))
1.5
Just now, I computed 1000! without a loop or recursion. apply allows us to express a high-level operation: Multiply all of these numbers. We don't need a loop! We don't need recursion. This gives us a hint of what we mean when we talk about "functional programming".
We now have two "function calling patterns" available to us. We can call the function directly by using it as an operator:
(+ 1 2 3 4 5 ...)
Or we can call it indirectly by passing its value to another function, which calls our function for us:
(apply + '(1 2 3 4 5 ...))
Abstracting out different function-calling patterns is another key feature of functional programming. We don't have to repeat control structures that occur frequently: we can turn them into functions.
Sometimes, we want to apply a function to every item in a list individually. For example, I might want to square every integer in a list:
(1 2 3 4 5) → (1 4 9 16 25)
The Racket primitive map can do the trick. It takes two arguments, a function and a list, and calls the function with each member of the list one at a time. As its value, map returns a list of its answers. So:
> (map square '(1 2 3 4 5))
'(1 4 9 16 25)
This can be handy in a wide range of problems. For instance, imagine that I have a list of student records, called list-of-students:
(define list-of-students
  '((jerry  3.7 4.0 3.3 3.3 3.0 4.0)
    (elaine 4.0 3.7 3.7 3.0 3.3 3.7)
    (george 3.3 3.3 3.3 3.3 3.7 1.0)
    (cosmo  2.0 2.0 2.3 3.7 2.0 4.0)))
... and I want to compute the GPA of each of the students. If we had a function named average like + and *, that took any number of numeric arguments and returned the average of those numbers, we could compute the grade for a student like this:
> (average 3.7 4.0 3.3 3.3 3.0 4.0)
3.5500000000000003
Racket does not provide average as a primitive, so I wrote one. I include its definition at the bottom of the source file for today's lecture; we will learn how it works next time.
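If you would like something to experiment with in the meantime, here is one possible way to write such a function. This is only a sketch and may differ from the definition in the lecture's source file. It relies on a feature we have not covered yet: when the parameter list of a lambda is a bare name rather than a parenthesized list, Racket collects all of the arguments into a single list. It also assumes at least one argument; with none, it would divide by zero.

```racket
#lang racket

;; One possible average, not necessarily the lecture's definition.
;; The bare parameter name numbers receives all arguments as one list.
(define average
  (lambda numbers
    (/ (apply + numbers) (length numbers))))

(average 1 2 3)       ; => 2
(average 60 50 40)    ; => 50
```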
With average in hand, we can write a function that uses it to compute the GPA of an individual student:
(define compute-gpa (lambda (student) (apply average (rest student))))
Do we see why we need to use apply? And rest?
Now I need to compute-gpa for each student. In a procedural programming style, you would now immediately think, "I need to write a for loop". You would advance through the student list one by one and call compute-gpa on each pass. But in Racket you would use map:
> (map compute-gpa list-of-students)
'(3.5500000000000003 3.5666666666666664 2.983333333333333 2.6666666666666665)
A call to map returns as its value a list of the same length as its second argument. The new list contains the results of applying the function to each item in the list of values.
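Just as with apply, the function we hand to map can be a lambda expression written on the spot. The sketch below also checks the claim about lengths; the name doubled is introduced only for this example.

```racket
#lang racket

;; Double every number in the list, using a function literal.
(define doubled
  (map (lambda (n) (* 2 n)) '(1 2 3 4 5)))

doubled                                       ; => '(2 4 6 8 10)

;; The result has the same length as the list we passed in.
(= (length doubled) (length '(1 2 3 4 5)))    ; => #t
```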
We can also use map with computed arguments, as we did with apply above. We can compute the square of all the items in a list of numbers by:
> (map square (range 1 11))
'(1 4 9 16 25 36 49 64 81 100)
> (map square (range 1 1001))
'(1 4 9 16 [...] 996004 998001 1000000)
And if we needed a function to compute the sum of the squares of an arbitrary list of numbers, we could use this map expression in conjunction with an apply:
(define sum-of-squares*          ;; takes a list, not just two
  (lambda (list-of-numbers)
    (apply + (map square list-of-numbers))))
> (sum-of-squares* (range 1 4))
14
> (sum-of-squares* (range 1 1001))
333833500
map allows us to express a high-level operation: Square all of these numbers. We don't need a loop! This is the spirit of functional programming.
This should give you a hint of how functional programmers can get along so well without having looping constructs. Writing map and apply expressions is faster and less error-prone than the equivalent loop code (or recursive code). As such, these primitives make programmers much more productive. Sometimes, we will write our own map-like functions that allow us to get by without loops in other situations.
We'll use map whenever we can this semester. We won't use apply as often in this course, but when we do need it, it will seem indispensable!
So... We can pass a function as an argument to another function. What about returning a function as the result of a computation?
At first, you might ask, "Why would we even want to?" The idea may seem strange to you because your programming experience has been rather limited.
Homework 2 gives us a few hints of when we might want to write a function that produces a function as its value. Consider Problem 4's candy-temperature function. I do all of my cooking in Cedar Falls, Iowa, which is roughly 298m above sea level. It doesn't make a lot of sense that I have to send 977.69 feet as the second argument every time I want to convert a candy recipe. But I do. Wouldn't it be nice if I could pass the 977.69 to a function and have that function produce a candy temperature function tailored to Cedar Falls?
Or consider Problem 5. In an engineering setting, we generally work with a fixed tolerance. For example, when machining a part, the difference between a part's actual width and its expected width can never be greater than 0.01 inches. Wouldn't it be nice if we could pass a tolerance to a function and have that function produce the corresponding in-range? function for us?
In a functional programming language such as Racket you can. I hope that you will soon find this to be a natural idea.
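Here is a sketch of the tolerance idea. The names make-within-tolerance and within-hundredth? are invented for this example, and the argument order is an assumption; the in-range? function from Homework 2 may take its arguments differently.

```racket
#lang racket

;; make-within-tolerance is a hypothetical function that takes a tolerance
;; and returns a new two-argument function with that tolerance built in.
(define make-within-tolerance
  (lambda (tolerance)
    (lambda (actual expected)
      (<= (abs (- actual expected)) tolerance))))

;; A checker tailored to a tolerance of 0.01 inches.
(define within-hundredth?
  (make-within-tolerance 0.01))

(within-hundredth? 2.505 2.5)   ; => #t, off by about 0.005
(within-hundredth? 2.52 2.5)    ; => #f, off by about 0.02
```

The inner lambda is the value that make-within-tolerance returns. We name that value with define just as we would name a number, and then call it like any other function.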
Finally, consider a "real world" application in which we might want to return a function as the value of another function: self-verifying numbers. You encounter this problem and solution every day out in the world whenever you use a credit card number, an ISBN, or a UPC code. The solution shows us the use of first-class functions in a natural way. Read this section on self-verifying numbers to learn a bit about how we are able to know that made-up credit card numbers aren't legal -- and how a Racket function that returns a function helps solve the problem!
The self-verifying number scenario demonstrates that, in Racket, we can both pass functions as arguments and return functions as values. So, functions are first-class objects. Indeed, we will find that these capabilities will make us much more productive programmers. You may be surprised at how often we will take advantage of this flexibility as we build language interpreters and other programs this semester.
Keep in mind that what you have just learned is not limited to Racket; other languages support these techniques. The important idea here from a programming languages perspective is that, to a program translator, functions can be values just like any other. They can be supported and manipulated in all the same ways as other data types. We have merely used Racket as a vehicle to learn this idea.
Other languages now have lambda expressions, too, as well as some of the language features we are studying. For example, this file demonstrates lambda in Python. Racket's way of creating and using functions gives us a lot more flexibility than you will find in most mainstream languages, with a more consistent syntax to boot.
(If you would like to go deeper with functional programming in Python, check out this page for some examples. Be warned: it uses some Python you may not have seen. And we haven't even gone that far in Racket yet!)
Here are some questions for review and entertainment! Don't be surprised to see problems of this sort on Quiz 1.
> (define five (lambda () 5))
> (five)
5
> five
#[procedure five]
> (define four 4)
> four
4
> (four)
procedure application: expected procedure, given: 4 (no arguments)
Explain each result in the above transcript. In particular, focus on the difference between four and five.