Draw box and pointer diagrams for the following data objects:
Before drawing your diagrams:
How many boxes does each picture require?
After drawing your diagrams: Is either of these objects a list?
We are still discussing the three things that every language has -- primitive expressions, means of combination, and means of abstraction -- and what kinds of features Racket offers in each category. Last session, we discussed Racket's aggregate data types, in particular the pair and its abstraction, the list. This session, we discuss functions, a powerful means of behavioral abstraction. You will be able to take what you saw of Racket's data structures, apply it to the creation of functions, and begin to program in a new way.
Before we proceed...
We first learned about modules in Session 3 when we saw the built-in Racket module rackunit. We can also create our own modules and export definitions. Take a look at this simple example of creating and using a user-defined module.
(Racket also provides a module special form for creating modules with many other features. We probably won't use this feature in class this semester, but you can read about it in the Racket Guide.)
Now, on with the show.
We have identified several important things that appear in any programming language:
Until now, we have discussed abstraction only in terms of naming values, and the ability to refer to objects by their names. Last week, we discussed how we could name data values, and how we even do this in real life, with numbers such as π and e. We do have a few more things we can discuss about data, such as indirect access to objects. We will return to data abstraction in a few weeks.
Now we explore behavioral abstraction. Depending on the language we study, we might call these abstractions subprograms, procedures, or functions. At its core is a simple idea. We have some computation that we would like to do, such as computing the circumference of a circle:
(* 2 pi 1)
(* 2 pi 10)
(* 2 pi 14.1)
We see an abstract operation here. We always multiply 2 and π by the radius of the circle. So we create an abstraction by keeping constant what is constant (the operation, the 2, and π) and turning what changes (the radius) into a parameter that can be specified each time we use the abstract operation.
In Python, we create and name our abstract operation using def:
def circumference(radius): return 2 * pi * radius
Most languages work the same way: we create and name a function using a single language primitive. This combination limits how and when programmers can create and use functions. But in Racket and many other high-level languages, functions are objects that
In Racket, naming things and creating functions are both so important that they are different actions. Separating these two features turns out to be an extremely valuable tool for abstracting away the details of a computation. Racket's own built-in operators even work this way! We hinted at this important idea last week when we said that "* evaluates to a function that does multiplication". This means that * is really just a name for something that performs multiplication.
The idea that a process can exist independent of a name isn't as surprising as it sounds, because you have been dealing with this concept for many years -- even before you began your study of computer science.
Think about finding the square root of a number. There are many ways to perform the operation. In some situations, we may be referring to a particular method (say, use Newton's method or use a calculator). In others, we might not care and be free to use any process we know. Each of these processes exists independent of any particular name, but we can think of each of them as "take the square root". The same is true of sorting a pile of papers, or any other operation.
This leads us to several related ideas:
To name a function in Racket, we can use our old friend, define, to do what it has already been useful for: naming things. To create a function, we use the special form lambda:
(lambda (radius) (* 2 pi radius))
The term lambda comes from the lambda calculus of Alonzo Church. The lambda calculus is one of the theoretical foundations of computer science.
If we were studying many other languages, we would be nearly done with this topic. But because we are discussing functional programming in Racket, you will find that our consideration of procedural abstraction can go much longer and deeper. In fact, we have hardly begun to cover this important topic at all!
The need for procedural abstraction is pretty easy to come by. Think back to our opening exercise last time. In that problem, we want to determine the letter grade for a single person, but as your instructor, I need to assign letter grades for 35 students in the class. To do so, I would have to re-bind favorite-student to a new value and then re-evaluate the cond or if expression that we wrote 34 more times. Copy and paste works wonders sometimes, but isn't there a more abstract idea at play here?
Important insight from computer science: Any process that repeats, whether in our programs or in our daily lives, is usually pointing us to an abstraction.
lambda is a special form that takes two arguments. (Do you remember what a special form is?) The first is an expression that specifies the function's formal parameters. The second is an expression that uses the parameters to specify an operation to compute. For example, we could write:
> (lambda (x) (* x x))
#<procedure>
When evaluated, a lambda expression returns a new procedure. The result you see, #<procedure>, is DrRacket's way of printing a compiled function. You will see something similar by evaluating the name of any of Racket's built-in functions:
> +
#<procedure:+>
The built-in function + is printed out in much the same way as the result of our lambda expression -- except that the print form indicates the function is the primitive named +. Racket treats all function objects, whether user-defined or built-in, in pretty much the same way. Print form is one of the few distinctions.
Because the lambda expression evaluates to a function, we can use it as an operator to construct a compound expression:
> ( (lambda (x) (* x x)) 5 )
25
To find out what's going on here, let's return to the evaluation algorithm that we discussed last week. This is a compound expression, so we first evaluate each of the subexpressions. Racket evaluates the subexpressions from left to right (standard Scheme leaves the order of the arguments unspecified), and in any case it must deal with the operator first, in order to determine whether it is a special form or a function.
When we evaluate (lambda (x) (* x x)), we get a function. This particular function takes a single argument, x. When called, this function performs the operation defined by (* x x), using the value passed in for x.
Because the operator is a function, all of the remaining subexpressions are evaluated and passed to it.
The second subexpression, 5, is easy, because it is a number literal. Literals evaluate to themselves.
Next, we evaluate the expression as a whole. 5 is passed to the function as an argument, and replaces x wherever it appears in the body of the procedure. This produces the expression (* 5 5), which is then evaluated to produce a value of 25.
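We can check this substitution ourselves. The sketch below lists the three stages of the evaluation just described, one per line; each line evaluates to the same value.

```racket
#lang racket

;; Stage 1: the original compound expression.
((lambda (x) (* x x)) 5)

;; Stage 2: the body of the function, with 5 substituted for x.
(* 5 5)

;; Stage 3: the final value.
25
```

Running the file prints 25 three times, once for each stage.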
Quick Exercise: What happens if we pass two arguments to this lambda expression? Or none?
> ((lambda (x) (* x x)))
#<procedure>: expects 1 argument, given 0
> ((lambda (x) (* x x)) 2 3)
#<procedure>: expects 1 argument, given 2: 2 3
These are type errors. We often think of type checking in terms of the types of values passed in as parameters, but the number of parameters is a part of the function's type. This function takes exactly one argument. If we pass it zero, or two, Racket's strong type checking catches it at run-time.
If we would like to apply this function to a different value, then we use it to construct another compound expression:
> ( (lambda (x) (* x x)) 7 )
49
That's not very convenient. We have to write (lambda (x) (* x x)) each time we want to apply the operation to a number. If we only need to use the function once or twice, we may be happy to do so. But if we'd like to use it more often, we can assign a name to the value returned by the lambda expression:
(define square (lambda (x) (* x x)))
This says that we want to associate the name square with a function value. Note that there is no difference between giving a name to a data object such as the number 5 and giving a name to a function object. Functions are data objects!
Having defined square, we use it just as we would any Racket function:
> (square 7)
49
> (square (+ 3 3))
36
> (+ 5 (square 3))
14
The second example shows that the argument to our function can be any expression, just as with any other operator. The third example shows that the value returned by our function can be used as an argument to another function. Notice that there is no difference in the way built-in operators and user-defined operators are used or applied. This simple syntax is one of the things that makes Racket such a powerful language.
We can also use square as part of other function definitions. For example, x² + y² can be expressed as (+ (square x) (square y)). We can define a function, sum-of-squares, that takes any two numbers as arguments and produces the sum of their squares:
(define sum-of-squares (lambda (x y) (+ (square x) (square y))))
The function sum-of-squares calls the procedure square twice, once for each of its parameters, x and y, and returns the sum of the results. Having defined sum-of-squares, we can use it just like square or any other function.
Note that using the formal parameter name x in both procedures does not create a conflict, because each function evaluation takes place in its own context, where x is bound to a specific value. This is no different from using the same parameter name in two Java methods or two Python functions.
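Here is a short sketch that puts the two definitions from above in one file and calls sum-of-squares. The argument values 3 and 4 are my own choice for illustration.

```racket
#lang racket

;; square and sum-of-squares, as defined in the text above.
(define square
  (lambda (x) (* x x)))

(define sum-of-squares
  (lambda (x y)
    (+ (square x) (square y))))

;; Each call to square binds its own x, independent of the x
;; that belongs to sum-of-squares.
(sum-of-squares 3 4)   ; => 25
```

Inside sum-of-squares, x is 3. In the second call to square, x is 4. Neither binding disturbs the other.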
It should be clear to you why named functions are so powerful: they allow us to hide details and solve problems at a higher level of abstraction.
Just to make sure that the notion of functions is clear, take a look at another example:
> (define radius 5)
> (define pi 3.14159)
> (define circumference (* 2 pi radius))
> circumference
31.4159
> (define circumference (lambda (radius) (* 2 pi radius)))
> (circumference 6)
37.699079999999995
> (circumference 5)
31.4159
> (circumference radius)
31.4159
Do some practice on your own with these exercises that I designed just for you!
We say that a data object is first class when it is treated by the language in the same way that all other data objects are treated. We have now seen a couple of ways that functions are treated like other objects: we can use them as literal values, and we can name them in the same way we name any other data object.
In Racket, as in most languages you know, data values such as numbers and strings can be passed to functions as arguments and returned from functions as answers. In order for functions to be first class data values, we must be able to pass a function as an argument and be able to return a function as the value of applying a function. In Racket, we can.
First, let's consider passing one function as an argument to another function. Racket provides several primitive functions that expect function arguments. We can also write our own functions that take functions as arguments.
Suppose that you would like to add up a list of values. Easy, right?
> (+ 1 2 3 4 5)
15
But what if you don't know what the list of values is? Perhaps another function defines the list for us, or computes the list as its result. So we might have (1 2 3 4 5) as a data value. How do we add up its members? The standard call to + won't work:
> (+ '(1 2 3 4 5))
+: expects argument of type <number>; given (1 2 3 4 5)
> (define list-of-numbers '(1 2 3 4 5))
> (+ list-of-numbers)
+: expects argument of type <number>; given (1 2 3 4 5)
This isn't the answer we are going for.
We encounter this sort of situation occasionally in functional programming: we call a function to compute a set of values, which we then want to pass as the arguments to another function. Racket provides a primitive function, apply, that allows us to solve the problem:
> (apply + '(1 2 3 4 5))
15
apply takes two arguments: a function and a list of values. It returns as its value the result of applying the function with all the values in the list as its individual arguments.
Of course, we wouldn't usually call apply if we knew the list argument as a literal. But suppose that we had a function that could produce sequences of numbers:
> (sequence 4 8)
'(4 5 6 7 8)
> (sequence 1 100)
'(1 2 3 4 5 [...] 96 97 98 99 100)
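The function sequence is not built into Racket. One plausible way to define it, assuming it returns the integers from its first argument through its second, inclusive, uses Racket's range function. Treat this as a sketch rather than the definition used in class.

```racket
#lang racket

;; Hypothetical definition of sequence: the integers from start
;; through end, inclusive. range stops just before its second
;; argument, so we pass (+ end 1).
(define sequence
  (lambda (start end)
    (range start (+ end 1))))

(sequence 4 8)   ; => '(4 5 6 7 8)
```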
We can use apply to operate on these values as arguments:
> (apply * (sequence 4 8))
6720
> (apply + (sequence 1 100))
5050
> (apply + (sequence 1 10000))
50005000
> (apply + (sequence 1 1000000))
500000500000
> (apply * (sequence 1 1000)) ; 1000!
40238726007709377354370243392300398571937486...
Notice that we pass a function to apply just as we pass any other argument: by giving its name in the argument list. We can, of course, also pass a lambda expression directly, as a function literal. For example:
> (apply (lambda (x y) (/ (+ x y) 2.0)) '(1 2))
1.5
apply allows us to express a high-level operation: Add all of these numbers. We don't need a loop! This gives a hint of what we mean when we talk about "functional programming".
An even more useful Racket primitive, map, takes a function as an argument. Sometimes, we want to apply a function to every item in a list individually. For instance, suppose I have a list of student records, called list-of-students:
(define list-of-students
  '((jerry  3.7 4.0 3.3 3.3 3.0 4.0)
    (elaine 4.0 3.7 3.7 2.0 3.3 3.7)
    (george 3.3 3.3 3.3 3.3 3.7 1.0)
    (cosmo  2.0 2.0 2.3 3.7 2.0 4.0)))
... and I want to compute the GPA of each of the students. Suppose that we have a function named average that takes any number of numeric arguments and returns the average of those numbers. (Racket does not provide average as a primitive, so I had to write it.)
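For the curious, here is one way such an average function might be written. It is only a sketch: it relies on a form of lambda we have not discussed yet, in which the parameter name is written without parentheses so that it is bound to a list of all the arguments, and it assumes that at least one argument is passed.

```racket
#lang racket

;; A possible definition of average (an assumption, not the
;; instructor's code). Because numbers is not wrapped in
;; parentheses, it is bound to a list of all the arguments.
(define average
  (lambda numbers
    (/ (apply + numbers) (length numbers))))

(average 2 4 6)   ; => 4
```

Calling this version with no arguments raises a division-by-zero error, because the list of arguments is empty.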
First, we write a function that uses average to compute the GPA of an individual student:
(define compute-gpa
  (lambda (student)
    (apply average (rest student))))   ; do we see why we use apply?
In a procedural programming style, you would now immediately think, "I need to write a loop". You would count through the student list and call compute-gpa for each student. But in Racket you would use map:
> (map compute-gpa list-of-students)
(3.5500000000000003 3.4 2.983333333333333 2.6666666666666665)
map takes two arguments, a function of one argument and a list of values. A call to map returns as its value a list of the same length. This list contains the results of applying the procedure to each item in the list of values.
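To see that there is no magic in map, here is a sketch of how a simplified, one-list version could be written with recursion. The name my-map is hypothetical, chosen so that we don't shadow the built-in, and Racket's real map also accepts more than one list.

```racket
#lang racket

;; my-map: a simplified stand-in for map that handles one list.
;; An empty list maps to an empty list. Otherwise, apply f to the
;; first item and cons the result onto the mapped rest of the list.
(define my-map
  (lambda (f items)
    (if (null? items)
        '()
        (cons (f (first items))
              (my-map f (rest items))))))

(my-map (lambda (x) (* x x)) '(1 2 3))   ; => '(1 4 9)
```

The result always has the same length as the input list, because each step adds exactly one item to the answer.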
We can also use map with computed arguments, as we did with apply above. We can compute the square of all the items in a list of numbers by:
> (map square (sequence 1 10))
(1 4 9 16 25 36 49 64 81 100)
> (map square (sequence 1 1000))
(1 4 9 16 [...] 996004 998001 1000000)
And if we needed a function to compute the sum of the squares of an arbitrary list of numbers, we could use this map expression in conjunction with an apply:
(define sum-of-squares-all   ;; takes a list, not just two
  (lambda (list-of-numbers)
    (apply + (map square list-of-numbers))))
> (sum-of-squares-all (sequence 1 3))
14
> (sum-of-squares-all (sequence 1 1000))
333833500
map allows us to express a high-level operation: Square all of these numbers. We don't need a loop! That is functional programming.
This should give you a hint of how functional programmers can get along so well without having looping constructs. Writing map and apply expressions is faster and less error-prone than the equivalent loop code (or recursive code). As such, these primitives make programmers much more productive. Sometimes, we will write our own map-like procedures that allow us to get by without loops in other situations.
We'll use map whenever we can this semester. We won't use apply as often in this course, but when we do need it, it will seem indispensable!
So... We can pass a function as an argument to another function. What about returning a function as the result of a computation?
At first, you might ask, "Why would we even want to?" The idea may seem strange to you because your programming experience has been rather limited.
Homework 2 gives us a few hints of when we might want to write a function that produces a function as its value. Consider Problem 3's candy-temperature function. I do all of my cooking in Cedar Falls, Iowa, which is roughly 298m above sea level. It doesn't make a lot of sense that I have to send 977.69 feet as the second argument every time I want to convert a candy recipe. But I do. Wouldn't it be nice if I could pass the 977.69 to a function and have that function produce a candy temperature function tailored to Cedar Falls?
Or consider Problem 5. In an engineering setting, we generally work with a fixed tolerance. For example, when machining a part, the difference between a part's actual width and its expected width can never be greater than 0.01 inches. Wouldn't it be nice if we could pass a tolerance to a function and have that function produce the corresponding in-range? function for us?
In a functional programming language such as Racket you can. I hope that you will soon find this to be a natural idea.
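As a preview, here is a minimal sketch of the tolerance idea. The names make-in-range? and close-enough? are hypothetical, and I am assuming a predicate that takes an actual value and an expected value; the in-range? from Homework 2 may take its arguments differently.

```racket
#lang racket

;; make-in-range? is a hypothetical helper, not part of Homework 2.
;; It takes a tolerance and returns a new two-argument predicate
;; that remembers that tolerance.
(define make-in-range?
  (lambda (tolerance)
    (lambda (actual expected)
      (<= (abs (- actual expected)) tolerance))))

;; A predicate specialized for a tolerance of 0.01 inches.
(define close-enough? (make-in-range? 0.01))

(close-enough? 2.005 2.0)   ; => #t
(close-enough? 2.05 2.0)    ; => #f
```

The outer lambda runs once, when we supply the tolerance. The inner lambda is the value it returns, and we can name it with define just like any other value.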
Finally, consider a "real world" application in which we might want to return a function as the value of another function: self-verifying numbers. You encounter this problem and solution every day out in the world whenever you use a credit card number, an ISBN, or a UPC code. The solution shows us the use of first-class functions in a natural way. Read this section on self-verifying numbers to learn a bit about how we are able to know that made-up credit card numbers aren't legal -- and how a Racket function that returns a function helps solve the problem!
The self-verifying number scenario demonstrates that, in Racket, we can both pass functions as arguments and return functions as values. So, functions are first-class objects. Indeed, we will find that these capabilities will make us much more productive programmers. You may be surprised at how often we will take advantage of this flexibility as we build language interpreters and other programs this semester.
Keep in mind that what you have just learned is not limited to Racket; other languages support these techniques. The important idea here from a programming languages perspective is that, to a program translator, functions can be values just like any other. They can be supported and manipulated in all the same ways as other data types. We have merely used Racket as a vehicle to learn this idea.
Other languages now have lambda expressions, too, as well as some of the language features we are studying. For example, this file demonstrates lambda in Python, and this page moves from that code to examples of functional programming in Python. Racket's way of creating and using functions gives us a lot more flexibility than you will find in most mainstream languages, with a more consistent syntax to boot.
Here are some questions for review and entertainment! Don't be surprised to see problems of this sort on Exam 1.
> (define five (lambda () 5))
> (five)
5
> five
#[procedure five]
> (define four 4)
> four
4
> (four)
procedure application: expected procedure, given: 4 (no arguments)
>
Explain each result in the above transcript. In particular, focus on the difference between four and five.