New languages come along all the time, sometimes with the sort of corporate oomph that helped Swift and Go become popular quickly.
Most of the reasons why languages are favorites are common from class to class: simplicity, utility, familiarity. This time around, several students mentioned that they like object-oriented programming or large libraries, and that their favorite language supports them well.
I suspect that Python's popularity derives much from how easy it is to whip up useful programs quickly. For example, to generate the tag cloud above from the list of languages and their counts, I needed a text file containing each language's name as often as it appeared in the survey. It took me only a minute or so to write this Python program to generate the text file for me. This combination of utility and programming speed stands out starkly in comparison to Java, C, and C++, the other languages most of you know. Languages like Ruby would fare as well in this regard.
My favorite 'favorite' comment is: "... because it is the only one I know". This is perfectly reasonable, and the only answer possible if you know only one language, or know only one language well. I hope that this course, and the others you take in our program, give every CS student the chance to make a reasoned choice from a much larger set of choices. Knowing more languages won't diminish your love for your favorite; it will give depth to your preference.
My favorite answer of all time to "Why is this language your favorite?" was "I liked the challenges we had to solve while I was learning Java." That says nothing about the language, of course, but it reminds me that solving fun problems is why so many of us like to program at all.
Now we know which programming languages we know. Let's begin to learn what we can know about programming languages and learn a new one as both a case study and a tool. The new language will help us to achieve the course objectives.
In this course, we will describe languages and their behavior in two ways:
We will also use a little mathematics, but we could certainly use even more to reason about programs and computation. That is not a major concern of this course, though.
Human language works well for introducing ideas because it lets us tell stories that people understand quickly. People are motivated by good stories. In this course, I'll try to give you simple, understandable English descriptions of programming languages and their features.
Why, then, do we need to use programs, or mathematics, or some other kind of language, to describe programming languages?
English allows too much ambiguity. For example, suppose that I tell you,
The meaning of a procedure call, f(E), is running f on E.
I have left many essential questions unanswered. Where do we find E? Do we evaluate the expression? Is E passed by value or by reference, or by some other protocol? We can ask many of the same questions about f as well. It might not even be clear what we mean when we say "running"!
We need a language with formal semantics in order to eliminate such ambiguities. For this reason, we will use computer programs as our main way of explaining what sentences in a language mean. In particular, we will use an interpreter, which takes a sentence and produces its behavior.
But wait a minute... Can a program be unclear as an explanation? It surely can! As you are first learning Racket this semester, you may well think that some of the Racket programs you read are unclear. Eventually, though, you will understand Racket better and not be confused by lack of experience.
Even when we speak a familiar programming language, a program can be unclear. Consider a few possibilities:
We will do our best to circumvent these sources of ambiguity. First, we do not use the first three in our interpreters. Second, we try to avoid the effects of the fourth through abstraction: the use of sub-programs and high-level data structures. Finally, we will try not to create so many sub-programs that we can't see meaningful behavior.
This is where functional programming and Racket come in handy as tools. Functional programming encourages decomposition into smallish procedures, with no assignment statements, and Racket's flexibility allows us to abstract in ways that other languages do not.
We begin our study of programming languages by learning Racket. This part of the course allows us to do three different things:
Let us begin.
What kinds of things must be present in every programming language?
Your answer to this question is almost certainly limited by the languages you know now, both the number and the variety. If you learn only one language, or one kind of language, then your perspective on what is essential will be determined by those experiences.
Every programming language has three kinds of things:
- primitives
- means of combination
- means of abstraction
... give examples of each in Python, using the program we saw earlier.
Some languages have more kinds of features than these, but they cover most of most languages.
Quick Exercise: Make a list of five features from a programming language you are familiar with. Categorize each item on your list as a primitive, a means of combination, a means of abstraction, or none. Where, for example, would you classify conditional statements such as if?
Notice that there are lots of things not on the list. For instance, if statements are not one of the three things that every language has. Do you know of any programming language without if statements? Probably not.
In CS1, you may have learned that, in order to implement the computations that programmers need to solve most problems, a language must support three kinds of control flow: sequence, selection, and repetition. So any complete programming language will offer programmers a way to make choices.
But there is a difference between being able to make a choice and having an if statement, or any explicit conditional statement, in a language.
Take, for example, Smalltalk. I use Smalltalk as an example occasionally throughout this course, because it is different from the languages you know in several ways. Smalltalk has no conditional statement, nor any special form for making selections. (We'll talk about the notion of a "special form" soon.) How can that be?
It turns out that Smalltalk has no statements or special forms for control flow of any kind. All Smalltalk has is message passing: objects sending messages to other objects. In this language, True and False are objects. They respond to messages like any other object. Of course, they respond to the same messages, only in different ways.
This gives programmers the ability to make decisions in a program. When I send a message to True, it behaves one way (If you are true, ...). When I send the same message to False, it behaves another way (If you are false, ...). So Smalltalk does provide the ability to make choices, but not through a conditional statement. There is no place in the Smalltalk compiler where you can find the behavior for "if...".
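To make this idea concrete, here is a small sketch in Racket, the language we are about to learn. It is not Smalltalk code, only a model of the same idea. The names true-object, false-object, and describe are hypothetical, invented for this illustration. Each "boolean" is a procedure that receives two zero-argument procedures and runs one of them. Don't worry about the syntax yet; just notice that no conditional appears anywhere.

```racket
#lang racket

;; A sketch of choice without a conditional statement.
;; All names here are hypothetical, invented for illustration.

;; Each "boolean" responds to the same message (two thunks),
;; but each responds in its own way.
(define (true-object if-true if-false)  (if-true))
(define (false-object if-true if-false) (if-false))

;; No if or cond appears anywhere: the boolean object itself decides.
(define (describe boolean-object)
  (boolean-object (lambda () "it was true")
                  (lambda () "it was false")))

(describe true-object)    ; evaluates to "it was true"
(describe false-object)   ; evaluates to "it was false"
```

The decision lives in the objects, not in the language's syntax, which is roughly how Smalltalk's True and False behave.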
Your view of what a programming language must have should be changing already.
Likewise, a language need not have a looping statement such as for or while. How so? As Python and Java programmers learn, a collection of objects can know how to return a subset that meets some condition, return a sorted version of itself, or determine whether a particular kind of object exists. The language still has for or while statements, but programmers don't use them as much as they do in languages such as C and Ada. We can imagine them disappearing entirely.
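As a rough illustration, here is how those three collection behaviors might look in Racket, with no for or while in sight. The list scores and the helper passing? are made up for this example; filter, sort, and ormap are standard Racket procedures.

```racket
#lang racket

;; Collection operations that stand in for explicit loops.
;; The list named scores is made up for this example.
(define scores '(72 95 88 61 100))

(define (passing? n) (>= n 70))   ; hypothetical helper

(filter passing? scores)          ; the subset that meets a condition
(sort scores <)                   ; a sorted version of the list
(ormap passing? scores)           ; does any passing score exist?
```

Each expression asks the collection-processing procedure to do the traversal, so the programmer never writes the loop.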
Why does it help us to know that some things must be present in a language, but others need not be? It helps us to know what to look for. If I think that a programming language must have an if statement, then when I encounter a language that doesn't, I may become disoriented, disappointed, or even angry. None of those emotions help me to learn the new language, and they may well close my mind to learning something useful.
Whenever you are faced with the task of learning a new language, first try determining what is primitive in the language, how things get combined, and how details are abstracted away. This will give you a framework to guide your task. Over the next several sessions, we will be learning Racket. Our efforts will be guided by this framework.
Learn a new language and get a new soul.
-- Czech proverb
(... you may even get a new personality!)
Racket has two kinds of primitive expressions, both of which will be quite familiar to you:
The set of values that a variable can hold includes the values that can be expressed literally, as well as higher-order objects. The most surprising of these to you may be procedures.
These primitive expressions can be combined to create more complex expressions. There is exactly one means of combination: the operator application. In Racket, all non-primitive expressions have the following features:
The syntax of an application expression is:
(<operator> <operand1> <operand2> <operand3> ...)
Most of the operators we use are functions. To evaluate a compound expression built with a function, we evaluate each of the operands and pass their values to the function, which then produces the value of the expression.
Note that there are no "statements" in Racket. Every combination is an expression formed like any other. Nearly every such expression has a value. This is an important difference from most other languages you know, because we will use these values to drive our programming in Racket.
Finally, Racket has a number of mechanisms for abstraction. For now, we will focus on just one: the ability to give a name to a value. We create names in the same way we create any compound expression, using the operator define:
(define <name> <expression>)
When we define something, we're telling Racket to substitute a value (perhaps created by a computation) wherever it sees the identifier. For example,
(define upper 10)
Thereafter, wherever the Racket interpreter sees upper, it will replace it with the value 10.
The <expression> is evaluated and associated with the <name>. We evaluate a definition for its side effect -- the naming of a value -- and not for its own value.
For the most part, definition is the only form of side effect we will use. It is how we name our programs, the top-level data used by our programs, and the data we use for testing.
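Here is a minimal sketch of definitions at work. The name doubled is an arbitrary example, not something we will use later. Note that the expression in a definition is evaluated before the name is attached to its value.

```racket
#lang racket

;; Naming values with define. The names are arbitrary examples.
(define upper 10)                 ; name a literal value
(define doubled (* 2 upper))      ; the expression is evaluated first: 20

upper                             ; evaluates to 10
(+ upper doubled)                 ; evaluates to 30
```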
Notice that define is not a function, because it does not evaluate all of its arguments. The first argument is taken literally, as the symbol to be used as the name. In Racket, we call an operator that does not evaluate all of its arguments before executing a special form. In a session or two, we will study another important special form, lambda, which creates procedures.
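The difference between a function and a special form is easy to see with if, another special form. In this sketch, my-if is a hypothetical function that I wrote only for contrast. Because it is a function, all of its operands are evaluated before it runs, including the one that divides by zero. The with-handlers form catches the resulting error so that the example runs to completion.

```racket
#lang racket

;; my-if is a hypothetical ordinary function, written only to
;; contrast with the built-in special form if.
(define (my-if test then-value else-value)
  (if test then-value else-value))

;; The special form evaluates only the branch it needs,
;; so the division by zero is never attempted.
(if #t 'safe (/ 1 0))                  ; evaluates to the symbol safe

;; A function evaluates every operand before it runs,
;; so the same call through my-if raises an error.
(with-handlers ([exn:fail? (lambda (e) 'error-raised)])
  (my-if #t 'safe (/ 1 0)))            ; evaluates to the symbol error-raised
```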
In addition to define and lambda, Racket offers a small core of essential special forms:
We will study all of these in due time. We will also see that two of these forms are not strictly necessary, because Racket has another way to implement the same behavior.
Every other operator in Racket is either a function or a syntactic abstraction, something that is defined in terms of something else. We will also study the idea of syntactic abstraction in some detail later in the course.
Let's explore some examples of Racket expressions, using our Dr. Racket interpreter. This will give us a chance to learn a bit about how Racket works and also get to know our programming environment better.
Some things to pay attention to in Dr. Racket:
- using the Interactions window and the Definitions window
- working with files: open, execute, and evaluate
Primitives and Simple Expressions.
When we enter some expression at the prompt, for instance, a number, Dr. Racket will print the result of evaluating the number. Because numbers are numeric literals, the value is the same as the expression.
> 25                ;; a number
25
> 1.2               ;; handles integers and floats in the same way
1.2
> #t                ;; a boolean ... also #f
#t
> #\a               ;; a character ... we won't use these much
#\a
> "Eugene"          ;; a string ... ditto
"Eugene"
> 'a                ;; a symbol -- both identifier and value ...
a                   ;; we use these a lot as data!  Notice the quote.
> (quote a)         ;; quote is a special form.  ' is shorthand
a
> 'a-symbol         ;; a symbol -- Racket has fewer constraints on what
a-symbol            ;; can be a symbol than most other languages
> '123->321         ;; see what I mean?
123->321
Here we see an important behavior of the Racket interpreter: it evaluates every expression it reads. Primitive objects evaluate to themselves, and the "print form" of the object (what we see in an answer) is usually the same as the form we write. Those of you who know Python have probably used an interpreter such as IDLE and seen similar behavior.
What happens if we evaluate the symbol a, an "a" without the character escape sequence and without the quote?
> a
a: undefined; cannot reference undefined identifier
> (define a 5)
> a
5
> a
5
Symbols serve as identifiers in Racket. That is, a Racket symbol can name a value. We will talk quite a bit more about definitions, identifiers, and values soon.
Some identifiers have values when we first start a Racket session. Watch this:
> min
#<procedure:min>             ; a primitive procedure on numbers
> not
#<procedure:not>             ; a primitive procedure on booleans
> string-length
#<procedure:string-length>   ; a primitive procedure on strings
> list
#<procedure:mlist>           ; a primitive procedure on any args
> +
#<procedure:+>               ; even + is a procedure!
These are some of Racket's primitive procedures, the built-in behaviors provided by the language.
Racket procedures are named by symbols, just like the variable names we would use in Java for ints and objects. That is correct: all Racket procedures, even the primitive operation for adding numbers. This turns out to be a remarkably powerful and useful idea, one that we'll come back to later.
Combinations and Compound Expressions.
We combine primitive objects with operators to form more complex expressions. As noted above, Racket's only mechanism for building compound expressions is the prefix expression. A compound expression is always enclosed in parentheses. This is probably different from your experience with other programming languages, where parentheses are usually optional. Keep this in mind always:
Random programming by inserting or deleting parentheses will get you nowhere, more so and faster than in other languages. Think about what you want to say, and ask questions if you don't know how to make it work.
Here are some more examples from Dr. Racket:
> (* 2 2)
4
> (- 4 2)
2
> (+ 3 5.2)         ; handles integers and floats with equanimity
8.2
> (/ 4 2)
2
> (/ 1 3)           ; and rationals are numbers, too!
1/3
> (- -3 -5)
2
What happens if we insert a pair of parentheses somewhere?
There are several important points for you to note about this Racket session. First, note that the leftmost element in a compound expression is the operator, followed by the operands. The Racket evaluator determines the value of the expression by applying the procedure specified by the operator to the values specified by the operands.
That last sentence is extremely important, and more complicated than you might think at first, so make sure you understand what it says.
Notice that in the last expression, the "-" occurs three times and means two different things. When it's the leftmost element in the expression, it represents the operation that is to take place; when it's attached to the front of a number, it means that the number is negative. Spacing is important here: (- - 3 -5) would produce an error:
> (- - 3 -5)
-: contract violation
  expected: number?
  given: #<procedure:->
  argument position: 1st
  other arguments...:
   3
   -5
This points out a feature of Racket we will talk more about soon: we are allowed to pass procedures as arguments to other procedures! (But not to -.)
Another thing that you will note is that Racket's procedures for numbers accept both integers and real numbers without any explicit type coercion or casting. The result of adding 3 to 5.2 is 8.2. That's what most people would say, too. The result of dividing 1 by 3 is 1/3, a fraction, or a rational number. This, too, is obvious to people with no programming experience. Rational numbers are a data type in Racket.
Other computer languages usually make all sorts of distinctions among different types of numbers, but those distinctions are driven by the implementation of the language and processors for it, and not by our understanding of numbers.
Somehow, we programmers become conditioned by our languages into thinking that these distinctions are necessary. They are not. Racket characterizes numbers as exact or inexact and makes distinctions in behavior driven by this mathematical idea.
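A few expressions show the exact/inexact distinction in action. This is only a quick sketch using standard Racket procedures (exact?, inexact?, and exact->inexact); try them at the prompt yourself.

```racket
#lang racket

(exact? 1/3)              ; #t: a rational is an exact number
(exact? (/ 1 3))          ; #t: dividing exact integers stays exact
(inexact? (+ 3 5.2))      ; #t: mixing in 5.2 makes the sum inexact
(exact->inexact 1/4)      ; 0.25, the inexact form of the same value
(* 3 1/3)                 ; 1: exact arithmetic, no rounding error
```

The last expression is the interesting one: because 1/3 is exact, multiplying it by 3 gives exactly 1.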
If you need one more example of how Racket hides implementation details about its numbers, execute this Racket program. Try that in Java [ loop or recursive ], Python or Ada!! With a different version of the program, we can squeeze a few more digits out. Check out this file for some sample runs.
Note, too, that while Racket has no for or while loops, it does quite nicely with a deeply recursive program, thank you. You will learn the "magic" that makes this possible in just a few weeks.
NOTE: These programs, along with the sample interactions, are available in the .zip file for today's session notes. I'll bundle up a .zip file of code for you for most class sessions. Be sure to download the code, study it, run it, and modify it. That's the best way to learn the ideas we are studying!
A common question I receive is "Why does Racket use prefix notation?" Students usually already know infix notation for operators, so prefix notation seems like an unnecessary complication.
Prefix notation has several advantages over other notations, such as infix and postfix. Two stand out:
   3
   4
   8
   6
   5
   4
   7
   6
   5
   8
 + 9
 ---
  65
Using prefix notation, however, the issues of precedence are clear without anyone memorizing precedence rules. The above example would be written (- (+ 3 (/ (* 4 5) 6)) 7) in prefix notation.
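If the nested expression looks opaque, it may help to build it from the inside out. The sketch below evaluates each sub-expression in turn, and also shows the column of numbers from above summed by a single application of +. These are just the expressions from the text, entered as a program.

```racket
#lang racket

;; One operator, many operands: no precedence rules required.
(+ 3 4 8 6 5 4 7 6 5 8 9)        ; 65

;; Nesting makes the order of evaluation explicit.
(* 4 5)                           ; 20
(/ (* 4 5) 6)                     ; 10/3
(+ 3 (/ (* 4 5) 6))               ; 19/3
(- (+ 3 (/ (* 4 5) 6)) 7)         ; -2/3
```

Every answer is exact, so the final result is the rational number -2/3, not a rounded decimal.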
Now, you may be saying to yourself at this point, "This expression isn't clear at all". But it is; it simply requires attention to different details than you are used to. I think you will find your comfort level with such expressions is mostly a matter of exposure. You will be as comfortable with this system as any other after you use it for a while.
There is another reason in the case of Racket, though. Recall: In Racket, arithmetic operators, and all the other operators you are used to, are functions! Using prefix notation for them is simply being consistent. This will give us some unexpected benefits later, when we decide to swap out an operator and replace it with a function or a different operator. In those cases, having predefined operators and user-defined functions be interchangeable will be a huge win.
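Here is a small taste of that interchangeability. Both combine and average-of-two are hypothetical names I made up for this sketch. The point is that combine does not care whether it receives a built-in operator or a function we wrote ourselves.

```racket
#lang racket

;; combine is a hypothetical helper: it applies whatever
;; two-argument procedure it is given.
(define (combine operator a b)
  (operator a b))

;; average-of-two is a user-defined function, also hypothetical.
(define (average-of-two a b)
  (/ (+ a b) 2))

(combine + 6 4)                ; 10
(combine * 6 4)                ; 24
(combine average-of-two 6 4)   ; 5
(apply + '(1 2 3 4))           ; 10
```

Swapping + for average-of-two requires no change to combine, because both are simply procedures named by symbols.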