In Session 2 and Session 3, we played a couple of games with numbers as a way to explore three high-level approaches to designing an algorithm:
(Note that top-down and bottom-up are also common ways to approach the writing of programs...)
Often it is wise to begin by trying to create a top-down solution, which exposes the relevant sub-problems. You may then be able to create a bottom-up approach that combines solutions to those sub-problems in a more efficient way. Occasionally, our experience from these attempts helps us to see some feature of the problem that zooms in on an immediate solution.
Consider the Difference Game, a simple two-player game played with sets of positive numbers. Ordinarily, the game starts with a set of size two. The players take turns adding a positive number to the set. The only requirements are that:
- the new number is the difference of two numbers already in the set, and
- the new number is not already in the set.
The first player who cannot move loses the game.
For example, consider this starting position:
Player 1's opening move is 2. But then Player 2 has no moves to make, so Player 1 wins.
Then consider this starting position:
Player 1's opening move is 12. Player 2 moves 8. Player 1 has no move to make, so Player 2 wins.
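To make the rules concrete, here is a small Python sketch of the legal-moves computation. The starting set {4, 6} is a hypothetical example of mine, not one of the positions above, and the rule assumed is that each move adds the difference of two numbers already in the set, provided it is not already present:

```python
def legal_moves(position):
    """Return the set of numbers that may be added to the position.

    Assumed rule: a move adds the positive difference of two numbers
    already in the set, provided that difference is not yet present.
    """
    return {abs(a - b) for a in position for b in position
            if a != b and abs(a - b) not in position}

# A game from {4, 6}: 6 - 4 = 2 is the only move, and then none remain.
print(legal_moves({4, 6}))      # {2}
print(legal_moves({4, 6, 2}))   # set()
```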
Your opening exercise: Play this game three times against a classmate. Take turns going first.
As you play, try to find some pattern in how the game proceeds, maybe even a strategy for winning. Don't share your ideas with your opponent just yet!
... much fun ensues ...
Your next exercise: Play this game three more times against another classmate. Take turns choosing who goes first.
If you have a way of determining whether you want to go first or second, great -- but don't share it with your opponent just yet!
... more fun ensues ...
Now, tell us what you noticed...
Digression: pragmatics of the game. If you can't find a move, you lose -- even if a move exists. (Eventually, none does.) For complexity and variety, try larger numbers, or an initial set with more than two numbers. "Knowing the truth" doesn't make the game any less fun, or any less of a challenge.
Did you notice that this game depends on the factors that the two numbers share, and that relatively prime numbers lead to a particular sort of game?
It turns out that this game is a variation on the theme of Euclid's algorithm for finding the greatest common divisor (GCD) of two numbers. It is straightforward to show that the set of numbers which can be added to the Difference Game's starting pair is identical to the set of numbers generated by Euclid's original GCD algorithm, which uses repeated subtraction in place of division. Only the order is changed.
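A sketch of that subtraction-based version in Python, collecting every number it generates along the way (the function name is mine):

```python
def subtractive_gcd_numbers(m, n):
    """Numbers generated by Euclid's original subtraction-based GCD.

    Repeatedly replace the larger number with the difference of the
    pair; collect every value that appears along the way.
    """
    seen = {m, n}
    while m != n:
        m, n = max(m, n) - min(m, n), min(m, n)
        seen.add(m)
    return seen

# From the pair (4, 16) it generates {4, 8, 12, 16}: the additions
# 12 and 8 are exactly the moves available in the Difference Game.
print(subtractive_gcd_numbers(4, 16))
```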
The Big Question: If given the choice, would you choose to move first or second in this game? Why?
Answer: It depends. The number of numbers generated by Euclid's algorithm is equal to m/gcd(m, n), where m is the larger of the pair, including the original pair. So the number of moves available for any pair is equal to m/gcd(m, n) - 2.
If m/gcd(m, n) is odd, then you want to go first. If it is even, you want to go second.
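As a Python sketch (the function name is mine), with max taking care of treating m as the larger of the pair:

```python
from math import gcd

def should_go_first(m, n):
    """Decide whether to move first, given starting pair (m, n).

    The final set holds max(m, n) // gcd(m, n) numbers, so the game
    lasts max(m, n) // gcd(m, n) - 2 moves; go first iff that is odd.
    """
    moves = max(m, n) // gcd(m, n) - 2
    return moves % 2 == 1

print(should_go_first(4, 6))    # True: one move, so go first
print(should_go_first(4, 16))   # False: two moves, so go second
```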
When the starting positions contain larger numbers and a large number of moves, this can be non-trivial. What sort of algorithm can you use to find moves? Is your algorithm top-down, bottom-up, or zoom-in? Why? How expensive is it -- O(n²)?
The idea of "bottom up" doesn't seem to apply to the problem of finding moves. (Maybe a linear search from 1 up?) The key here is in recognizing the invariant that lets us zoom in on a "choosing to go first" algorithm...
Here are three algorithms for computing gcd(m, n), the greatest common divisor of two positive integers m and n. We will give an example of each operating on the case of m=70 and n=32.
Euclid's Modified Algorithm
while n != 0
    r := m mod n
    m := n
    n := r
return m
Question: What happens if m < n?
Answer: It swaps them on the first pass!
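For reference, a runnable Python version of the algorithm above, tried on the m=70, n=32 case:

```python
def euclid_gcd(m, n):
    """Euclid's algorithm, using mod in place of repeated subtraction."""
    while n != 0:
        m, n = n, m % n   # r := m mod n; m := n; n := r
    return m

print(euclid_gcd(70, 32))   # 2
print(euclid_gcd(32, 70))   # also 2: the swap happens on the first pass
```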
An Informed Brute-Force Search

t := min(m, n)
loop
    if m mod t = 0 and n mod t = 0
        return t
    t := t - 1
We might call this an informed brute-force algorithm. It is "brute-force" because it simply tries all possible answers in order. It is "informed" because it is smart enough not to start at 2.
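The same search in runnable Python (the function name is mine):

```python
def brute_force_gcd(m, n):
    """Count down from min(m, n), returning the first common divisor.

    Always terminates for positive integers, since t eventually
    reaches 1, which divides everything.
    """
    t = min(m, n)
    while True:
        if m % t == 0 and n % t == 0:
            return t
        t -= 1

print(brute_force_gcd(70, 32))   # 2
```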
The "Middle-School Procedure"
fm := sorted list of prime factors of m
fn := sorted list of prime factors of n
c := common-factors(fm, fn)
return product of c
This algorithm is often derided as a way of finding the GCD. Why? But remember the context in which we learn it... The goal of our middle school math class isn't computational efficiency, but understanding what GCD is and what it means!
Look how different these algorithms are. There are always many ways to solve a problem, express an idea, and even implement the same algorithm in code. We encounter options and face trade-offs.
Question: What is the big-Oh run-time efficiency for each of these algorithms in terms of the number of candidates considered?
For Euclid's algorithm:
- Best case: m mod n = 0, so it considers 1 candidate.
- Usual case: within every two steps the remainder drops below half its divisor, so it considers O(log n) candidates.
- Worst case: consecutive Fibonacci numbers shrink the slowest, but even they yield only O(log n) candidates.
For linear search:
- Same best case.
- Otherwise, O(n).
For the middle-school procedure... ?
Question: Is this a reasonable way to compare these algorithms? Why not?
The middle-school procedure doesn't search through candidates. It computes an answer in a very different way. But it turns out to be quite expensive computationally, due to the nature of its steps.
Its expense is in finding prime factors. But remember: middle-school students work with small numbers and thus have the set of prime numbers they need in hand.
Exercise: Write an algorithm to find the common elements in two sorted lists. You may assume the existence of standard list operations, such as add, remove, contains, etc.
Solution: Here is one. It takes as arguments two ordered lists, lst1 and lst2.
lst1 = (2, 5, 7)         lst2 = (2, 2, 3, 5, 11)
lst1 = factors(2100)     lst2 = (11)
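One possible sketch in Python, using the classic two-index walk over both sorted lists (this is my sketch, not necessarily the session's solution):

```python
def common_factors(lst1, lst2):
    """Common elements of two sorted lists, with multiplicity.

    Walk both lists once: advance past the smaller head, or record a
    match and advance both indices when the heads are equal.
    """
    i, j, common = 0, 0, []
    while i < len(lst1) and j < len(lst2):
        if lst1[i] < lst2[j]:
            i += 1
        elif lst1[i] > lst2[j]:
            j += 1
        else:
            common.append(lst1[i])
            i += 1
            j += 1
    return common

print(common_factors([2, 5, 7], [2, 2, 3, 5, 11]))   # [2, 5]
```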
What is this algorithm's complexity? It is O(n) in the size of the longer list. But what are the best-, average-, and worst-case scenarios for the sizes of the lists, given particular m and n?
Putting it all together, if we have a list of prime numbers in hand, the complexity of the "middle-school procedure" is:
Question: What is a fairer way to compare the complexity of these algorithms? Count basic operations. Use m and n as ways to normalize the results.