Design and Analysis of Algorithms

These are pancakes, a gaping hole in the "eat a healthy breakfast" story that Mom and Dad like to tell us. I'm not a good enough cook to make a perfect stack of pancakes all the same size, like the top photo. Mine are more likely to look like the bottom stack. Still, they are yummy.

Unfortunately, I'm also a little OCD. I like the order and symmetry of having my pancakes in order by size, largest on bottom. That way, the syrup and extra butter will run down and sweeten the whole stack! But my mom won't let me rearrange the pancakes by hand; neither will my wife.

They, however, do let me use a spatula, aka "the flipper". I am able to slip the spatula under any pancake and flip over the entire stack on top of the spatula. That took a lot of practice! I also have to rearrange the pancakes "in place" -- no extra plate allowed. So I do. (That took even more practice!)

But: how can I do this quickly, before my mom or wife tells me to stop playing with my food? This is your task: help me put my pancakes in order.

You have a stack of *n* pancakes of all different sizes, in no particular order. You are able to slip a spatula under any pancake and flip over the whole stack that is on the spatula. You would like to arrange the pancakes in order by size, largest on bottom.

Let's proceed in three steps:

- Just think for a minute about how you would do this. (It would be great if I could give each of you a stack of pancakes to experiment with!)
- Describe to your neighbor how you would do it. See if there is anything in her approach that would improve your own.
- Write down your algorithm in pseudocode.

How many times must you flip the pancakes?

How would you do it?

This is a great exercise in **thinking backwards** from
a subgoal.

I want the largest pancake on bottom. What action must I take? Flip it to the bottom. Where must it be in order to do that? On top. So: Find the largest pancake. Flip it to the top. Flip it to the bottom.

The same process works for the second largest pancake, and the third. After getting a pancake into place, we never mess with that part of the stack. So:

    INPUT: a stack of pancakes

    1. while there is a pancake out of order
           choose the largest such pancake
           flip that pancake to the top
           flip that pancake to its correct position
    2. eat pancakes

This is a **decrease-and-conquer** algorithm. In the
worst case, we *decrease by one* on each step. But
really it's a *variable decrease* algorithm. We
can decrease the number of out-of-place pancakes by more
than one on any given step.

Sorting pancakes is part of an interesting piece of CS trivia.
Bill Gates reputedly wrote only one academic computer science
paper,
*Bounds for Sorting by Prefix Reversal*,
which proves bounds on how many flips are needed to sort
a stack. The simple algorithm above needs at most 2*n*-3
flips in the worst case; the paper shows that roughly
5*n*/3 flips always suffice. Let me know if you'd like
to learn more...

Now, let's review the exam. I'd like to highlight a few key ideas, and perhaps help you understand a couple of techniques better.

Design a **divide-and-conquer** algorithm to compute k^{n} for k > 0 and integer n >= 0.

This is a simpler problem than the divide-and-conquer multiplication algorithms we studied in class. How can we divide this problem into two parts? The most natural way is:

k^{n} = k^{n/2} * k^{n/2}

**Quick Exercise**: Can we divide-and-conquer on the
*k*? If so, how well does it work?

What is our base case? We are dividing *n*, so
eventually we reach 1. That leaves a single *k*.
So:

    power(k, n)
        if n == 1
            return k
        else
            return power(k, n/2) * power(k, n/2)

That's the basic outline of a solution. The problem says
*n* can be 0, so we need to treat it as a special case.
The bigger issue is that *n* could be odd, in which
case this algorithm loses a factor of *k*. So:

    power(k, n)
        if n == 0                 # better factored out, so
            return 1              # that we don't repeat it
        else if n == 1
            return k
        else if n is even
            return power(k, n/2) * power(k, n/2)
        else
            return k * power(k, n/2) * power(k, n/2)

That's a better solution. If we have extra time, how can we improve on this? We are multiplying the sub-results in both arms, and that could be factored out. More importantly, we are making the recursive call twice, and we can get by with one! So:

    power_rec(k, n)
        if n == 1
            return k
        else
            root = power_rec(k, n/2)
            if n is even
                return root * root
            else
                return k * root * root

Here is a Python implementation of the optimal version. If you'd like an implementation of the less optimal version, make it so.
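
A minimal sketch of what that optimal version looks like in Python (this rendering, including the name `power_rec` and the integer-division details, is mine rather than the session's exact file):

    # Optimized power: one recursive call per level, so only
    # O(log n) calls in total. A sketch, not the session's exact code.
    def power_rec(k, n):
        if n == 0:
            return 1
        if n == 1:
            return k
        root = power_rec(k, n // 2)      # the single recursive call
        if n % 2 == 0:
            return root * root
        else:
            return k * root * root

    print(power_rec(2, 10))              # 1024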

This is straightforward if you know the math, or have reviewed it recently. Computer scientists and programmers need to be able to work with this level of math comfortably.

Answer these questions about your divide-and-conquer algorithm for Problem 1. Assume that *n* = 2^{m} for some integer *m*.

- Give a recurrence relation for counting the number of multiplications done by your algorithm.

- Solve your recurrence relation.

- How does this number compare to the number of multiplications done by a simple decrease-and-conquer algorithm for this problem?

Recurrence relations are still causing many of your problems.
The first thing you need to know is how to **recognize**
one. The second is how to **construct** one. The third
is how to **solve** one. You really have to learn the
idea on your own. Let's see if I can help you construct
one more reliably, with a little design "recipe". We've
seen the recipe for solving relations a few times. Learning
that takes practice.

Let's analyze the non-optimized version of our algorithm above, because that's the one most of you wrote.

First, we **set up** the recurrence relation for the number
of multiplications. Follow these steps:

- What size of input is the base case?
- How many multiplications are done in the base case?
- What is the recursive case?
- How many multiplications are done in the recursive case?

Notice that the problem lets us assume that *n* =
2^{m} for some integer *m*. We
don't have to worry about zero or odd exponents other than 1.

This gives us:

    M(1) = 0
    M(n) = 1 + 2M(n/2)

That's what a recurrence relation looks like. When you are asked to create one, this is what you create.

Next, we **solve** the recurrence relation, to eliminate
the recursion and give a *closed form* solution for
the number of multiplications in terms of *n*. We
have seen
this recipe
before, several times. Follow those same steps.

**Starting with *n***, we substitute previous values recursively until we find a pattern:

- What is M(*n*/2)?
- Substitute that value into the equation for M(*n*).
- Simplify.
- What is M(*n*/4)? Substitute and simplify.
- etc.

In this way, we derive...

    M(1) = 0
    M(n) = 1 + 2M(n/2)
         = 1 + 2(1 + 2M(n/4))  =  3 +  4M(n/4)
         = 3 + 4(1 + 2M(n/8))  =  7 +  8M(n/8)
         = 7 + 8(1 + 2M(n/16)) = 15 + 16M(n/16)

Notice that we do *not* work forward from the base
case. Doing so can be quite helpful while experimenting,
but we solve the relations top-down. It is often possible in
practice to work the other direction, but I have never seen
a student do it successfully on an exam.

Do you see the pattern yet? For an integer *i*, on
line *i* of the definition of M(n), we see
(2^{i}-1) +
2^{i}M(n/2^{i}). This requires
a bigger step than the problems we've seen in class, but it's
still basic exponents.

This pattern allows us to generalize the definition of our recurrence relation. We are left with...

    M(1) = 0
    M(n) = (2^{i} - 1) + 2^{i}M(n/2^{i})

Once we have a pattern in terms of *i*, we need to
substitute a value for *i* that turns the recursive case
into the base case. You can proceed with very small steps:

- What value of *n* gives the base case?
- What value does the argument of M take in the generalized recursive case?
- Set those two values equal to one another, and solve for *i*.
- Substitute the result back into the recursive equation for M(*n*).
- Simplify.

In essence, this process answers the question, "How many times would I have to keep substituting into the recurrence before I reach the base case?"

Here, the base case is *n* = 1, and the general term is
*n*/2^{i}. So:

    n/2^{i} = 1
    2^{i} = n
    log 2^{i} = log n
    i * log 2 = log n
    i = log n

In simpler cases, solving for *i* gives us *n* or
*n*-1. But here we need a value that, when 2 is raised
to that power, gives us *n*. That is precisely what
"log *n*" means!

So we plug that value in for *i* and simplify the
recursive case:

    M(1) = 0
    M(n) = (2^{i} - 1) + 2^{i}M(n/2^{i})
         ... [substitute i = log n] ...
         = (n-1) + nM(n/n)
         = (n-1) + nM(1)
         = n - 1

And we have solved the recurrence relation. Now we can think about the algorithm.

This algorithm performs *n*-1 multiplications and is
O(*n*) -- exactly like the straightforward loop.

How can this be the case? We only go through log *n*
steps in this algorithm, not *n* steps. But recall that
our algorithm makes **two** recursive calls on each pass.
The result is a full binary tree of multiplications: *n*
leaves and... *n*-1 internal nodes. That's where the
multiplication is done.
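
As a quick sanity check (my code, not from the session materials), we can evaluate the recurrence directly and confirm the closed form:

    # Evaluate M(n) = 1 + 2*M(n/2), M(1) = 0, and check that it
    # matches the closed form n - 1 for powers of two.
    def M(n):
        if n == 1:
            return 0
        return 1 + 2 * M(n // 2)

    for m in range(6):
        n = 2 ** m
        assert M(n) == n - 1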

Can we do better? Yes, we already did -- with our optimized algorithm above.

**Quick Exercise**: Write and solve the recurrence
relation for our final algorithm above.

Do it now. Really.

No. Really.

This problem demonstrates the relationship between trees and sequences, and the fine line between a really good idea and an unfortunate implementation. It also shows why you want to be able to solve recurrence relations, so that you can do a sanity-check on your algorithms before investing too much time in code.

Here is a Python implementation of the optimal algorithm, tooled to count the multiplication operation. This can help you confirm just how few times the algorithm has to multiply.
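
A sketch of what that tooled version might look like (the counter and its placement are my guess at the session's code):

    # Optimized power with a counter for multiplications. For n = 2^m,
    # only log2(n) multiplications are performed.
    mults = 0

    def power_rec(k, n):
        global mults
        if n == 1:
            return k
        root = power_rec(k, n // 2)
        if n % 2 == 0:
            mults += 1                   # root * root
            return root * root
        else:
            mults += 2                   # k * (root * root)
            return k * root * root

    print(power_rec(2, 16), mults)       # 65536 4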

Trace this algorithm for the input [89 45 68 90 29] and write down the following information just *before* executing Line 7 on each pass:

- the state of the array
- the number of value comparisons done on the just-completed pass through the while-loop

I can't overstate the power of tracing code. It is an
essential skill for *debugging*. It can also help
you *understand* a program, by seeing the patterns
that occur in the data. Practice tracing code as much as
you can while studying both algorithms and programs.
Practice, practice, practice.

Note two key parts of the question:

- We want the state of the array *before* executing Line 7.
- We want to count only the *value comparisons* (`arr[j] > v`), not the *index comparisons* (`j >= 0`).

*Show code. Insert prints to give us the state of the
array*.

Here is
a Python implementation
of the code, tooled with a **print** statement
that shows us the state of the array at the desired spot.
Can you add code to count the value comparisons for us?
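
Since the file itself isn't reproduced here, here is a sketch of what that tooled code might look like, assuming the exam problem used insertion sort (the `arr[j] > v` and `j >= 0` tests above suggest as much):

    # Insertion sort, tooled with a print that shows the state of the
    # array after each pass through the while-loop. Adding a counter
    # for the value comparisons is left as the exercise above.
    def insertion_sort(arr):
        for i in range(1, len(arr)):
            v = arr[i]
            j = i - 1
            while j >= 0 and arr[j] > v:   # index test, then value comparison
                arr[j + 1] = arr[j]
                j -= 1
            arr[j + 1] = v
            print(arr)

    insertion_sort([89, 45, 68, 90, 29])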

Consider the *partition problem*: Given *n* positive integers, partition them into two disjoint subsets such that, when we sum the elements in each subset, the two sums are equal.

- What makes this problem computationally hard to solve by brute force?
- Describe one of the combinatorial algorithms we studied in class that can help us solve this problem as efficiently as possible using brute force.

This problem is hard because there are 2^{n}
subsets of a set of *n* items. For each of them, we
need to sum its members and sum the members of its complement.
With brute force, this will take O(2^{n})
time.

At least we can avoid also using O(2^{n})
space. In
Session 15,
we saw techniques for working with permutations represented
as O(*n*) arrays of arrays.
Session 16
talked about ways to adapt these ideas for working with
subsets.

*Read the assignment*. Good answers to this question
indicate at least familiarity with those ideas, if not a
mastery of them.

We do have a couple of ways to speed things up, if only marginally.

- We only need to enumerate subsets that contain *n*/2 items or fewer. All sets larger than that are complements of the smaller sets.
- We can sum the items in the whole set at the outset. If the sum is odd, we can signal failure, as no two subsets can cover all *n* items and have the same sums.
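
Putting those two speedups together, a brute-force solution might look something like this sketch (an illustration of mine, not the session's code):

    # Brute-force partition: enumerate subsets of size <= n/2 and
    # look for one that sums to half the total.
    from itertools import combinations

    def partition(items):
        total = sum(items)
        if total % 2 != 0:                 # odd total: no equal split exists
            return None
        n = len(items)
        for size in range(1, n // 2 + 1):
            for subset in combinations(range(n), size):
                if sum(items[i] for i in subset) == total // 2:
                    return [items[i] for i in subset]
        return None

    print(partition([3, 1, 1, 2, 2, 1]))   # [3, 2], which sums to 5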

Suppose we are given an array A[0..n-1] of integers, in no particular order. Suppose further that we have an operator reverse(j) that can reverse the integers in slots 0 through j in a single step. Design a **decrease-and-conquer** algorithm to sort the array.

Yes, this is Pancakes.

Again, this is a great exercise in thinking backwards from a
subgoal. The subgoal becomes the **invariant** that our
algorithm must satisfy, offered as a hint in the problem:

After *k* steps, the largest *k* numbers are in their correct positions.

On each pass through the loop, we need to find the largest number in the unsorted portion of the array, reverse it to the front of the array, and then reverse it to its correct position.

Here is
a Python implementation
of a solution, which is an array-based version of
our solution to the pancake problem
above. It includes a **print** statement to
show that the invariant holds.
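
A sketch of that array-based version (my rendering of the idea, with the invariant-showing print included):

    # Pancake sort on an array: flip the largest unsorted value to the
    # front, then flip it into place. A sketch, not the session's file.
    def reverse(arr, j):
        """Reverse the integers in slots 0 through j, in place."""
        arr[:j + 1] = arr[j::-1]

    def pancake_sort(arr):
        for end in range(len(arr) - 1, 0, -1):
            big = arr.index(max(arr[:end + 1]))   # largest unsorted value
            if big != end:
                reverse(arr, big)                 # flip it to the front
                reverse(arr, end)                 # flip it into position
            print(arr)   # invariant: slots end..n-1 hold the largest values

    pancake_sort([89, 45, 68, 90, 29])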

Could such a sort ever be useful? Yes, indeed. It's perhaps not as useful these days when working with arrays, but think doubly-linked list...

- Reading -- Review your exam. All the code referred to above is available in the session's zip file.
- Homework -- Homework 5 is available and due next week. Ask questions early!