Session 16

Decrease and Conquer for Subsets

CS 3530
Design and Analysis of Algorithms

Refresher Exercise: The Johnson-Trotter Algorithm

Recall the Johnson-Trotter algorithm, which generates permutations in minimal-change order without solving the subproblems explicitly. It uses two new ideas. First, each item in the set is given a direction, initialized to 'left'. Second, an item is "mobile" if it points to an adjacent smaller number.

    ALGORITHM: johnson-trotter(n)
    INPUT    : integer n

    initialize A = [1 2 3 ... n]
    initialize D = [← ← ← ... ←]

    while there exists a mobile element
        k ← the largest mobile integer in A
        swap k and the element it points to
        reverse the direction of all elements in A larger than k
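
If you want to check your traces, here is a minimal Python sketch of the pseudocode above. The name johnson_trotter, the -1/+1 encoding of the arrows, and the choice to yield each permutation as a tuple are assumptions of this sketch, not part of the algorithm as stated.

```python
def johnson_trotter(n):
    """Yield the permutations of 1..n in Johnson-Trotter order."""
    perm = list(range(1, n + 1))
    direction = [-1] * n  # assumption: -1 means 'left', +1 means 'right'

    def largest_mobile_index():
        # An element is mobile if it points to an adjacent smaller element.
        best = -1
        for i, value in enumerate(perm):
            j = i + direction[i]
            if 0 <= j < n and perm[j] < value:
                if best == -1 or value > perm[best]:
                    best = i
        return best

    yield tuple(perm)
    while True:
        i = largest_mobile_index()
        if i == -1:
            return  # no mobile element remains
        k = perm[i]
        j = i + direction[i]
        # Swap k with the element it points to; arrows travel with elements.
        perm[i], perm[j] = perm[j], perm[i]
        direction[i], direction[j] = direction[j], direction[i]
        # Reverse the direction of every element larger than k.
        for idx, value in enumerate(perm):
            if value > k:
                direction[idx] = -direction[idx]
        yield tuple(perm)


for p in johnson_trotter(3):
    print(p)
```

For n = 3 this prints 123, 132, 312, 321, 231, 213 (as tuples), each one adjacent swap away from the previous.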

Last time, we traced the algorithm for n = 2 and n = 3, and an embedded exercise asked you to trace it for n = 4.

Do it again. Trace the algorithm for n = 3 and n = 4. It will make you stronger.

Generating Subsets

As with permutations, there is a simple top-down decomposition for computing the subsets of a set of elements ...

[Figure: power set of {1..n}, with and without n]

... which gives a straightforward algorithm:

    ALGORITHM: subsets(n)
    INPUT    : integer n

    if n = 0
        return { {} }
    partial ← subsets(n-1)
    return partial ∪ { SUB ∪ {n} for all SUB in partial }
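
Here is a minimal Python sketch of this top-down decomposition. It assumes a base case in which the only subset of the empty set is the empty set, and it uses a list of frozensets to hold the result; both choices are assumptions of the sketch.

```python
def subsets(n):
    """Return all subsets of {1..n} as a list of frozensets."""
    if n == 0:
        return [frozenset()]  # assumed base case: only the empty set
    partial = subsets(n - 1)
    # Every subset either omits n (partial) or includes n (partial + n).
    return partial + [sub | {n} for sub in partial]


print([sorted(s) for s in subsets(3)])
```

Note that the whole result is held in memory at once, which is the space cost discussed next.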

Quick Exercise: What is this algorithm's efficiency?

Because subsets(n) has 2^n members, we really can't avoid time efficiency of Θ(2^n). But this algorithm also uses O(2^n) space, as it computes all sets explicitly and builds the final solution in memory.

As with permutations, we can do better in regard to space. Let's use a variation of decrease-by-1: Generate one member of the result, then generate the remaining members implicitly, that is, without storing all the sub-results anywhere. This draws on the same idea of "morphing" elements one at a time that we saw in the Johnson-Trotter algorithm last time. The key to this approach is to systematically cover the whole set of answers.

The trick is to recognize that any bit string of size n corresponds to one member in subsets(n). This transforms the subset generation problem into a counting problem! We seed the algorithm with the "trivial" 0^n, the string of n zeros, which corresponds to the empty set. From there, we count:

    value ← 0^n
    repeat
        print value
        value ← value + 1        ( modular arithmetic,
                                   bounded # of bits )
    until value = 0^n

Consider the simple examples of n=2 and n=3:

    00    {}
    01    {b}
    10    {a}
    11    {a, b}

    000   {}
    001   {c}
    010   {b}
    011   {b, c}
    100   {a}
    101   {a, c}
    110   {a, b}
    111   {a, b, c}
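
The counting idea can be sketched in Python as a generator, so that only the current value is ever stored. The name subsets_by_counting and the convention that the leftmost bit stands for the first item (as in the tables above) are assumptions of this sketch.

```python
def subsets_by_counting(items):
    """Yield (bit string, subset) pairs by counting from 0^n up to 1^n."""
    n = len(items)
    value = 0
    while True:
        bits = format(value, f"0{n}b") if n else ""
        # Leftmost bit corresponds to the first item, as in the tables above.
        yield bits, [item for bit, item in zip(bits, items) if bit == "1"]
        value = (value + 1) % (2 ** n)  # modular arithmetic on n bits
        if value == 0:
            return  # wrapped around to 0^n: every subset has been seen


for bits, subset in subsets_by_counting("abc"):
    print(bits, subset)
```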

Notice the unnatural order of the generated series. Similar to the idea of minimal change for permutations, we usually want to see the sets that involve j only after all sets drawn from 1..j-1 have been generated. This is sometimes called squashed order.

How can we generate the subsets in squashed order? This is easier than it sounds. We don't have to interpret the string 010 as "no 1, yes 2, no 3"... We can read it in reverse: no 3, yes 2, no 1. Done!

Unfortunately, this is not a minimal-change algorithm. For example, the string 011 changes to 100. This changes the subset from {a, b} to {c} -- which requires removing two items and adding the third.
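
A short Python sketch of the reversed reading, assuming that the rightmost (lowest-order) bit stands for the first item. The name squashed_subsets is made up for this example.

```python
def squashed_subsets(items):
    """Yield subsets in squashed order by reading each count in reverse."""
    n = len(items)
    for value in range(2 ** n):
        # Bit i, counting from the right, says whether items[i] is included.
        yield [items[i] for i in range(n) if (value >> i) & 1]


for subset in squashed_subsets("abc"):
    print(subset)
```

This prints [], [a], [b], [a, b], [c], [a, c], [b, c], [a, b, c]. The step from [a, b] to [c] is the 011-to-100 transition that breaks minimal change.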

How can we generate the subsets in minimal-change order?

We can borrow the idea of a toggle value from the minimal-change algorithm for generating permutations: Prepend 0 to every item in the solution for n-1, going left to right. Then prepend 1 to every item in the solution for n-1, going right to left.

Here is an example. I list each solution top to bottom, so "left to right" means going down the list and "right to left" means going up:

      0
      1

     00   -- adds 0 to front of 0, 1, moving left to right
     01
     11   -- adds 1 to front of 0, 1, moving right to left
     10

    000   -- adds 0 to front of previous four, moving left to right
    001
    011
    010
    110   -- adds 1 to front of previous four, moving right to left
    111
    101
    100
Alas, this requires that we store the results of subsets(n-1) in order to compute subsets(n).
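
The construction can be sketched in Python as follows. The name gray_codes is an assumption of this sketch, as is the base case of a single empty string for n = 0. Like the description above, it stores the whole solution for n-1 in order to build the solution for n.

```python
def gray_codes(n):
    """Return the n-bit strings in minimal-change order."""
    if n == 0:
        return [""]  # assumed base case: one empty bit string
    previous = gray_codes(n - 1)
    # 0 in front of the previous solution, left to right, then
    # 1 in front of the previous solution, right to left.
    return (["0" + code for code in previous] +
            ["1" + code for code in reversed(previous)])


print(gray_codes(3))
```

Each string differs from its predecessor in exactly one bit, so each subset differs from the previous one by adding or removing a single item.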

Challenging Exercise: How can we generate the subsets in minimal change order without using so much space?

Perhaps we can borrow the idea of a set of toggle-like arrows from the Johnson-Trotter algorithm...

One Last Go at the Election Puzzle

In Session 13, we saw that we can divide a problem in different ways, leading to different divide-and-conquer algorithms. For example, mergesort divides its problems by the positions of the elements, while quicksort divides its problems by the values of the elements. The result is algorithms with quite different characteristics.

In our original attempt at the Elections puzzle, we divided candidates according to their position in the list of candidates, creating ranges of candidate solutions. This divide-and-conquer approach let us make one pass through the inputs using sqrt(n) range counters, and then perhaps a second pass through the inputs using sqrt(n) candidate counters.

Then, in Session 13, we borrowed an idea from quicksort, looking for a way to divide candidates by their values. The result was an algorithm that still makes at most two passes, but requires only log2(longest candidate number) counters on the first pass and only 1 counter on the second! This is a huge improvement in space.

But, wait... In Session 14, we began our study of decrease-and-conquer algorithms.

Is there a decrease-and-conquer strategy for this puzzle?!?


Can you find it?

Hint: This time, focus on the list of votes, not the list of candidates.

Solving the Elections Puzzle Using Decrease-and-Conquer

In order to use a decrease-and-conquer strategy on this problem, we need to find an invariant that preserves the relevant features of our input sequence as we remove elements. That is, if an assertion is true about V(n), then it must also be true of V(n-1) or, more generally, V(n-k).

What feature(s) do we care about?

If candidate m has a majority in V(n), then she must also have a majority in V(n-k).

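Let's consider how we might decrease and conquer. If we remove a single vote from V, what could happen? If the vote is for some other candidate, m keeps her majority. But if the vote is for m, she may lose it: removing a 5 from [5 5 4] leaves [5 4], in which no one has a majority. So removing a single vote does not preserve our invariant.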

What if we remove two identical votes from the list of votes? The same possibilities occur. We could change the status of the majority candidate in V(n-2). So such a move violates our invariant.

But... What if we remove two votes for different candidates from the list of votes? In this case, we find a useful invariant:

If two elements in a sequence of n elements differ and one of them is a majority element, then that element is also a majority element in the n-2 element sequence created by removing the two elements.

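Why? Suppose the majority element has k of the n votes, so k > n/2. After we remove one of its votes and one other vote, it has k-1 of the remaining n-2 votes, and

    k   (k-1)
    - < -----     whenever k > n/2 and n > 2
    n   (n-2)

Cross-multiplying gives k(n-2) < n(k-1), which simplifies to n < 2k. So the majority element's share of the votes only grows.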

Dave, this approach is similar in spirit to the one you suggested as a possibility in class....

Consider these examples. Scan the list from left to right. When you see two different votes adjacent to one another, remove the pair.

    (a) [3 5 6 5 5 4]
        [    6 5 5 4]
        [        5 4]
        [           ]   -- there is no possible majority candidate

    (b) [5 5 6 5 5 4]
        [5     5 5 4]
        [5     5    ]   -- 5 is a possible majority candidate

    (c) [5 5 5 6 6 4 4]
        [5 5     6 4 4]
        [5         4 4]
        [            4] -- 4 is a possible majority candidate

    (d) [5 5 5 6 5 4 4]
        [5 5     5 4 4]
        [5 5         4]
        [5            ] -- 5 is a possible majority candidate

In cases (b) through (d), we need to make a second pass. That pass will confirm 5 as the majority candidate in (b) and (d), and expose 4 as a false positive in (c).

Notice that processing the list in the other direction can result in a different potential majority candidate. For example, scanning (c) right-to-left gives 5 as a potential majority holder. The second scan will confirm that it did not win a majority either.

We can use this invariant to generate an algorithm to find the majority in a list of votes, if there is one.

We could do just as we did in the examples above:

  1. Repeatedly erase any two different votes until the list is empty or all remaining votes are for the same candidate.
  2. If a vote remains, then it is for a potential majority element. Make a second pass to confirm or disconfirm this candidate.
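
The two steps above can be sketched directly in Python. This is a deliberately naive version, assuming the votes fit in a list; the name majority_by_erasing and the use of None to mean "no majority" are choices made for this sketch.

```python
def majority_by_erasing(votes):
    """Find a majority element by erasing adjacent pairs of different votes."""
    votes = list(votes)
    remaining = list(votes)  # working copy that we erase from
    i = 0
    while i + 1 < len(remaining):
        if remaining[i] != remaining[i + 1]:
            del remaining[i:i + 2]  # erase the two different votes
            i = 0                   # naive: rescan from the beginning
        else:
            i += 1
    if not remaining:
        return None  # the list emptied out: no possible majority candidate
    candidate = remaining[0]  # all remaining votes are for this candidate
    # Second pass: confirm or disconfirm the candidate.
    if votes.count(candidate) > len(votes) / 2:
        return candidate
    return None


print(majority_by_erasing([5, 5, 5, 6, 5, 4, 4]))  # example (d)
```

On examples (a) through (d) this returns None, 5, None, and 5. It needs a copy of the list and rescans it after every erasure.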

Unfortunately, a naive implementation of an algorithm based on erasing can be quite inefficient, requiring that we repeatedly scan previously-seen elements looking for an item to erase. This costs both time and space.

However, this is the kernel idea for an efficient solution. Instead of using extra space or multiple scans, we can simulate both using a counter. See:

    counter   ← 0
    candidate ← <undefined>
    until no input remains
        v ← read next element
        if counter = 0
           increment counter
           candidate ← v
        else if candidate = v
           increment counter
        else
           decrement counter

    if counter = 0
        return 0

    make second pass through input,
       counting occurrences of candidate
    if its count > n/2
       then return candidate
       else return 0
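
Here is a Python sketch of the counter-based algorithm. It assumes the votes are available as a list, so that the second pass can reread them, and it returns None rather than 0 when there is no majority; the function name majority is also a choice made for this sketch.

```python
def majority(votes):
    """Return the majority element of votes, or None if there is none."""
    votes = list(votes)  # assumption: the input can be read twice
    counter = 0
    candidate = None
    # First pass: each decrement simulates erasing a pair of different votes.
    for v in votes:
        if counter == 0:
            candidate = v
            counter = 1
        elif v == candidate:
            counter += 1
        else:
            counter -= 1
    if counter == 0:
        return None  # everything was erased: no possible majority
    # Second pass: count the occurrences of the surviving candidate.
    if votes.count(candidate) > len(votes) / 2:
        return candidate
    return None


print(majority([5, 5, 5, 6, 6, 4, 4]))  # example (c)
print(majority([5, 5, 5, 6, 5, 4, 4]))  # example (d)
```

The first call prints None, because the survivor 4 is a false positive, and the second prints 5. The only comparison the algorithm makes between votes is equality.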

Wow. We still have to make just two passes through the votes, but instead of using

... we now use only one counter and one candidate variable on both passes!

Notice, too, that this algorithm does not depend on the nature of the elements or their representations. The previous two algorithms work only with numbers or other comparable and bit-decomposable values.

This is one case where a decrease-and-conquer algorithm can outperform a divide-and-conquer algorithm!

Having options as you think about a problem makes you a more powerful problem solver. Having ways to create options is a good way to become a better problem solver. Creativity is more about opportunity and flexibility than having a big brain or any particular skill.

Wrap Up

Eugene Wallingford ..... ..... March 7, 2014