Session 21

Colored Dots and Onions


CS 3530
Design and Analysis of Algorithms


Puzzle: Dot Connections

We are given a set of 2n dots spaced evenly along a long line. n of the dots are blue, and n are red. The dots of each color are numbered 1..n.

The dots can appear in any order, with only one requirement: for every i in [1..n], the red-i dot appears to the left of the blue-i dot. For any other number j in [1..n], though, the red-i and blue-i dots can appear to the left or right of the red-j and blue-j dots.

For example, this is a legal input:

    r4 r2 b2 r3 b4 r1 b3 b1

Notice that every red dot is to the left of its corresponding blue dot, but otherwise we have a jumble.
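The legality rule is easy to check mechanically. Here is a minimal Python sketch of one way to do it. The function name is_legal and the token format (strings such as "r4" and "b4") are assumptions made for illustration; they are not part of the puzzle statement.

```python
def is_legal(dots):
    """Return True if dots is a legal input: each number 1..n appears
    exactly once in red and once in blue, with red-i left of blue-i."""
    if len(dots) % 2 != 0:
        return False
    n = len(dots) // 2
    seen_red = set()
    seen_blue = set()
    for dot in dots:
        color, number = dot[0], int(dot[1:])
        if not 1 <= number <= n:
            return False
        if color == "r":
            if number in seen_red:          # duplicate red dot
                return False
            seen_red.add(number)
        elif color == "b":
            # blue-i is legal only if red-i has already appeared
            if number not in seen_red or number in seen_blue:
                return False
            seen_blue.add(number)
        else:
            return False
    return len(seen_red) == n and len(seen_blue) == n
```

Calling is_legal("r4 r2 b2 r3 b4 r1 b3 b1".split()) accepts the example above, while an input such as "b1 r1" is rejected because the blue dot comes first.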

Suppose that each red dot is connected to its corresponding blue dot.

[Figure: colored dots connected in a line]

We want to know the total length of the n connections along the straight line. For this configuration, the total is 4 + 1 + 3 + 2 = 10, taking the connections for dots 4, 2, 3, and 1 in left-to-right order of their red dots.

Your first task is this:

Write an algorithm to compute the total length of the connections for a given input.

Some candidate algorithms are O(n²). Can you achieve O(n)? Or better?

Next:

Generate a legal input with n = 5.

Try to make a challenging sequence, one that exploits the hardest part of the problem.

Finally:

Trade sequences with someone else. Apply your algorithm to the sequence you receive.

How well does your algorithm work?



Debriefing the Dots

What are some of the possible ways to attack this problem?

My first candidate uses brute force. I call it Single Array, Compute After:

    INPUT: 2n values of form color:number

    store the input in an array slots of size 2n
    total ← 0
    for i = 1 to n do
        red_slot  ← slot of red-i     (found by searching slots)
        blue_slot ← slot of blue-i    (found by searching slots)
        distance  ← blue_slot - red_slot
        total ← total + distance
    return total

How well does Candidate 1 perform?
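A small Python sketch makes the cost of Candidate 1 visible. The function name and the "r4"/"b4" token format are assumptions for illustration. Each call to list.index is a linear search, so the loop does O(n) work n times: O(n²) time overall, with O(n) space for the stored input.

```python
def total_length_brute_force(dots):
    """Candidate 1: keep the raw input and search it for every dot."""
    slots = list(dots)                     # the array of size 2n
    n = len(slots) // 2
    total = 0
    for i in range(1, n + 1):
        red_slot = slots.index(f"r{i}")    # linear search, O(n)
        blue_slot = slots.index(f"b{i}")   # linear search, O(n)
        total += blue_slot - red_slot      # 0-based slots; the difference is unaffected
    return total
```

On the example input r4 r2 b2 r3 b4 r1 b3 b1, this returns 10.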

A variation of this approach creates two arrays, one for the color of the dot in position i, and one for the number on the dot. This behaves pretty much like the single array approach.

But once we strike upon the idea of multiple arrays, we can use it to create a better approach. Why not store the red dots in one array, and the blue dots in another? Then we can store their positions in the original input.

I call my second candidate Parallel Arrays, Compute After. It preprocesses the input.

    INPUT: 2n values of form color:number

    create two arrays of size n, red and blue
    for each position in the input
        store position in red[number] or blue[number], according to color

    total ← 0
    for i = 1 to n do
        distance ← blue[i] - red[i]
        total ← total + distance
    return total

How well does Candidate 2 perform?
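Here is one way Candidate 2 might look in Python, again assuming tokens such as "r4" and "b4" (the function name is also an assumption). The preprocessing pass and the summing pass are each O(n), so the whole thing runs in O(n) time. It uses two arrays of size n, so space is still O(n).

```python
def total_length_parallel_arrays(dots):
    """Candidate 2: record each dot's position, then sum the differences."""
    n = len(dots) // 2
    red = [0] * (n + 1)     # index 0 unused, so dot numbers index directly
    blue = [0] * (n + 1)
    for position, dot in enumerate(dots, start=1):
        number = int(dot[1:])
        if dot[0] == "r":
            red[number] = position
        else:
            blue[number] = position

    total = 0
    for i in range(1, n + 1):
        total += blue[i] - red[i]
    return total
```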

Better! Can we do mo' better? Well, we know that we always see blue-i after red-i, so do we need to store the blue dots at all? We could compute the distance right then.

I call this candidate Single Array, Compute on the Fly. It stores only the red dots.

    INPUT: 2n values of form color:number

    create one array of size n, red

    total ← 0
    for i = 1 to 2n do
        if the dot is a red-j then
            red[j] ← i
        else it is a blue-j
            distance ← i - red[j]
            total ← total + distance
    return total

What improvements does Candidate 3 offer?

This improves in both space and time, but only in the constants. The complexity classes are the same. How can we do even better?
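A Python sketch of Candidate 3, under the same assumed token format, shows where the savings come from: one array instead of two, and one pass instead of two. Both are still linear in n.

```python
def total_length_on_the_fly(dots):
    """Candidate 3: store only red positions; add each distance when
    the matching blue dot arrives."""
    n = len(dots) // 2
    red = [0] * (n + 1)     # index 0 unused
    total = 0
    for i, dot in enumerate(dots, start=1):
        j = int(dot[1:])
        if dot[0] == "r":
            red[j] = i
        else:
            # red-j is guaranteed to have appeared already
            total += i - red[j]
    return total
```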

It sounds like we need a good invariant...

Notice that ...

    Σ [ position(blue-i) - position(red-i) ]

... is equal to ...

    [ Σ position(blue-i) ] - [ Σ position(red-i) ]

So we don't need to store any dots at all! I call the algorithm that takes advantage of this little trick One Good Invariant. It stores no dots.

    INPUT: 2n values of form color:number

    total ← 0
    for i = 1 to 2n do
        if the dot is a red-j then
            total ← total - i
        else it is a blue-j
            total ← total + i
    return total

Now we see an improvement in the order of the algorithm: space drops from O(n) to O(1), while time remains O(n).

Nice.
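One Good Invariant translates almost line for line into Python. In this sketch (function name and token format are assumptions), the input can be any iterable, even a stream read one dot at a time, because nothing is stored except the running total.

```python
def total_length_invariant(dots):
    """One Good Invariant: subtract red positions, add blue positions."""
    total = 0
    for i, dot in enumerate(dots, start=1):
        if dot[0] == "r":
            total -= i
        else:
            total += i
    return total
```

For the example r4 r2 b2 r3 b4 r1 b3 b1, the blue positions sum to 3 + 5 + 7 + 8 = 23 and the red positions sum to 1 + 2 + 4 + 6 = 13, so the result is 23 - 13 = 10.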

This is about as far as most of us would go with this problem. We have a solution that is O(1) in space and O(n) in time. But a few creative computer scientists have come up with a variant of One Good Invariant that can often be implemented slightly more efficiently. It frames the algorithm in terms of the differences rather than the positions.

We might call this version Explicit Delta.

    INPUT: 2n values of form color:number

    delta ← 0
    total ← 0
    for i = 1 to 2n do
        if the dot is red then
            delta ← delta + 1
        else it is blue
            delta ← delta - 1
        total ← total + delta
    return total

This algorithm uses two counters, one for the delta and one for the running total, independent of n. This is still O(1) in space. It still makes a single pass over 2n items, which is O(n) in time. But the arithmetic may well be simpler, shaving off some of the constant overhead cost of the computation.
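A Python sketch of Explicit Delta, with the same assumed token format, shows how little is left. The loop never looks at a dot's number or its position, only at its color.

```python
def total_length_explicit_delta(dots):
    """Explicit Delta: delta counts red dots still waiting for their blue
    partner; adding it after every dot accumulates the total length."""
    delta = 0
    total = 0
    for dot in dots:
        if dot[0] == "r":
            delta += 1
        else:
            delta -= 1
        total += delta
    return total
```

Since only the color matters, the same function would work on a plain string of colors such as "rrbrbrbb".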

You may well look at this algorithm and say,

That can't possibly work. I don't believe you.

Fair enough. I didn't quite believe it myself. So do what I did.

Quick Exercise: Trace the Explicit Delta algorithm on our example input above:

    r4 r2 b2 r3 b4 r1 b3 b1

Trust me. I won't lead you astray...
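Once you have done the trace, here is one way to see why it comes out right. This is a sketch of an argument, not necessarily the one the algorithm's inventors had in mind. The notation is introduced here for illustration: p_R(i) and p_B(i) are the positions of red-i and blue-i, δ_g is the value of delta after processing position g, and [P] is 1 when P is true and 0 otherwise. A connection from position p_R(i) to position p_B(i) has length equal to the number of positions g with p_R(i) ≤ g < p_B(i), so we can count by position instead of by connection:

```latex
\begin{align*}
\sum_{i=1}^{n} \bigl( p_B(i) - p_R(i) \bigr)
  &= \sum_{i=1}^{n} \sum_{g=1}^{2n} \bigl[\, p_R(i) \le g < p_B(i) \,\bigr] \\
  &= \sum_{g=1}^{2n} \#\{\, i : p_R(i) \le g < p_B(i) \,\} \\
  &= \sum_{g=1}^{2n} \bigl( \#\{\, i : p_R(i) \le g \,\} - \#\{\, i : p_B(i) \le g \,\} \bigr) \\
  &= \sum_{g=1}^{2n} \delta_g
\end{align*}
```

The third step uses the one requirement on the input: blue-i never appears before red-i, so every blue dot seen so far closes a connection that a red dot opened. In other words, delta is the number of connections currently open, and each open connection grows by one unit at every step.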



The Sliding Delta Pattern

The last two algorithms above may seem a bit clever for us to create on our own. But they use a common idea that computer scientists have re-discovered in many domains. It's an idea that, once we know about it, we can use to solve many sequence processing problems efficiently.

Context

Forces

Solution

This is called the Sliding Delta pattern. The counter maintains the delta.

Examples

Mathematical Basis

Mathematical Characteristics

Whenever you encounter a problem that fits the profile of a sliding delta, this pattern can be handy.



Puzzle: Onions

We are given a sequence of n parentheses, left and right, in any order.

For example, (( is a legal sequence. So is ()). As are ((())) and ((())))(()())((()).

An onion is a contiguous run of 2k parentheses, with k ≥ 1, where the first k are left parentheses and the second k are right parentheses, (((...))). Notice that, when k > 1, the onion contains smaller onions.

We are interested in the number of onions in the input.

For the examples above, the numbers of onions are 0, 1, 3, and 7, respectively.
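To make the definition concrete, here is a deliberately naive Python sketch that counts onions straight from the definition. It assumes an onion must be contiguous, which is the reading that matches the four counts given for the examples. The function name is an assumption, and the code is meant as a reference for checking answers, not as an efficient solution.

```python
def count_onions_by_definition(s):
    """Count every contiguous slice made of k '(' followed by k ')', k >= 1."""
    count = 0
    for start in range(len(s)):
        k = 1
        while start + 2 * k <= len(s):
            if s[start:start + 2 * k] == "(" * k + ")" * k:
                count += 1
            k += 1
    return count
```

Because it tries every starting point and every k, this takes far more than linear time, but it is handy for testing faster candidates on small inputs.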

Write an algorithm to compute the number of onions in a given input.

Doing this in O(n) space isn't too difficult. Can you do it in O(1)?



Peeling the Onions

The first idea to come to mind is one you may have encountered when studying data structures. It uses a stack.

    stack ← empty
    total ← 0
    while there is input do
        if input is ( then
            push ( onto stack
        else if stack not empty then
            pop stack
            total ← total + 1
    return total

The worst-case scenario for this algorithm is a large input of size n = 2k, consisting of k left parentheses followed by k right parentheses. In this case, we store k items on the stack, and space usage is O(n).

There's a subtle difficulty lying in this approach. How can we ensure that (()()) counts as only two onions, not three? The outer pair of parentheses is not an onion! But the stack loses context, and we get the wrong answer.
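A direct Python transcription of the stack algorithm shows the over-count. The function name is an assumption; the logic follows the pseudocode above.

```python
def count_with_stack(s):
    """Stack-based candidate: counts matched pairs, which is not the
    same thing as counting onions."""
    stack = []
    total = 0
    for ch in s:
        if ch == "(":
            stack.append(ch)
        elif stack:
            stack.pop()
            total += 1
    return total
```

On (()()) this returns 3. The final right parenthesis pops the outermost left parenthesis, even though another pair has opened and closed in between, so the outer pair is counted as if it were an onion.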

Perhaps we can fix the stack-based algorithm to make it work. But there is a simpler candidate 2, based on the Sliding Delta pattern.

This is O(n) in time, of course, but uses minimal space -- only two counters and one character for the input.
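The pseudocode for candidate 2 does not appear in these notes. What follows is one possible reconstruction in Python, offered as an assumption and not as the author's version. It keeps two counters, total and delta, and reads one character at a time. The sign of delta is used to remember whether we are in a run of left parentheses or a run of right parentheses; that encoding is a choice made here for illustration.

```python
def count_onions(s):
    """Hypothetical sliding-delta onion counter using two counters."""
    total = 0
    delta = 0   # > 0: length of the current run of '('
                # < 0: in a run of ')', with -delta unmatched '(' remaining
    for ch in s:
        if ch == "(":
            # A '(' that follows a ')' starts a fresh run; older '('
            # can no longer be part of an onion.
            delta = delta + 1 if delta > 0 else 1
        elif delta > 0:
            # First ')' after a run of '(': the innermost onion.
            total += 1
            delta = -(delta - 1)
        elif delta < 0:
            # Another ')' with a '(' still available: one more layer.
            total += 1
            delta += 1
        # Otherwise delta == 0 and ch is ')': nothing to match.
    return total
```

Each run of a left parentheses followed immediately by b right parentheses contributes min(a, b) onions, and a new left parenthesis after any right parenthesis resets the count. That reset is exactly the context the stack version loses.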

What, again you don't believe me? That's fine. Trace this algorithm on all four examples above. Does it work?



Wrap Up



Eugene Wallingford ..... wallingf@cs.uni.edu ..... April 3, 2014