### A Review of the Graph as a Data Structure

Remember graphs...

• G = (V, E)
• V is a set of vertices
• E is a set of edges, pairs of vertices
• such networks are used to model all sorts of problems

Most of you studied trees in a data structures course. A tree is a special kind of graph: the edges are directed, there are no cycles, and there are no "tangles". The vocabulary of graphs should sound familiar. The basic search algorithms will, too.

This reading will only scratch the surface of graph algorithms. We will do more in later sessions this semester.

### An Exercise: Design

Develop a divide-and-conquer algorithm to count the number of leaves in a binary tree.

Written in the textbook style, this algorithm seems unnatural to me:

```
    algorithm leafCount(T)
    output    number of leaves in T

    if T = ∅
        return 0
    else if TL = ∅ and TR = ∅
        return 1
    else
        return leafCount( TL ) + leafCount( TR )
```
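For comparison, here is a minimal Python sketch of the same algorithm. It assumes a hypothetical `Node` class and uses `None` to stand for the empty tree ∅; neither choice comes from the textbook.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A binary tree node; None stands for the empty tree (an assumption of this sketch)."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf_count(t: Optional[Node]) -> int:
    """Count the leaves of t by divide-and-conquer on its two subtrees."""
    if t is None:                               # T = ∅
        return 0
    if t.left is None and t.right is None:      # TL = ∅ and TR = ∅: a leaf
        return 1
    return leaf_count(t.left) + leaf_count(t.right)


# A root whose left child is a leaf and whose right child has a single leaf child.
tree = Node(Node(), Node(None, Node()))
print(leaf_count(tree))  # prints 2
```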

What is the basic operation for this algorithm's efficiency? The comparison. How many comparisons does this algorithm make?

```
    C(∅) = 1
    C(T) = 3 + C( TL ) + C( TR )
```

We can solve this kind of recurrence relation as we solve any other. The complication is that the two subtrees can have vastly different sizes and shapes. Let's save solving them for later.

Notice how an OOP implementation of trees changes the nature of the algorithm's run-time efficiency. Using polymorphic objects, empty trees will be objects. This shifts the decision of whether a tree is empty to object construction time... which makes algorithms such as these run faster because we don't need to make any comparisons at all! In that case, what is the basic operation?
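One way to picture the polymorphic version is the sketch below. The three class names (`Empty`, `Leaf`, `Interior`) are hypothetical; the only thing taken from the discussion above is the idea that the kind of tree is fixed when the object is built, so `leaf_count` never has to test for emptiness.

```python
class Empty:
    """The empty tree, represented as an object rather than a null reference."""

    def leaf_count(self) -> int:
        return 0


class Leaf:
    """A node with no children."""

    def leaf_count(self) -> int:
        return 1


class Interior:
    """A node with at least one non-empty subtree."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def leaf_count(self) -> int:
        # No comparison here: each subtree answers for itself.
        return self.left.leaf_count() + self.right.leaf_count()


# The kind of each tree is decided here, at construction time.
tree = Interior(Leaf(), Interior(Empty(), Leaf()))
print(tree.leaf_count())  # prints 2
```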

### Operating on Graphs

Divide/decrease-and-conquer are natural ways to work on graphs, as the edges out of a vertex provide a natural way to split a graph into subgraphs. Processing all vertices -- as the standard pre-, in-, and post-order traversal algorithms do -- can be implemented with divide-and-conquer. Searches tend to be decrease-and-conquer. In particular, breadth- and depth-first search are decrease-by-one algorithms. Search in a binary search tree is decrease-by-half. Search in a k-ary search tree is decrease-by-((k-1)/k): each step discards all but one of the k subtrees.
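To make the contrast concrete, here is a small Python sketch with a hypothetical `Node` class. `in_order` works on both subtrees and combines the results, while `search` throws one subtree away at each step. The "by half" part holds only when the tree is reasonably balanced, as the example tree is.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(t):
    """Divide-and-conquer: process both subtrees and combine the results."""
    if t is None:
        return []
    return in_order(t.left) + [t.key] + in_order(t.right)


def search(t, key):
    """Decrease-and-conquer: discard one subtree and continue in the other."""
    while t is not None and t.key != key:
        t = t.left if key < t.key else t.right
    return t is not None


bst = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(in_order(bst))                    # prints [1, 2, 3, 4, 5, 6, 7]
print(search(bst, 5), search(bst, 8))   # prints True False
```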

Consider this depth-first search algorithm for traversing a graph:

```
    DFS(G)

    1. for each v in V
           mark(v) ← 0
    2. count ← 0
    3. for each v in V
           if mark(v) = 0
               dfs(v)

    dfs(v)
    1. count ← count + 1
    2. mark(v) ← count
    3. for each w ∈ { vertices adjacent to v }
           if mark(w) = 0
               dfs(w)
```

This is a decrease-by-one algorithm. It selects one vertex and then visits the remaining n-1 vertices in the same way.

Quick Exercise: Why do we need two procedures? Why won't dfs(v) suffice?

... (Hint: Not all graphs are connected!)
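Here is a rough Python rendering of the two procedures. It assumes the graph is stored as a dictionary that maps each vertex to a list of its neighbors, which is a representation chosen for this sketch. The sample graph is made up and has two components, so the outer loop has to restart the search once.

```python
def dfs_traverse(graph):
    """Return a dict mapping each vertex to the order in which DFS first reaches it."""
    mark = {v: 0 for v in graph}      # 0 means "not yet visited"
    count = 0

    def dfs(v):
        nonlocal count
        count += 1
        mark[v] = count
        for w in graph[v]:
            if mark[w] == 0:
                dfs(w)

    # The outer loop restarts the search in every component not yet reached.
    for v in graph:
        if mark[v] == 0:
            dfs(v)
    return mark


# A hypothetical graph with two connected components.
graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
    "x": ["y"],
    "y": ["x"],
}
print(dfs_traverse(graph))  # prints {'a': 1, 'b': 2, 'c': 3, 'x': 4, 'y': 5}
```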

### An Exercise: Use

Use the DFS algorithm above to traverse this graph:

```
    f -- b    c -- g
     \ /  \ /    /
      d -- a -- e
```

Start at (a) and break ties in alphabetical order. Draw the DFS tree, and label each node with its order reached (pushed on stack) and order done (popped off stack).

A complex graph can give a simple depth-first traversal. The higher the connectivity of the graph (E/V), the more nodes we can "pick off" on a single DFS descent into the graph.

```
            a 1 7
        /       \
    b 2 3           c 5 6
    |               |
    d 3 2           g 6 5
    |               |
    f 4 1           e 7 4
```
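We can check the labels on this tree mechanically. The sketch below records the order in which each vertex is reached and the order in which it is done. The edge list is my reading of the drawing in the exercise, so treat it as an assumption.

```python
def dfs_orders(graph, start):
    """Return (reached, done): the push order and pop order of each vertex."""
    reached, done = {}, {}

    def dfs(v):
        reached[v] = len(reached) + 1       # pushed on the stack
        for w in sorted(graph[v]):          # break ties alphabetically
            if w not in reached:
                dfs(w)
        done[v] = len(done) + 1             # popped off the stack

    dfs(start)
    return reached, done


# Undirected edges, as read from the drawing (an assumption of this sketch).
edges = ["fb", "fd", "bd", "ba", "da", "ac", "ae", "cg", "ge"]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

reached, done = dfs_orders(graph, "a")
for v in sorted(reached, key=reached.get):
    print(v, reached[v], done[v])
```

Each printed line gives a vertex, its order reached, and its order done, in the same format as the node labels in the tree.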

### Sorting a Directed Graph

In a directed graph, or digraph, the edges are "one-way streets". Each edge is an ordered pair of vertices. We can now speak of the in-degree and out-degree of a vertex, the number of edges pointing to and from a vertex, respectively. The digraph is also an incredibly useful tool for modeling many problem domains.

How can we "sort" a digraph? We can sort the vertices or edges according to their values, but that usually isn't all that useful. What can be useful is to sort the graph in topological order, that is, according to the order its edges impose on its vertices.

The edges of a graph create an ordering on the vertices that works a bit like <:

vi < vj iff there exists a path of vertices from vi to vj

Visually, this means vi < vj if and only if there is a path such as:

```
    vi → a → b ... → vj
```

A topological sorting of a graph lists its vertices in such a way that any time there is an edge (vi, vj) ∈ E, vi appears before vj in our list.

Here is an example graph and its topological sorting:

```
    V = { c1, c2, c3, c4, c5 }
    E = { (c1, c3), (c2, c3),
          (c3, c4), (c3, c5),
          (c4, c5)           }

    sorted list: c1 c2 c3 c4 c5
```

Not all digraphs can be sorted in this way, though. Any cycle among the vertices will create a problem. Consider this graph, which has just one more edge than the example above:

```
    V = { c1, c2, c3, c4, c5 }
    E = { (c1, c3), (c2, c3),
          (c3, c4), (c3, c5),
          (c4, c5), (c5, c2) }

    sorted list: c1 c2 c3 c4 <c5>
```

Fortunately, many computing applications involve directed acyclic graphs (DAGs)...

What might an algorithm that sorts a graph look like? One simple way is to use DFS to record the order in which vertices are popped from the traversal stack, then reverse that list to get the sorted list. To handle graphs with cycles, our algorithm should fail any time it finds an edge leading to a vertex that is still on the traversal stack (called a "back" edge). An edge to a vertex that has already been popped is harmless.

... try it on the graphs above ...
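If you want to check your work, here is one possible Python sketch of the DFS-based sort. The three-state bookkeeping (`UNSEEN`, `ON_STACK`, `DONE`) is a choice made for this sketch; it lets us tell an edge to a vertex still on the stack, which signals a cycle, from an edge to a vertex that has already been popped.

```python
def topological_sort(vertices, edges):
    """Return the vertices in topological order, or raise ValueError on a cycle."""
    successors = {v: [] for v in vertices}
    for u, w in edges:
        successors[u].append(w)

    UNSEEN, ON_STACK, DONE = 0, 1, 2
    state = {v: UNSEEN for v in vertices}
    popped = []

    def dfs(v):
        state[v] = ON_STACK
        for w in successors[v]:
            if state[w] == ON_STACK:        # back edge: w is an ancestor of v
                raise ValueError(f"cycle through {w}")
            if state[w] == UNSEEN:
                dfs(w)
        state[v] = DONE
        popped.append(v)                    # record the pop order

    for v in vertices:
        if state[v] == UNSEEN:
            dfs(v)
    return popped[::-1]                     # reverse of the pop order


V = ["c1", "c2", "c3", "c4", "c5"]
E = [("c1", "c3"), ("c2", "c3"), ("c3", "c4"), ("c3", "c5"), ("c4", "c5")]
print(topological_sort(V, E))  # prints ['c2', 'c1', 'c3', 'c4', 'c5']
```

The result lists c2 before c1, which is also a valid order: a digraph can have more than one topological sorting. Adding the edge `("c5", "c2")` makes the function raise `ValueError`.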

A nice decrease-by-one solution is to select vertices that have an in-degree of 0:

```
    repeat until V is empty
        v = any vertex in V with degreeIN = 0
        if none, then fail
        append v to the sorted list
        V = V - { v }
        E = E - { e | e = (v, vk) for any vk in V }
```

Will there always be such a vertex in a DAG? Yes. (Try to prove it by induction... It's actually quite fun!) And each time we remove such a vertex, the remaining DAG must have at least one, too.

... try it on the graphs above ...
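A direct, unoptimized Python sketch of this source-removal idea follows. It recomputes the in-degree-0 vertices on every pass, which keeps the code close to the pseudocode but is not the fastest way to do it. The function name and the choice to take the first available source are assumptions of the sketch.

```python
def source_removal_sort(vertices, edges):
    """Repeatedly remove a vertex of in-degree 0; raise ValueError if none exists."""
    remaining = list(vertices)
    edges = set(edges)
    order = []
    while remaining:
        targets = {w for _, w in edges}                 # vertices with in-degree > 0
        sources = [v for v in remaining if v not in targets]
        if not sources:
            raise ValueError("no vertex with in-degree 0: the graph has a cycle")
        v = sources[0]
        order.append(v)
        remaining.remove(v)                             # V = V - { v }
        edges = {(u, w) for u, w in edges if u != v}    # drop the edges out of v
    return order


V = ["c1", "c2", "c3", "c4", "c5"]
E = [("c1", "c3"), ("c2", "c3"), ("c3", "c4"), ("c3", "c5"), ("c4", "c5")]
print(source_removal_sort(V, E))  # prints ['c1', 'c2', 'c3', 'c4', 'c5']
```

With the extra edge `("c5", "c2")`, the loop removes c1 and then finds no vertex with in-degree 0, so it fails as the pseudocode says it should.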

Eugene Wallingford ..... wallingf@cs.uni.edu ..... April 2, 2014