## Session 6

### Heuristic Search

#### Artificial Intelligence

today's slides in PDF

### Recap: Using Knowledge of the Problem to Improve Search

BFS and DFS are uninformed search strategies. Uninformed search algorithms explore the states of a problem without knowing anything about the environment. These algorithms work in the same way for every problem, because they do not consider any features that are specific to the problem being solved.

Knowledge is power. Intelligent agents often use knowledge of a domain to make search very efficient -- or even to make search go away! Consider how you play tic-tac-toe, or how an expert chessplayer plays the opening phase of the game. In each case, knowledge of the problem can take a situation that requires search for a novice and replaces it with reflex or table look-up.

How can we use knowledge to improve search?

• To change the order in which we consider the nodes.
• To throw away nodes from the list of nodes to consider.

• To formulate the problem better.
• To select the best search algorithm.

The latter two are really external to the search process itself. We'll be more concerned here with the first two, which focus on how to the search itself better.

Last time, we began to consider a simple form of knowledge that can be used to improve search: path cost. In uninformed search, the "cost" of applying each operator is the same as the cost for any other. The only sense of cost we see is how many operator applications does it take to reach a state. "Old" states were ones that we reached in fewer steps and so had been in the list of states to consider longer; "new" states were ones that we reached in more steps.

But we know that not all operators are created equal. Some operators cost more money than others (flying versus driving). Some take more time than others (driving versus flying, for long distances). Some operators are longer in distance (Cedar Falls-to-Waverly versus Cedar Falls-to-Dubuque). If our search process takes into account the actual cost of operators, then it ought to be able to come up with better solutions...

Consider our simple map from last time... Knowledge of the cost of applying an operator is problem-specific, because the operators are problem-specific. So any search algorithm that uses these costs in its deliberation is called informed.

### Uniform Cost Search

In Uniform Cost Search (UCS), we use a strategy that sorts the list in ascending order on path cost. The path cost of a node n is usually called g(n).

This algorithm is a variant of BFS, because it tends to explore states closer to the initial state first. (Applying more operators usually means more total cost.) But UCS's definition of "close" is more practical, because it counts the real cost of a sequence of actions, rather than estimating the cost by the number of actions.

Generally, newly-expanded states will enter the queue near the back, with older states nearer the front. But a path of low real cost, even if it requires more steps, can leap-frog ahead of a path with fewer operators.

How does UCS stack up against our evaluation criteria from last time?

• completeness: guaranteed to find a solution if one exists
• time: generally expensive for the same reasons as BFS, but oftenan improvement
• space: generally expensive for the same reasons as BFS, but often an improvement
• optimality: only if "cost" of applying each operator > 0

The problem with UCS is that it is too much like BFS. We aren't using much domain knowledge, just sorting on better information already available.

### Greedy Search

Where to look next? Well, consider the task of search while the algorithm is running: If my goal is to be in Champaign, Illinois, then there are two costs associated with traveling to the Quad Cities, and two costs associated with traveling to Dubuque. The Quad Cities are farther away from Cedar Falls than is Dubuque, but they are closer to Champaign.

UCS uses g(n) to guide its search. g(n) measures the cost expended in getting from the initial state to the current state, the sunk cost of taking the path. (Remember, the agent doing search is thinking ahead at this point, so it hasn't actually expended the cost, just imagined what would happen if it did.)

What about looking at what seems to be a more important measure: what will it cost to get from the current state to the goal?

Greedy search orders the unexplored states in ascending order of their expected future path cost. Usually, we don't have a way to know the exact future cost. (In most cases, the knowledge needed to compute future cost would also compute the solution!) For this reason, greedy search uses a heuristic function, called h(n), to estimate future cost. h(n).

Hill climbing is usually a very effective search technique, but it falls victim to several kinds of problem: foothills, plateaus, and ridges. How can we solve them? (Randomize direction, step size... or jump randomly!)

Greedy search doesn't look like an uninformed method any more, since it uses information that is very problem-specific. But that information can be provided as an argument to the search algorithm, so the general-purpose algorithm still works.

Greedy approaches are often quite efficient, but they generally are not optimal:

• h(n) is an estimate, so it can be wrong.
• Greedy search considers only on future cost, but not past cost.

How does greedy search stack up against our evaluation criteria from last time?

• completeness: no, for the same reasons as DFS, but usually a big improvement
• time: quite efficient, for the same reasons as DFS
• space: quite efficient, for the same reasons as DFS
• optimality: no

### The A* Algorithm

Both UCS and GS can be improved. Improving them in the same direction leads to the same result: f(n) = g(n) + h(n). The result is A*.

A* sorts the list of nodes in ascending order on f(n) = g(n) + h(n). In effect, A* combines UCS (which is optimal and complete, but inefficient) with GS (neither optimal nor complete, but efficient). The result is an efficient algorithm that can find optimal solutions efficiently!

Of course, h(n) is usually an estimate, since it is often difficult to know the cost of a path to a goal until searching the whole space below the node. What kinds of estimates might we have?

• An overestimating h will make A* inefficient, perhaps even moreso than vanilla UCS, since it prevents us from extending good paths. Bad overestimates result in even more inefficient search.

• An underestimate is not so bad. If an underestimate points us down a bad path, the actual cost we discover by expanding the node will right the search. Bad underestimates are, of course, worse than close underestimates, but even still A* will right itself when it encounters a high g value.

Optimistic h(n) are usually denoted h'(n). When we use f(n) = g(n) + h'(n), we discover that A* is both optimal (UCS) and as quick as possible (GS)!

Notice that f increases at each step of the search. In terms of function maximization, you can think of A* as scaling a mountain, where f is the altitude. The reason that A* is so efficient is that it walks up the sharpest contour of the "function". Since f always takes you up, A* is complete.

A good h'(n) results in more elongated contours (a feature of DFS) toward the goal.

If h'(n) = 0 for all n, then A* == UCS. If h'(n) = h(n) for all n, then A* is perfect -- go straight to the goal!

### Finding Good Heuristics

So, if we take A* as our search algorithm of first choice, then the key to doing good search is finding good h'(n) functions. This is where the real domain knowledge comes in: What do we know about the problem domain that will allow us to evaluate the closeness of a state to a goal?

We would like h' to be as accurate as possible -- "Better is better" -- without overestimating future cost. (But, if you err, err on the side of badly underestimating...)

Consider the 8-puzzle. The typical solution has depth d = 20 and branching factor b = 3. An exhaustive search expands 3^20 = 3.5 * 10^9 states -- but there are only 9! = 362,880 unique states in the puzzle!

Two possible h'(n) are:

• h1 = number of tiles out of place. This is admissible, since each tile moves >= 1 time.
• h2 = sum of Manhattan distances for each. This, too, is admissible, since it assumes that the only moves we make with a tile put it into its home immediately.

One way to compare metrics is to compare the number of nodes expanded using each. We could run an experiment using random initial states and average the cost of each approach over these trials.

INSERT TABLE

It appears that h2 <= h1 at all depths, so h2 must be always better. In such a case, we say that h2 dominates h1.

But how do we invent candidate heuristics in the first place? One useful technique is to relax the constraints on operators and then imagine a solution. For the 8-puzzle, there is only one operator: A tile can move from a moves to b if adjacent(a, b) and b is empty. This results in at least three possible relaxations:

• a tile can move from slot a moves to slot b if adjacent(a, b) -- h2
• a tile can move from slot a moves to slot b if b is empty -- ??
• a tile can move from slot a moves to slot b -- h1

Does "h3" make any sense?

### Wrap Up

• Reading -- Read Chapter 5 in Ginsberg.

• Homework -- None.

Eugene Wallingford ==== wallingf@cs.uni.edu ==== September 20, 2001