Session 7

Adversarial Search


810:161

Artificial Intelligence




Recap: Search

Uninformed techniques use problem-independent information to determine the order in which the agent considers states.

Informed techniques use problem-dependent information to determine the order in which the agent considers states. They typically use domain knowledge in two ways:

We considered three informed search methods:


Recap: The A* Algorithm

A* sorts the list of nodes in ascending order on f(n) = g(n) + h(n).

In effect, A* combines UCS (which is optimal and complete, but inefficient) with GS (neither optimal nor complete, but efficient). The result is an algorithm that can find optimal solutions efficiently!

Optimistic h(n) functions underestimate the distance to the goal and are denoted h'(n). When we use f(n) = g(n) + h'(n), we discover that A* is both optimal (like UCS) and as quick as possible (like GS)! So, we "admit" as candidate heuristics only the h'(n) functions -- which is why they are called admissible.

If h'(n) = 0 for all n, then A* == UCS. If h'(n) = h(n) for all n, then A* is perfect -- go straight to the goal!
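
To make the recap concrete, here is one way A* might look in Python. This is only a sketch, not code from the session: the names a_star, neighbors, and h are illustrative, neighbors(state) is assumed to yield (successor, step-cost) pairs, and h is an admissible h'(n). The open list is kept sorted on f(n) = g(n) + h(n) with a heap; passing h'(n) = 0 makes it behave like UCS, as noted above.

    import heapq
    from itertools import count

    def a_star(start, goal, neighbors, h):
        """Sketch of A*: always expand the state with the smallest f = g + h."""
        tie = count()          # tie-breaker so the heap never has to compare states
        frontier = [(h(start), next(tie), 0, start, [start])]
        best_g = {start: 0}    # cheapest cost found so far to each state
        while frontier:
            _, _, g, state, path = heapq.heappop(frontier)
            if state == goal:
                return path    # with an admissible h, the first goal popped is optimal
            for succ, cost in neighbors(state):
                g2 = g + cost
                if g2 < best_g.get(succ, float("inf")):
                    best_g[succ] = g2
                    heapq.heappush(frontier,
                                   (g2 + h(succ), next(tie), g2, succ, path + [succ]))
        return None            # the goal is unreachable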


An Example with Heuristic Functions

So, if we take A* as our search algorithm of first choice, then the key to doing good search is finding good h'(n) functions. This is where the real domain knowledge comes in: What do we know about the problem domain that will allow us to evaluate the closeness of a state to a goal?

We would like h' to be as accurate as possible -- "Better is better" -- without overestimating future cost. (But, if you err, err on the side of badly underestimating...)

Consider the 8-puzzle. The typical solution has depth d = 20 and branching factor b = 3. An exhaustive search expands 3^20 = 3.5 * 10^9 states -- but there are only 9! = 362,880 unique states in the puzzle!

Two possible h'(n) are:

  h1(n) = the number of tiles that are out of place.

  h2(n) = the sum of the distances of each tile from its goal square (the "Manhattan distance").

Both are admissible, because they both underestimate the number of moves required in a given position.

Actually, h1 <= h2 at every node, so h2 is always at least as accurate. In such a case, we say that h2 dominates h1.
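
As a small illustration (again a sketch, not from the notes), here is how h1 and h2 might be computed for a board stored as a tuple of nine numbers, with 0 standing for the blank; the GOAL layout is just one common convention:

    GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)    # 0 marks the blank square

    def h1(state):
        """Number of tiles out of place (the blank does not count)."""
        return sum(1 for i, tile in enumerate(state)
                   if tile != 0 and tile != GOAL[i])

    def h2(state):
        """Sum of each tile's Manhattan distance from its goal square."""
        total = 0
        for i, tile in enumerate(state):
            if tile == 0:
                continue
            j = GOAL.index(tile)          # the square where this tile belongs
            total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
        return total

Every misplaced tile adds exactly 1 to h1 but at least 1 to h2, which is one way to see the domination.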

But how do we invent candidate heuristics in the first place? One useful technique is to relax the constraints on operators and then imagine a solution. For the 8-puzzle, there is only one operator: A tile can move from square a to square b if adjacent(a, b) and b is empty. This results in at least three possible relaxations:

  1. A tile can move from square a to square b if adjacent(a, b).

  2. A tile can move from square a to square b if b is empty.

  3. A tile can move from square a to square b.

Relaxation 1 suggests h2, and relaxation 3 suggests h1. Call the heuristic suggested by relaxation 2 "h3".

Does "h3" make any sense?


Problems with Search in the Real World

How can these search techniques "break"?

  1. Even as fast as it is, A* is not fast enough. The search tree for chess contains approximately 10^160 nodes, and chess is simple compared to most tasks that an intelligent agent faces.

  2. An agent cannot evaluate every node in a search tree from its own perspective. Sometimes, other agents affect the state of the world.

How do humans address these problems?

  1. We don't consider the whole state space.

    When Emanuel Lasker (world chess champion from 1894 to 1920) was asked how many moves he considered when analyzing a chess position, he replied, "Only one. But it's always the best move."

    Humans filter the search space, eliminating most of the bad moves -- and, occasionally, some of the good ones! This is called pruning.

  2. We take into account our opponent's different perspective.

    "If I go here, then she'll go there. But then I can do this, but she'll do that. ..."

    Humans evaluate positions from the perspective of the player who is "on the move".

Let's look at a way to solve the second problem first, and then at a way to solve the first problem second.


Search in an Adversarial World

We'll use simple games as our examples, but being in an "adversarial world" and competing with other agents is a natural part of many complex environments. Ford is competing with General Motors, and Microsoft is competing with, well, everybody. Games give us a nice way to consider some of the important issues without all of the complexities of real-world competition.

Our tree of possible futures now has a distinctly new flavor: in a game, players take turns. So we need to evaluate levels of the tree differently, depending on whose turn it is.

Consider how we might think about playing tic-tac-toe as search:

[ figure: a selective lookahead example in tic-tac-toe ]

Let's call the player who goes first "max". max moves with an "x". Let's call the player who goes second "min". min moves with an "o".

Notice how I've used knowledge of tic-tac-toe to reduce the search space. I've taken advantage of rotational and reflective symmetries to cut the branching factor by 67%, 75%, and 29%, respectively.

Let's suppose that I am max. Clearly, in order to choose a move in this search tree, I need to take into account that min gets to choose the second move. I can't decide that I like the position in the lower righthand corner best and so will make moves to force it there, because min will make the choice among the five possibilities after my first move into a side slot. My heuristic function needs to take into account what I expect min to do in that position--and I surely shouldn't count on her to make a move that will help me out!

So, in adversarial search, we need to evaluate positions on alternating levels of the search tree from alternating perspectives.

How can I choose the move I think is "best"? I could search all the way to the leaves of the tree...

Consider this generic search tree:

[ figure: a simple minimax tree ]

If I search to the "leaves" of the tree, then I will find states that are either wins for max (labeled 1), wins for min (labeled 0), or perhaps ties (labeled 1/2).

Then we can apply the simple idea of alternating evaluation from the perspectives of max and min to determine the value of any state:

  1. Label the leaves of the tree as wins (1), losses (0), or ties (1/2).

  2. Pick an unlabeled state whose children are all labeled.

  3. If the state is on a max level, label it with the maximum of its children's values; otherwise, label it with the minimum of its children's values.

  4. Go to 2.

This algorithm is called "minimax". Now you can see why we call the players what we do: the names tell us the players' roles in the algorithm.
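
Steps 1 through 4 fold naturally into a recursion. Here is a compact sketch (mine, not from the session; the game itself is passed in as functions, since the notes leave the representation abstract):

    def minimax(state, is_max, is_leaf, value, children):
        """Value of 'state': 1 = win for max, 0 = win for min, 1/2 = tie.
        is_leaf, value, and children are assumed to describe the game."""
        if is_leaf(state):
            return value(state)                       # step 1: label the leaves
        vals = [minimax(child, not is_max, is_leaf, value, children)
                for child in children(state)]
        return max(vals) if is_max else min(vals)     # step 3: max level or min level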


Improving Minimax 1.0

In this simplest form, our algorithm has some problems:

  1. It requires us to generate the entire game tree, all the way down to the leaves, before we can label even one interior state.

  2. It requires us to keep the entire tree in memory while we label states from the bottom up, which costs an enormous amount of space.

We know that depth-first search generally reduces time and space costs, so... A more efficient minimax algorithm:

1. Set L to be a list containing the initial state.

2. Forever:

   a. Let x = first item on L.
      If it is the initial state and has a value, stop.

   b. If x has a value, then
         Retrieve x's parent, p.
         If p is a max state, then set p's value to max(p's value, x's value).
         If p is a min state, then set p's value to min(p's value, x's value).
         Remove x from L.

      Else if x is a leaf, then
         Set x's value to the value of the position.

      Else if x is an interior state,
         Set x's value either to -infinity, if it's a max state,
                           or to +infinity, if it's a min state.
         Add x's children to the front of the list.

This does a depth-first search and drastically cuts the amount of space required by Minimax.
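
For the curious, here is one way the list-based pseudocode might be rendered in Python (a sketch under the same assumptions as before: the four game functions are supplied by the caller, and states are distinct, hashable tree nodes):

    from math import inf

    def minimax_dfs(root, is_max_level, is_leaf, value, children):
        L = [root]                                # 1. L holds the initial state
        val, parent = {}, {}
        while True:                               # 2. forever:
            x = L[0]                              # 2a. x = first item on L
            if x is root and x in val:
                return val[x]                     #     the root's value is final
            if x in val:                          # 2b. x's subtree is finished
                p = parent[x]
                combine = max if is_max_level(p) else min
                val[p] = combine(val[p], val[x])
                L.pop(0)                          #     remove x from L
            elif is_leaf(x):
                val[x] = value(x)                 #     score the position
            else:                                 #     interior state: expand it
                val[x] = -inf if is_max_level(x) else inf
                kids = list(children(x))
                for child in kids:
                    parent[child] = x
                L[:0] = kids                      #     children go to the front of L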

What are some of the costs of the new algorithm?


Improving Minimax 2.0

Our new algorithm uses much less space, by doing a depth-first search of the state space. But it still has a major problem: it still examines every state in the tree.

We simply can't afford to consider every state in a large search space... Let's borrow a solution from A*:

[ Could we use a heuristic function from A* as a static evaluation function? ]

Here is a new Minimax algorithm that uses the idea of static evaluation:

1. Set L to be a list containing the initial state.

2. Forever:

   a. Let x = first item on L.
      If it is the initial state and has a value, stop.

   b. If x has a value, then
         Retrieve x's parent, p.
         If p is a max state, then set p's value to max(p's value, x's value).
         If p is a min state, then set p's value to min(p's value, x's value).
         Remove x from L.

      Else if x is a leaf or is at level d, then
         Set x's value to the value of the position, computed as e(x).

      Else if x is an interior state,
         Set x's value either to -infinity, if it's a max state,
                           or to +infinity, if it's a min state.
         Add x's children to the front of the list.
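
Recursively, the only change from plain minimax is the cutoff test and the static evaluator (again a sketch; e plays the role of the static evaluation function, and is_leaf and children are assumed as before):

    def minimax_d(state, is_max, d, e, is_leaf, children):
        """Depth-limited minimax: stop at a true leaf or at depth d, apply e there."""
        if is_leaf(state) or d == 0:
            return e(state)                       # static evaluation at the cutoff
        vals = [minimax_d(child, not is_max, d - 1, e, is_leaf, children)
                for child in children(state)]
        return max(vals) if is_max else min(vals)

Shrinking d makes the search faster but the values less reliable -- exactly the trade-off discussed next.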


Evaluating Minimax 3.0

This version of Minimax allows the agent to cut drastically the amount of time required for search. More to the point, it gives the agent complete control over how much time it spends on search, by adjusting the value of d.

Setting d arbitrarily has disadvantages, too, though:


Adversarial Search

Play Chinook, the best checkers player in the world!


Wrap Up


Eugene Wallingford ==== wallingf@cs.uni.edu ==== September 26, 2001