Here is a new Minimax algorithm that uses the idea of static evaluation:
1. Set L to be a list containing the initial state.
2. Forever:
   a. Let x = the first item on L. If it is the initial state and has a value, stop.
   b. If x has a value, then:
      Retrieve x's parent, p.
      If p is a max state, then set p's value to max(p's value, x's value).
      If p is a min state, then set p's value to min(p's value, x's value).
      Remove x from L.
   Else if x is a leaf or is at level d, then:
      Set x's value to the value of the position, computed by e(x).
   Else (x is an interior state):
      Set x's value either to -infinity, if it's a max state, or to +infinity, if it's a min state.
      Add x's children to the front of the list.
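The same computation is easier to see in recursive form. Here is a minimal sketch of depth-limited Minimax with static evaluation; the `e`, `children`, and `is_leaf` parameters are hypothetical stand-ins for a real game interface, not part of the algorithm above.

```python
def minimax(x, depth, is_max, e, children, is_leaf):
    """Return the minimax value of state x, cutting off at `depth`.

    e(x)        -- static evaluation function (assumed interface)
    children(x) -- successor states of x (assumed interface)
    is_leaf(x)  -- true if x has no successors (assumed interface)
    """
    if is_leaf(x) or depth == 0:
        return e(x)                      # static evaluation at the frontier
    if is_max:
        value = float('-inf')            # max state starts at -infinity
        for child in children(x):
            value = max(value, minimax(child, depth - 1, False,
                                       e, children, is_leaf))
    else:
        value = float('inf')             # min state starts at +infinity
        for child in children(x):
            value = min(value, minimax(child, depth - 1, True,
                                       e, children, is_leaf))
    return value
```

For example, on a two-level tree where the root is a max state with leaf children valued 3 and 5, the root's value is 5; as a min state, it would be 3.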
Minimax v3.0 reflects the following ideas:
What problems can arise?
Design a static evaluation function e(x) for states in the game of Tic-Tac-Toe.
You may assume that each state contains information about what is in each cell of the game board and about whose move it is.
Your goal: to design a function that would outperform any other group's function in head-to-head competition!
Results
Most of you tried to design functions that favored occupying the center square and tried to favor possible ways to win. A simple metric would be to return (# of possible wins for X remaining) - (# of possible wins for O remaining) ... which would tend to favor whoever occupies the center.
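That simple metric can be sketched in a few lines. The board representation below (a 9-element list holding 'X', 'O', or None) is an assumption for illustration, not from the notes.

```python
# The eight winning lines of Tic-Tac-Toe, as index triples into a
# 9-element board list (assumed representation).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def e_simple(board):
    """(# of possible wins for X remaining) - (# for O remaining).

    A line is still a possible win for X if it contains no O, and
    vice versa.
    """
    x_open = sum(1 for line in LINES
                 if all(board[i] != 'O' for i in line))
    o_open = sum(1 for line in LINES
                 if all(board[i] != 'X' for i in line))
    return x_open - o_open
```

Note how this favors the center: an empty board scores 0, but X alone in the center scores 4, since the center blocks four of O's eight lines, while X alone in a corner scores only 3.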
A function I have seen used in a couple of programs is:
(3X2 + X1) - (3O2 + O1)
where Xn denotes the number of lines containing n Xs and nothing else.
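This weighted function can be sketched as follows, using the same assumed 9-cell board representation (the line table is repeated here to keep the sketch self-contained):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def count_lines(board, player, n):
    """Number of lines containing exactly n of `player`'s marks and
    nothing else (i.e., no opponent marks)."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if [board[i] for i in line].count(player) == n
               and all(board[i] != opponent for i in line))

def e_weighted(board):
    """(3*X2 + X1) - (3*O2 + O1)."""
    return (3 * count_lines(board, 'X', 2) + count_lines(board, 'X', 1)) \
         - (3 * count_lines(board, 'O', 2) + count_lines(board, 'O', 1))
```

Weighting two-in-a-row lines three times as heavily as single marks rewards concrete threats, not just occupied territory.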
Problem 1
When the values assigned to a state and its children by e(x) are very different, that is,
abs( e(x) - e(x's parent) ) >> 0
the evaluation function is less likely to be giving an accurate picture of the state.
Solution: Search to "quiescence". Selectively extend noisy paths until they reach quiescence, at which point you can trust the static evaluation function to be more accurate.
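One way to sketch this idea: at the depth cutoff, keep expanding a state as long as it is "noisy", i.e., its static value differs sharply from its parent's. All names here (`e`, `children`, `is_leaf`, `NOISE_THRESHOLD`) are hypothetical, and the noisiness test is just one possible choice.

```python
NOISE_THRESHOLD = 5   # assumed: a value swing above this counts as noisy

def quiesce(x, is_max, e, children, is_leaf, parent_value=None):
    """Value of state x at the depth cutoff, extended to quiescence.

    If e(x) is close to the parent's value (or x has no parent value
    to compare against), the position is quiet and we trust e(x).
    Otherwise we selectively extend the search below x.
    """
    value = e(x)
    noisy = (parent_value is not None
             and abs(value - parent_value) > NOISE_THRESHOLD)
    if is_leaf(x) or not noisy:
        return value                      # quiet: trust the static evaluation
    best = max if is_max else min
    return best(quiesce(c, not is_max, e, children, is_leaf, value)
                for c in children(x))
```

So a cutoff state whose value roughly matches its parent's is evaluated directly, while one whose value has just jumped gets its children examined before we commit to a number.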
Problem 2
Sometimes, the value of a state in the world is about to change drastically -- but at level (d + 1).
This problem is called the "horizon effect".
You could try to do a quick search from the one or two states that look like the best choices, to see if a worse state is coming soon. But that won't help in the cases where something good lies just beyond the horizon on a bad path.
Unfortunately, we have not found a general solution to the problem of the horizon effect.
The previous problems both occur when Minimax hasn't searched enough. But, sometimes, Minimax does too much work -- it looks at too many states.
Looking at all of the states at levels <= d is sometimes a waste of time. A program would be better served looking deeper into the tree, but only at good moves. But how?
Consider this situation in Minimax:
We don't need to consider state c to know the value of A.
Or consider this situation:
We don't need to consider state E to know the value of A.
Or consider this situation:
We don't need to consider state I!
These ideas are the motivation for alpha-beta pruning, a modification to Minimax that eliminates moves that we know aren't legitimate competitors.
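Here is a minimal sketch of Minimax with alpha-beta pruning; as in the earlier sketch, `e`, `children`, and `is_leaf` are hypothetical stand-ins for a real game interface. Alpha tracks the best value the maximizer can already guarantee, beta the best the minimizer can; when alpha >= beta, the remaining siblings cannot change the outcome.

```python
def alphabeta(x, depth, alpha, beta, is_max, e, children, is_leaf):
    """Minimax value of x, skipping subtrees that cannot affect the result."""
    if is_leaf(x) or depth == 0:
        return e(x)
    if is_max:
        value = float('-inf')
        for child in children(x):
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, e, children, is_leaf))
            alpha = max(alpha, value)
            if alpha >= beta:   # a min ancestor already has a better option
                break           # prune: remaining children can't matter
        return value
    else:
        value = float('inf')
        for child in children(x):
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, e, children, is_leaf))
            beta = min(beta, value)
            if alpha >= beta:   # a max ancestor already has a better option
                break
        return value
```

For example, if a max root's first min child evaluates to 3, and the second min child's first leaf is 2, the second child's remaining leaves are never examined: the minimizer there can already force a value of at most 2, which the maximizer would never prefer to 3.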
As always, we need to keep in mind the assumptions that underlie our techniques. What are they? When do they apply?
When, if ever, is the assumption violated?
Could we modify Minimax to handle a 3-player game?
When, if ever, is the assumption violated?
Could we modify Minimax to handle a world in which an agent has access only to partial information?
Consider the following position:
Or the following position:
How could we take into account such situations in a Minimax search?
What role can "book" moves play in improving search performance?
That brings us back to an issue that has peeked around the corner at us many times in our discussion of search: the trade-off between "base-level" reasoning and "meta-level" reasoning...
chess
checkers
Othello
backgammon
Go
bridge
You can play Chinook on-line. Chinook is the best checkers player in the world today, and probably no worse than the second-best player of all-time, including all humans.