Session 8

Knowledge in Adversarial Search


810:161

Artificial Intelligence


Recap: Minimax

Here is a new Minimax algorithm that uses the idea of static evaluation:

1. Set L to be a list containing the initial state.

2. Forever:

   a. Let x = first item on L.
      If it is the initial state and has a value, stop.

   b. If x has a value, then
         Retrieve x's parent, p.
         If p is a max state, then set p's value to max(p's value, x's value).
         If p is a min state, then set p's value to min(p's value, x's value).
         Remove x from L.

      Else if x is a leaf or is at level d, then
         Set x's value to the value of the position, computed by e(x).

      Else if x is an interior state,
         Set x's value either to -infinity, if it's a max state,
                           or to +infinity, if it's a min state.
         Add x's children to the front of the list.

Minimax v3.0 reflects the following ideas:

  1. Label each internal state as a max or min state according to whose move it is, and give it a value based on the values of its children: the maximum at a max state, the minimum at a min state.

  2. Use depth-first search (to limit the space used).

  3. Stop search at some depth, d, and use a static evaluation function, e(x), to assign values to those states as if they were leaves (to limit the time used).
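
Rendered recursively, v3.0 is short. Here is a minimal Python sketch of the same computation, assuming hypothetical helpers is_leaf(x) and children(x) and the static evaluator e(x):

    def minimax(x, depth, maximizing):
        """Value of state x, searching at most `depth` more plies."""
        if depth == 0 or is_leaf(x):
            return e(x)                    # treat cut-off states as leaves
        if maximizing:                     # max state: best child for max
            return max(minimax(c, depth - 1, False) for c in children(x))
        else:                              # min state: best child for min
            return min(minimax(c, depth - 1, True) for c in children(x))

The explicit list L in the pseudocode above is just this recursion's call stack made visible; both keep only the current path (plus its siblings) in memory.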

    What problems can arise?


    A Group Exercise

    Design a static evaluation function e(x) for states in the game of Tic-Tac-Toe.

    You may assume that each state contains information about what is in each cell of the game board and about whose move it is.

    Your goal: to design a function that would outperform any other group's function in head-to-head competition!

    Results

    Most of you designed functions that favored occupying the center square and rewarded states with more possible ways to win. A simple metric would be to return (# of possible wins remaining for X) - (# of possible wins remaining for O), which tends to favor whoever occupies the center.

    A function I have seen used in a couple of programs is:

    (3X2 + X1) - (3O2 + O1)

    where Xn denotes the number of lines containing n Xs and nothing else, and On likewise for Os.
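
    As a concrete sketch, here is that function in Python, assuming a board is represented as a 9-element list containing 'X', 'O', or None. Won lines (three in a row) are left to the terminal-state test, not to e:

        # A sketch of the (3*X2 + X1) - (3*O2 + O1) evaluator.
        LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),     # rows
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),     # columns
                 (0, 4, 8), (2, 4, 6)]                # diagonals

        def e(board):
            score = 0
            for line in LINES:
                cells = [board[i] for i in line]
                xs, os = cells.count('X'), cells.count('O')
                if os == 0 and 1 <= xs <= 2:      # line open only for X
                    score += 3 if xs == 2 else 1
                elif xs == 0 and 1 <= os <= 2:    # line open only for O
                    score -= 3 if os == 2 else 1
            return score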


    Problems with Minimax v3.0

    Problem 1

    When the values that e assigns to a state and its parent are very different, that is, when

    abs( e(x) - e(x's parent) ) >> 0

    the evaluation function is unlikely to be giving an accurate picture of either state.

    [example of a non-quiescent situation]

    Solution: Search to "quiescence". Selectively extend noisy paths until they reach quiescence, at which point you can trust the static evaluation function to be more accurate.
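
    A sketch of that idea in Python, assuming a hypothetical predicate is_quiescent(x) (say, "no capture is pending") and a cap on how far we are willing to extend:

        def quiesce(x, maximizing, extension=4):
            """Apply e only at quiet states; otherwise keep searching."""
            if extension == 0 or is_leaf(x) or is_quiescent(x):
                return e(x)               # static value is trustworthy here
            if maximizing:                # noisy max state: extend the search
                return max(quiesce(c, False, extension - 1)
                           for c in children(x))
            return min(quiesce(c, True, extension - 1)
                       for c in children(x))

    Minimax v3.0 would call quiesce(x, ...) in place of e(x) at depth d; chess programs typically restrict these extensions to noisy moves such as captures.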

    Problem 2

    Sometimes, the value of a state is about to change drastically -- but the change occurs at level (d + 1), just beyond the depth of the search.

    [example of the horizon effect]

    This problem is called the "horizon effect".

    You could try to do a quick search from the one or two states that look like the best choices, to see if a worse state is coming soon. But that won't help in cases where something good lies just beyond the horizon on a bad path.
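
    One hedged sketch of that quick check, reusing the minimax function above with a hypothetical probe depth:

        def looks_safe(best_child, probe_depth=3):
            """Probe the apparent best move a few plies deeper,
               looking for a nasty surprise just over the horizon."""
            deeper = minimax(best_child, probe_depth, maximizing=False)
            return deeper >= e(best_child)    # did the value hold up?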

    Unfortunately, we have not found a general solution to the problem of the horizon effect.


    Pruning the Search Space

    The previous problems both occur when Minimax hasn't searched enough. But sometimes Minimax does too much work -- it looks at too many states.

    Looking at all of the states at levels <= d is sometimes a waste of time. A program would be better served by looking deeper into the tree, but only along good moves. But how?

    Consider this situation in Minimax:

    [example of pruning at level 2]

    We don't need to consider state C to know the value of A.

    Or consider this situation:

    [example of pruning at level 3]

    We don't need to consider state E to know the value of A.

    Or consider this situation:

    [example of pruning at level 5!]

    We don't need to consider state I!

    These ideas are the motivation for alpha-beta pruning, a modification to Minimax that eliminates moves we can already prove are not legitimate competitors for the best value.
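
    Here is a minimal alpha-beta sketch in Python, using the same assumed helpers as before; alpha is the best value max can guarantee so far along the current path, and beta is the best value min can guarantee:

        def alphabeta(x, depth, alpha, beta, maximizing):
            if depth == 0 or is_leaf(x):
                return e(x)
            if maximizing:
                value = float('-inf')
                for c in children(x):
                    value = max(value, alphabeta(c, depth - 1,
                                                 alpha, beta, False))
                    alpha = max(alpha, value)
                    if alpha >= beta:     # min would never allow this line,
                        break             # so prune the remaining children
                return value
            value = float('inf')
            for c in children(x):
                value = min(value, alphabeta(c, depth - 1,
                                             alpha, beta, True))
                beta = min(beta, value)
                if alpha >= beta:         # max already has something better
                    break
            return value

    The initial call is alphabeta(initial_state, d, float('-inf'), float('inf'), True). With good move ordering, it computes the same value as plain Minimax while examining far fewer states.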


    Questioning Assumptions

    As always, we need to keep in mind the assumptions that underlie our techniques. What are they? When do they apply?

    1. Minimax, with or without alpha-beta pruning, assumes a 2-player game with alternating moves.

      When, if ever, is the assumption violated?

      Could we modify Minimax to handle a 3-player game?

    2. Minimax, with or without alpha-beta pruning, assumes a world with perfect information.

      When, if ever, is the assumption violated?

      Could we modify Minimax to handle a world in which an agent has access only to partial information?

    3. Minimax, with or without alpha-beta pruning, assumes optimal play by the opponent.

      Consider the following position: [example: Minimax returns a move that is barely better but offers fewer opportunities to succeed]

      Or the following position: [example: Minimax returns a move that draws when a move with winning chances exists]

      How could we take into account such situations in a Minimax search?

    What role can "book" moves play in improving search performance?

    That brings us back to an issue that has peeked around the corner at us many times in our discussion of search: the trade-off between "base-level" reasoning and "meta-level" reasoning...


    Some History of Game-Playing Programs

    chess
    checkers
    Othello
    backgammon
    Go
    bridge

    You can play Chinook on-line. Chinook is the best checkers player in the world today, and probably no worse than the second-best player of all-time, including all humans.


    Wrap Up


    Eugene Wallingford ==== wallingf@cs.uni.edu ==== September 27, 2001