Here is a new Minimax algorithm that uses the idea of static evaluation:
1. Set L to be a list containing the initial state.
2. Forever:
   a. Let x = the first item on L. If it is the initial state and has a value, stop.
   b. If x has a value, then:
      Retrieve x's parent, p.
      If p is a max state, then set p's value to max(p's value, x's value).
      If p is a min state, then set p's value to min(p's value, x's value).
      Remove x from L.
   Else if x is a leaf or is at level d, then:
      Set x's value to the value of the position, computed by e(x).
   Else (x is an interior state):
      Set x's value either to -infinity, if it's a max state, or to +infinity, if it's a min state.
      Add x's children to the front of the list.
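The same computation is easier to see in recursive form. Here is a minimal sketch of depth-limited Minimax with static evaluation; the `e`, `children`, and `is_leaf` parameters are hypothetical stand-ins for a real game interface, not part of the algorithm above.

```python
def minimax(x, depth, is_max, e, children, is_leaf):
    """Return the minimax value of state x, cutting off at `depth`.

    e(x)        -- static evaluation function (assumed interface)
    children(x) -- successor states of x (assumed interface)
    is_leaf(x)  -- true if x has no successors (assumed interface)
    """
    if is_leaf(x) or depth == 0:
        return e(x)                      # static evaluation at the frontier
    if is_max:
        value = float('-inf')            # max state starts at -infinity
        for child in children(x):
            value = max(value, minimax(child, depth - 1, False,
                                       e, children, is_leaf))
    else:
        value = float('inf')             # min state starts at +infinity
        for child in children(x):
            value = min(value, minimax(child, depth - 1, True,
                                       e, children, is_leaf))
    return value
```

For example, on a two-level tree where the root is a max state with leaf children valued 3 and 5, the root's value is 5; as a min state, it would be 3.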
Minimax v3.0 reflects the following ideas:
What problems can arise?
Design a static evaluation function e(x) for states in the game of Tic-Tac-Toe.
You may assume that each state contains information about what is in each cell of the game board and about whose move it is.
Your goal: to design a function that would outperform any other group's function in head-to-head competition!
Results
Most of you tried to design functions that favored occupying the center square and tried to favor possible ways to win. A simple metric would be to return (# of possible wins for X remaining) - (# of possible wins for O remaining) ... which would tend to favor whoever occupies the center.
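That simple metric can be sketched in a few lines. The board representation below (a 9-element list holding 'X', 'O', or None) is an assumption for illustration, not from the notes.

```python
# The eight winning lines of Tic-Tac-Toe, as index triples into a
# 9-element board list (assumed representation).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def e_simple(board):
    """(# of possible wins for X remaining) - (# for O remaining).

    A line is still a possible win for X if it contains no O, and
    vice versa.
    """
    x_open = sum(1 for line in LINES
                 if all(board[i] != 'O' for i in line))
    o_open = sum(1 for line in LINES
                 if all(board[i] != 'X' for i in line))
    return x_open - o_open
```

Note how this favors the center: an empty board scores 0, but X alone in the center scores 4, since the center blocks four of O's eight lines, while X alone in a corner scores only 3.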
A function I have seen used in a couple of programs is:
(3X2 + X1) - (3O2 + O1)
where Xn denotes the number of lines containing n Xs and nothing else.
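This weighted function can be sketched as follows, using the same assumed 9-cell board representation (the line table is repeated here to keep the sketch self-contained):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def count_lines(board, player, n):
    """Number of lines containing exactly n of `player`'s marks and
    nothing else (i.e., no opponent marks)."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if [board[i] for i in line].count(player) == n
               and all(board[i] != opponent for i in line))

def e_weighted(board):
    """(3*X2 + X1) - (3*O2 + O1)."""
    return (3 * count_lines(board, 'X', 2) + count_lines(board, 'X', 1)) \
         - (3 * count_lines(board, 'O', 2) + count_lines(board, 'O', 1))
```

Weighting two-in-a-row lines three times as heavily as single marks rewards concrete threats, not just occupied territory.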
Problem 1
When the values assigned to a state and its children by e(x) are very different, that is,
abs( e(x) - e(x's parent) ) >> 0
the evaluation function is less likely to be giving an accurate picture of the state.
Solution: Search to "quiescence". Selectively extend noisy paths until they reach quiescence, at which point you can trust the static evaluation function to be more accurate.
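One way to sketch this idea: at the depth cutoff, keep expanding a state as long as it is "noisy", i.e., its static value differs sharply from its parent's. All names here (`e`, `children`, `is_leaf`, `NOISE_THRESHOLD`) are hypothetical, and the noisiness test is just one possible choice.

```python
NOISE_THRESHOLD = 5   # assumed: a value swing above this counts as noisy

def quiesce(x, is_max, e, children, is_leaf, parent_value=None):
    """Value of state x at the depth cutoff, extended to quiescence.

    If e(x) is close to the parent's value (or x has no parent value
    to compare against), the position is quiet and we trust e(x).
    Otherwise we selectively extend the search below x.
    """
    value = e(x)
    noisy = (parent_value is not None
             and abs(value - parent_value) > NOISE_THRESHOLD)
    if is_leaf(x) or not noisy:
        return value                      # quiet: trust the static evaluation
    best = max if is_max else min
    return best(quiesce(c, not is_max, e, children, is_leaf, value)
                for c in children(x))
```

So a cutoff state whose value roughly matches its parent's is evaluated directly, while one whose value has just jumped gets its children examined before we commit to a number.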
Problem 2
Sometimes, the value of a state in the world is about to change drastically -- but at level (d + 1).
This problem is called the "horizon effect".
You could try to do a quick search from the one or two states that look like the best choices, to see if a worse state is coming soon. But that won't help in the cases where something good lies just beyond the horizon on a bad path.
Unfortunately, we have not found a general solution to the problem of the horizon effect.
The previous problems both occur when Minimax hasn't searched enough. But, sometimes, Minimax does too much work -- it looks at too many states.
Looking at all of the states at levels <= d is sometimes a waste of time. A program would be better served looking deeper into the tree, but only at good moves. But how?
Consider this situation in Minimax:
We don't need to consider state c to know the value of A.
Or consider this situation:
We don't need to consider state E to know the value of A.
Or consider this situation:
We don't need to consider state I!
These ideas are the motivation for alpha-beta pruning, a modification to Minimax that eliminates moves that we know aren't legitimate competitors.
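Here is a minimal sketch of Minimax with alpha-beta pruning; as in the earlier sketch, `e`, `children`, and `is_leaf` are hypothetical stand-ins for a real game interface. Alpha tracks the best value the maximizer can already guarantee, beta the best the minimizer can; when alpha >= beta, the remaining siblings cannot change the outcome.

```python
def alphabeta(x, depth, alpha, beta, is_max, e, children, is_leaf):
    """Minimax value of x, skipping subtrees that cannot affect the result."""
    if is_leaf(x) or depth == 0:
        return e(x)
    if is_max:
        value = float('-inf')
        for child in children(x):
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, e, children, is_leaf))
            alpha = max(alpha, value)
            if alpha >= beta:   # a min ancestor already has a better option
                break           # prune: remaining children can't matter
        return value
    else:
        value = float('inf')
        for child in children(x):
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, e, children, is_leaf))
            beta = min(beta, value)
            if alpha >= beta:   # a max ancestor already has a better option
                break
        return value
```

For example, if a max root's first min child evaluates to 3, and the second min child's first leaf is 2, the second child's remaining leaves are never examined: the minimizer there can already force a value of at most 2, which the maximizer would never prefer to 3.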
As always, we need to keep in mind the assumptions that underlie our techniques. What are they? When do they apply?
When, if ever, is the assumption violated?
Could we modify Minimax to handle a 3-player game?
When, if ever, is the assumption violated?
Could we modify Minimax to handle a world in which an agent has access only to partial information?
Consider the following position:
Or the following position:
How could we take into account such situations in a Minimax search?
What role can "book" moves play in improving search performance?
That brings us back to an issue that has peeked around the corner at us many times in our discussion of search: the trade-off between "base-level" reasoning and "meta-level" reasoning...
chess
checkers
Othello
backgammon
Go
bridge
You can play Chinook on-line. Chinook is the best checkers player in the world today, and probably no worse than the second-best player of all-time, including all humans.