Transcript Document

Problem Solving Using Search
Reduce a problem to one of searching a
graph.
View problem solving as a process of
moving through a sequence of problem
states to reach a goal state
Move from one state to another by taking
an action
A sequence of actions and states leading
to a goal state is a solution to the problem.
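A minimal sketch of this view (in Python; the three-state space,
action names, and goal below are invented for illustration):

    # Hypothetical search problem: states, actions, and a goal test.
    initial_state = "A"
    actions = {                       # state -> list of (action, resulting state)
        "A": [("go-to-B", "B")],
        "B": [("go-to-C", "C")],
        "C": [],
    }
    def is_goal(state):
        return state == "C"

    # One solution: the action sequence ["go-to-B", "go-to-C"],
    # which moves through the state sequence A -> B -> C.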
Trees
A tree is made up of nodes and links connected so that there are no
loops (cycles).
Nodes are sometimes called vertices.
Links are sometimes called edges.
A tree has a root node.
Where the tree "starts".
Every node except the root has a single parent (aka direct ancestor).
An ancestor node is a node that can be reached by repeatedly going
to a parent.
Each node (except a terminal, aka leaf) has one or more children
(aka direct descendants).
A descendant node is a node that can be reached by repeatedly
going to a child.
Graphs
Set of nodes connected by links.
But, unlike trees, loops are allowed.
Also, unlike trees, multiple parents are allowed.
Two kinds of graphs:
Directed graphs.
Links have a direction.
Undirected graphs.
Links have no direction.
A tree is a special case of a graph.
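As an illustration (node names invented), both kinds of graphs, and
trees as a special case, can be stored as adjacency lists:

    # Adjacency-list sketch (Python); node names are illustrative.
    directed   = {"A": ["B", "C"], "B": ["C"], "C": []}     # links have a direction
    undirected = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}  # each link listed both ways
    # A tree is the special case with no cycles and one parent per non-root node.
    tree = {"root": ["left", "right"], "left": [], "right": []}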
Representing Problems with
Graphs
Nodes represent cities that are connected by
direct flight.
Find route from city A to city B that involves the
fewest hops.
Nodes represent a state of the world.
Which blocks are on top of what in a blocks scene.
The links represent actions that result in a change
from one state to the other.
A path through the graph represents a plan of action.
A sequence of steps that tell how to get from an initial state
to a goal state.
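A sketch of the fewest-hops task, assuming a made-up direct-flight
graph; breadth-first search over an adjacency list returns a route
with the fewest links:

    from collections import deque

    def fewest_hops(graph, start, goal):
        """Breadth-first search: the first path that reaches the goal
        uses the fewest hops (links)."""
        frontier = deque([[start]])           # queue of partial paths
        visited = {start}
        while frontier:
            path = frontier.popleft()
            city = path[-1]
            if city == goal:
                return path
            for nxt in graph.get(city, []):
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append(path + [nxt])
        return None                           # no route exists

    flights = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}  # hypothetical
    print(fewest_hops(flights, "A", "D"))     # ['A', 'B', 'D']: two hops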
Problem Solving with Graphs
Assume that each state is complete.
Represents all (and preferably only) relevant aspects of the
problem to be solved.
In the flight planning problem, the identity of the airport is sufficient.
But the address of the airport is not necessary.
Assume that actions are deterministic.
We know exactly the state after an action has been taken.
Assume that actions are discrete.
We don’t have to represent what happens while the action is
happening.
We assume that a flight gets us to the scheduled destination without
caring what happens during the flight.
Classes of Search
Uninformed, Any-path
Depth-first
Breadth-first
In general, look at all nodes in a search tree in a
specific order independent of the goal.
Stop when the first path to a goal state is found.
Informed, Any-path
Exploit a task specific measure of goodness to try to
reach a goal state more quickly.
Classes of Search
Uninformed, optimal
Guaranteed to find the "best" path
As measured by the sum of weights on the graph
edges
Does not use any information beyond what is
in the graph definition
Informed, optimal
Guaranteed to find the best path
Exploit heuristic ("rule of thumb") information to
find the path faster than uninformed methods
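One way to see these classes (a sketch, not from the slides): the
any-path methods differ only in the data structure holding the
frontier of partial paths, and the uninformed optimal method orders
that frontier by accumulated edge weight.

    # The frontier data structure determines the strategy:
    #   stack (LIFO)                 -> depth-first, any-path
    #   queue (FIFO)                 -> breadth-first, any-path
    #   priority queue by path cost  -> uninformed optimal (uniform-cost search)
    import heapq

    def uniform_cost(graph, start, goal):
        """graph: node -> list of (neighbor, edge weight).
        Returns (cost, path) for the cheapest path found."""
        frontier = [(0, [start])]             # (cost so far, path)
        expanded = set()
        while frontier:
            cost, path = heapq.heappop(frontier)
            node = path[-1]
            if node == goal:
                return cost, path
            if node in expanded:
                continue
            expanded.add(node)
            for nxt, weight in graph.get(node, []):
                heapq.heappush(frontier, (cost + weight, path + [nxt]))
        return None

    roads = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}  # made-up weights
    print(uniform_cost(roads, "A", "C"))      # (2, ['A', 'B', 'C'])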
Scoring Function
Assigns a numerical value to a board position
The set of pieces and their locations represents a
single state in the game
Represents the likelihood of winning from a
given board position
Typical scoring function is linear
A weighted sum of features of the board position
Each feature is a number that measures a specific
characteristic of the position.
"Material" is some measure of which pieces one has in a
given position.
A number that represents the distribution of the pieces in a
position.
Scoring Function
To determine next move:
Compute score for all possible next positions.
Select the one with the highest score.
If we had a perfect evaluation function,
playing chess would be easy!
Such a function exists in principle
But, nobody knows how to write it or compute
it directly.
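A minimal sketch of a linear scoring function and the greedy move
choice just described; the features, weights, and moves are invented
for illustration and do not come from a real chess program.

    # Linear scoring function: a weighted sum of numeric board features.
    WEIGHTS = {"material": 1.0, "mobility": 0.1, "king_safety": 0.5}  # made-up weights

    def score(features):
        """features: feature name -> numeric measurement of the position."""
        return sum(WEIGHTS[name] * value for name, value in features.items())

    def choose_move(candidates):
        """candidates: list of (move, features of the resulting position).
        Score every possible next position and take the highest."""
        return max(candidates, key=lambda item: score(item[1]))

    moves = [("Nf3", {"material": 0, "mobility": 3, "king_safety": 1}),
             ("Qh5", {"material": 0, "mobility": 5, "king_safety": -2})]
    print(choose_move(moves)[0])              # "Nf3" under these made-up weights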
Combinatorial Optimization
Combinatorial optimization and enumeration problems are modeled
by state spaces that usually lack any regular structure.
Combinatorial optimization algorithms solve instances of problems
that are believed to be hard in general, by exploring the usually large discrete solution space of these instances.
Goal is to find the “best” possible solution in the state space.
Exhaustive search is often the only way to handle such
“combinatorial chaos”
Many “real” problems exhibit no regular structures to be
exploited, and that leaves exhaustive enumeration as the only
approach in sight.
Combinatorial optimization algorithms try to reduce the effective size
of the space, and explore the space efficiently.
Branch and Bound Algorithms
Branch and bound is a general algorithmic
method for finding optimal solutions of
problems in combinatorial optimization.
Solution space is discrete.
The general idea:
Find the minimal value of a function f(x) over
a set of admissible values of the argument x
called the feasible region.
Both f and x may be of arbitrary nature.
Branch and Bound Algorithms
A branch-and-bound procedure requires two
tools.
First, a smart way of covering the feasible region
by several smaller feasible subregions.
This is called branching, since the procedure is
applied recursively to each of the subregions, and all
produced subregions naturally form a tree structure.
This tree structure is called the search tree or branch-and-bound tree.
Its nodes are the constructed subregions.
Branch and Bound Algorithms
The second tool is bounding:
a fast way of finding upper and lower bounds
for the optimal solution within a feasible
subregion.
Basic Approach
Based on a simple observation (for a
minimization task)
If the lower bound for a subregion A from the search
tree is greater than the upper bound for some other
(previously examined) subregion B, then A may be
safely discarded from the search.
This step is called pruning.
It is usually implemented by maintaining a global
variable m that records the minimum upper bound
seen among all subregions examined so far;
any node whose lower bound is greater than m can be
discarded.
Basic Approach
When the upper bound equals the lower bound
for a given node, the node is said to be solved.
Ideally, the algorithm stops when all nodes have
either been solved or pruned.
In practice the procedure is often terminated
after a given time
at that point, the minimum lower bound and the
maximum upper bound, among all non-pruned
subregions, define a range of values that contains the
global minimum.
Basic Approach
The efficiency of the method depends critically on the
effectiveness of the branching and bounding algorithms
used.
Bad choices could lead to repeated branching, without any
pruning, until the sub-regions become very small.
In that case the method would be reduced to an exhaustive
enumeration of the domain, which is often impractically large.
There is no universal bounding algorithm that works for
all problems.
Little hope that one will ever be found.
The general paradigm needs to be implemented separately for each
application.
Branching and bounding algorithms must be specially designed for each application.
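A rough branch-and-bound sketch for a minimization task; branch(),
lower_bound(), and upper_bound() stand in for the problem-specific
routines discussed above, and the toy problem at the end is invented.

    import heapq

    def branch_and_bound(root, branch, lower_bound, upper_bound):
        """Minimize over a feasible region.  branch(region) -> subregions
        covering it; lower_bound/upper_bound(region) -> bounds on the best
        value inside it.  Returns the minimum upper bound found."""
        m = upper_bound(root)                     # the global variable m of the slides
        frontier = [(lower_bound(root), 0, root)] # explore smallest lower bound first
        tie = 1                                   # tie-breaker so regions are never compared
        while frontier:
            lb, _, region = heapq.heappop(frontier)
            if lb > m:
                continue                          # prune: cannot beat the incumbent
            ub = upper_bound(region)
            m = min(m, ub)                        # minimum upper bound seen so far
            if lb == ub:
                continue                          # bounds coincide: node is solved
            for sub in branch(region):            # cover the region by subregions
                sub_lb = lower_bound(sub)
                if sub_lb <= m:                   # otherwise the subregion is pruned
                    heapq.heappush(frontier, (sub_lb, tie, sub))
                    tie += 1
        return m                                  # the optimum if the search ran to completion

    # Toy usage: minimize f(x) = (x - 3)**2 over the integers 0..10.
    f = lambda x: (x - 3) ** 2
    branch = lambda r: [(r[0], (r[0] + r[1]) // 2), ((r[0] + r[1]) // 2 + 1, r[1])]
    lower = lambda r: 0 if r[0] <= 3 <= r[1] else min(f(r[0]), f(r[1]))
    upper = lambda r: f(r[0])                     # value at one feasible point
    print(branch_and_bound((0, 10), branch, lower, upper))   # 0, attained at x = 3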
Min-Max Algorithm
Limited look-ahead plus scoring
I look ahead two moves (2-ply)
First me – relative level 1
Then you – relative level 2
For each group of children at level 2:
1. Check to see which has the minimum score.
2. Assign that number to the parent.
   Represents the worst that can happen to me after your move from that parent position.
3. I pick the move that lands me in the position where you can do the least damage to me.
   This is the position which has the maximum value resulting from applying Step 1.
Can implement this to any number (depth) of min-max level pairs.
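A small sketch of the idea for an arbitrary number of levels; the toy
game tree and its leaf scores below are invented.

    def minimax(node, maximizing, children, score):
        """Min-max value of node.  children(node) -> successor positions
        (empty at a leaf); score(node) -> static evaluation of a leaf."""
        succ = children(node)
        if not succ:
            return score(node)
        values = [minimax(c, not maximizing, children, score) for c in succ]
        # My levels take the maximum over the children; your levels the minimum.
        return max(values) if maximizing else min(values)

    # 2-ply toy example: I move (max), you reply (min).  Each of my moves is
    # labelled with the worst (minimum) leaf below it; I pick the larger label.
    tree = {"root": ["a", "b"], "a": [3, 5], "b": [2, 9]}
    children = lambda n: tree.get(n, []) if isinstance(n, str) else []
    print(minimax("root", True, children, lambda leaf: leaf))  # 3 = max(min(3,5), min(2,9))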
Deep Blue (May 1997)
Used Min-Max.
256 specialized chess processors coupled into a 32 node supercomputer.
Examined around 30 billion moves per minute.
Typical search depth was 13 ply,
but in some dynamic situations it could go as deep as 30.
Alpha-Beta Pruning
Pure optimization of min-max.
No tradeoffs or approximations.
Don’t examine more states than is necessary.
"Cutoff" moves allow us to cut off entire branches of
the search tree (see following example)
Only 3 states need to be examined in the
following example
Turns out, in general, to be very effective
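A sketch of alpha-beta on the same kind of toy tree (tree and helpers
again invented); it returns the same value as min-max, but the cutoff
test lets it skip branches that cannot change the answer.

    def alphabeta(node, maximizing, children, score,
                  alpha=float("-inf"), beta=float("inf")):
        """alpha: best value already guaranteed to the max player;
        beta:  best value already guaranteed to the min player."""
        succ = children(node)
        if not succ:
            return score(node)
        if maximizing:
            value = float("-inf")
            for c in succ:
                value = max(value, alphabeta(c, False, children, score, alpha, beta))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break         # cutoff: the min player will never allow this branch
            return value
        value = float("inf")
        for c in succ:
            value = min(value, alphabeta(c, True, children, score, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break             # cutoff: the max player will never choose this branch
        return value

    tree = {"root": ["a", "b"], "a": [3, 5], "b": [2, 9]}
    children = lambda n: tree.get(n, []) if isinstance(n, str) else []
    print(alphabeta("root", True, children, lambda leaf: leaf))  # 3; the leaf 9 is never examined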
Move Generation
Assumption of ordered tree is optimistic.
"Ordered" means to have the best move on
the left in any set of child nodes.
Node with lowest value for a min node.
Node with highest value for a max node.
If we could order nodes perfectly, we
would not need alpha-beta search!
The good news is that in practice
performance is close to the optimistic limit.
Move Generator
Goal is to produce ordered moves
Needed to take advantage of alpha-beta
search.
Encodes a fair bit of knowledge about a
game.
Example order heuristic:
Value of captured piece – value of attacker.
E.g., "pawn takes Queen" is the highest-ranked
move in this ordering.
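A sketch of that ordering heuristic, using the conventional piece
values and a made-up list of captures:

    # Order captures by (value of captured piece - value of attacker), best first.
    PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

    def order_captures(captures):
        """captures: list of (attacker, captured) piece letters."""
        return sorted(captures,
                      key=lambda mv: PIECE_VALUE[mv[1]] - PIECE_VALUE[mv[0]],
                      reverse=True)

    # "Pawn takes queen" comes out first, "queen takes pawn" last.
    print(order_captures([("Q", "P"), ("N", "R"), ("P", "Q")]))
    # [('P', 'Q'), ('N', 'R'), ('Q', 'P')]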
Static Evaluation
Other place where substantial game
knowledge is encoded
In early programs, evaluation functions
were complicated and buggy
In time it was discovered that you could
get better results by
A simple reliable evaluator
E.g., a weighted count of pieces on the board.
Plus deeper search
Static Evaluation
Deep Blue used static evaluation functions
of medium complexity
Implemented in hardware
"Cheap" PC programs rely on quite
complex evaluation functions.
Can't search as deeply as Deep Blue
In general there is a tradeoff between
Complexity of evaluation function
Depth of search.
Time of Defeat: August 1994
Read all about it at:
http://www.cs.ualberta.ca/~chinook/
TD-Gammon
Neural network that is able to teach itself to play backgammon solely
by playing against itself and learning from the results
Based on the TD(Lambda) reinforcement learning algorithm
Starts from random initial weights (and hence random initial
strategy)
With zero knowledge built in at the start of learning (i.e. given only a
"raw" description of the board state), the network learns to play at a
strong intermediate level
When a set of hand-crafted features is added to the network's input
representation, the result is a truly staggering level of performance.
The latest version of TD-Gammon is now estimated to play at a
strong master level that is extremely close to the world's best human
players.
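As a very rough sketch of the TD(lambda) update behind this kind of
self-play learning, using a simple linear value function rather than
TD-Gammon's actual neural network (all names and numbers here are
illustrative):

    import numpy as np

    def td_lambda_step(w, trace, feats, next_feats, reward, alpha, lam, gamma=1.0):
        """One TD(lambda) update for a linear value function V(s) = w . feats(s).
        trace is the eligibility trace: a decaying sum of past gradients of V."""
        delta = reward + gamma * (w @ next_feats) - (w @ feats)  # temporal-difference error
        trace = gamma * lam * trace + feats                      # gradient of V is feats
        w = w + alpha * delta * trace                            # move weights toward the TD target
        return w, trace

    # Toy usage: four board features, random initial weights (hence random play).
    rng = np.random.default_rng(0)
    w, trace = rng.normal(size=4), np.zeros(4)
    w, trace = td_lambda_step(w, trace, np.ones(4), np.zeros(4),
                              reward=1.0, alpha=0.1, lam=0.7)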
The Match
Exhibition match played in 1998.
100 games were played over 3 days to
reduce any element of luck.
Final result was a narrow 8-point win for
Davis.
Davis and Tesauro conclude the performance
was “superhuman”.
TD-Gammon made only one serious mistake
in 100 games!
Othello world champion Murakami was defeated 6 games to 0 in 1997.
The name Proverb comes from "probabilistic cruciverbalist,"
meaning a crossword solver based on probability theory.
Proverb
Will Shortz, New York Times crossword
puzzle editor, first issued his challenge to
computer designers in 1997 after Deep
Blue beat Garry Kasparov.
Shortz contended that crossword puzzles,
unlike chess, draw on particular human
skills and thought processes that can be
inaccessible to computing machines.
Proverb
"Is the computer going to be able to solve the clues
involving puns and wordplay? I don't think so," Shortz
wrote in an introduction to a volume of The New York
Times daily crossword puzzles. He gave examples of clues
he felt computers would miss, such as "Event That
Produces Big Bucks" referring to "Rodeo," "Pumpkin-colored"
translating as "Orange," or "It Might Have
Quarters Downtown" meaning "Meter."
Proverb
"So if you were one who lamented the loss of the human to
the computer in chess, don't despair. In a much more wide-ranging
and, frankly, complex game like crosswords, we
humans still rate just fine."
And then again …
There was a lot of discussion at the 1999 tournament (American
Crossword Puzzle Tournament) of computer solutions to the contest
puzzles. Two weeks before the event I had sent advance copies of them
to Michael Littman at Duke University. Michael heads a team of
computer scientists that has developed a program called Proverb--the
world's first computer program designed to solve standard crosswords.
He immediately put Proverb through its paces. The results were so
interesting (in fact, so amazing) that I printed them out on large sheets of
paper and posted them, along with Michael's analysis, after each round at
the event. --Will Shortz
Wanna Bet?
(December 1996)
In Zia's book, Bridge, My Way, which appeared a few years
ago, he offered to take a one-million-pound bet that no
computer would be able to beat him at the bridge table. The
stunt seemed to work in that it produced a lot of publicity for
his book. That is, until last month, when word reached him that
the bridge program GIB, brainchild of the American professor Matthew
Ginsberg, proved capable of incredible feats of declarer play.
The Bet is Off!
(December 1996)
Zia is the big star of the Fall Nationals, having just triumphed in
the premier event, the Reisinger. Smiling from ear to ear, he
accepts the congratulations of his predominantly female
admirers. Then he is accosted by a man he's never seen before.
"Mr. Mahmood, my congratulations; and incidentally, may I ask
you something?" "But, of course," replies the always amiable
Pakistani, "what's it about?" "It concerns a one-million-pound
bet." The Pakistani grows pale. "What is your name, sir?," he
immediately asks. "Matthew Ginsberg," says the man. Suddenly
there's little left of the great Zia with his aura of invincibility. He
cringes, and mumbles, "The bet is off!" and walks out of the
room.
And then …
Two years later GIB became the world computer-bridge champion,
and defeated the vast majority of the world's top bridge players
(including Zia Mahmood) participating in the 1998 Par Contest.
However, such a par contest measures technical bridge analysis
skills only, and in 1999 Zia did beat various computer programs,
including GIB, in an individual round-robin match.
But the story is only beginning …