
Decision Making AI
John See
20 Dec 2010
Games Programming III (TGP2281) – T1, 2010/2011
Decision Making AI
• Ability of a game character to decide what to do
• Decision Making in Millington’s Model
Decision Making AI
• Decision Trees
• Finite State Machines (FSM)
• Rule-based Systems
• Fuzzy Logic & Neural Networks
• Blackboard Architecture
Decision Trees
• Fast, easy to understand
• Simplest technique to implement, but extensions to the
basic algorithm can be sophisticated
• Typically used to control characters, animation or other in-game decision making
• Can be learned, and learning is relatively fast (compared
to fuzzy logic/NN)
Decision Trees – Problem Statement
• Given a set of knowledge, we need to generate a
corresponding action from a set of actions
• Maps inputs to outputs – typically, the same action is used for many different sets of input
• Need a method to easily group lots of inputs together under one particular action, letting the significant input values control the output
Decision Trees – Problem Statement
• Example: Grouping a set of inputs under an action
• Enemy is visible AND enemy is now < 10 m away → Attack
• Enemy is visible AND enemy is still far (> 10 m), but not at flank → Attack
• Enemy is visible AND enemy is still far (> 10 m), at flank → Move
• Enemy is not visible, but audible → Creep
Decision Trees – Algorithm Overview
• Made up of connected decision points
• Tree has starting decision, its root
• For each decision, starting from the root, one of a set of outgoing options is chosen.
• The choice is made based on the character’s knowledge (internal/external) → fast! No prior representation is needed
• Continues along the tree, making choices at each decision
node until no more decisions to consider
• At each leaf of the tree, an action is attached
• Action is carried out immediately
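
As a rough illustration of this traversal, here is a minimal C++ sketch of a binary decision tree; the class names, the EnemyNear test and the 10 m threshold are assumptions for illustration, not code from the course or the textbook.

// Minimal decision tree sketch: decisions and actions share a base class,
// and makeDecision() walks the tree until it reaches an action leaf.
#include <iostream>

class DecisionTreeNode {
public:
    virtual ~DecisionTreeNode() {}
    // Returns the node to consider next, or this node if it is an action leaf.
    virtual DecisionTreeNode* makeDecision() = 0;
};

class Action : public DecisionTreeNode {
public:
    explicit Action(const char* name) : name(name) {}
    DecisionTreeNode* makeDecision() { return this; }   // leaf: traversal stops here
    const char* name;
};

class Decision : public DecisionTreeNode {
public:
    Decision(DecisionTreeNode* t, DecisionTreeNode* f) : trueNode(t), falseNode(f) {}
    // Each decision checks a single value from the character's knowledge.
    virtual bool testValue() const = 0;
    DecisionTreeNode* makeDecision() {
        DecisionTreeNode* branch = testValue() ? trueNode : falseNode;
        return branch->makeDecision();                  // recurse until an action is found
    }
    DecisionTreeNode* trueNode;
    DecisionTreeNode* falseNode;
};

// Example decision: is the enemy closer than some threshold distance?
class EnemyNear : public Decision {
public:
    EnemyNear(float* distance, float threshold, DecisionTreeNode* t, DecisionTreeNode* f)
        : Decision(t, f), distance(distance), threshold(threshold) {}
    bool testValue() const { return *distance < threshold; }
    float* distance;
    float threshold;
};

int main() {
    float enemyDistance = 7.0f;
    Action attack("attack"), move("move");
    EnemyNear root(&enemyDistance, 10.0f, &attack, &move);   // root of the tree
    Action* chosen = static_cast<Action*>(root.makeDecision());
    std::cout << "Chosen action: " << chosen->name << "\n";  // prints "attack"
    return 0;
}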
Decision Trees
• Each decision checks a single value and doesn’t contain any Boolean logic (AND, OR)
• Representative set of decision types:
• Boolean – value is true
• Enumeration – matches one of a given set of values
• Numeric value – value is within a given range
• 3D Vector – vector has a length within a given range
• Examples?
Decision Trees – Combining Decisions
• AND two decisions – place in series in the tree
• OR two decisions – also use decisions in series, but the
two actions are swapped over from the AND
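
A small sketch of this idea as plain branching code (the decision and action names are made up for illustration); note that only the placement of the two actions differs between the AND and the OR arrangement.

#include <iostream>

void attack() { std::cout << "attack\n"; }
void patrol() { std::cout << "patrol\n"; }

// AND two decisions: place them in series; both must pass to reach "attack".
void andInSeries(bool enemyVisible, bool enemyNear) {
    if (enemyVisible) {
        if (enemyNear) attack();
        else           patrol();
    } else {
        patrol();
    }
}

// OR two decisions: the same series, but the two actions are swapped over.
void orInSeries(bool enemyVisible, bool enemyAudible) {
    if (enemyVisible) {
        attack();
    } else {
        if (enemyAudible) attack();
        else              patrol();
    }
}

int main() {
    andInSeries(true, false);   // prints "patrol" (visible but not near)
    orInSeries(false, true);    // prints "attack" (not visible but audible)
    return 0;
}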
Decision Complexity
• Number of decisions that need to be
considered is usually much smaller than
number of decisions in the tree.
• Imagine using IF-ELSE statements to
test each decision?
• Method of building DTs: start with a simple tree; as the AI is tested in-game, additional decisions can be added to trap special cases or add new behaviors
Decision Making - Branching
• So far, we have considered only binary trees – decisions
choose between 2 options.
• It is possible to build a DT with any number of options, or with different decisions having different numbers of branches
Decision Making - Branching
• Deep DTs may result in the same alert condition being checked numerous times before a decision is found
• Flat DTs are more efficient and require less decision checking
Decision Making - Branching
• Still common to find DTs using only binary decisions
• Why?
• Underlying code for multiple branches simplifies down to
a series of binary tests (IF-ELSE statements)
• Binary DTs are easier to optimize. Some learning
algorithms that work with DTs require them to be binary
Decision Making – Performance
• Takes no memory; performance is linear with the number of nodes visited
• Assuming each decision takes a constant amount of time and the tree is balanced, performance is O(log2 n), where n is the number of decision nodes in the tree
• This DOES NOT consider the execution time of the
different checks required in the DT, which can vary a lot!
Decision Making – Balancing the Tree
• DTs run fastest when the tree is balanced
• A balanced tree keeps the height of its branches
approximately equal (within 1 to be considered balanced).
In our context, it will have about the same number of
decision making levels on each branch
Decision Making – Balancing the Tree
• 8 behaviors, 7 decisions
• 1st tree – extremely unbalanced, 2nd tree – balanced
• To get to H, the 1st tree needs 8 decisions; the 2nd tree needs only 3
Decision Making – Balancing the Tree
• If all behaviors are equally likely, what is the average
number of decisions needed for both trees?
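
A hedged worked answer, assuming the balanced tree places all 8 behaviors at depth 3 and the unbalanced tree is a single chain with behaviors at depths 1, 2, 3, 4, 5, 6, 7 and 7:

Balanced tree: (8 × 3) / 8 = 3 decisions on average
Unbalanced tree: (1 + 2 + 3 + 4 + 5 + 6 + 7 + 7) / 8 = 35 / 8 ≈ 4.4 decisions on average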
Decision Making – Balancing the Tree
• If we were likely to end up at decision A the majority of the time, which tree is more efficient?
• How do we treat decisions that are time-consuming to run?
Let’s say A is the most time-consuming decision…
Decision Making – Merging Patterns
• DTs can be extended to allow multiple branches to merge
into a new decision – efficient, but some care is needed!
• A decision/action can be reached in more than one way
• Avoid causing loops – otherwise the algorithm may never find an action leaf
Random Decision Trees
• To provide some unpredictability and variation to
making decisions in DTs
• Simplest way: Generate a random number and choose
a branch based on its value
• DTs are normally intended to run frequently, reacting to the game state, so random decisions can be a problem
• What is a potential problem with the following DT?
Random Decision Trees
Possible considerations…
1. Allow the random decision to keep track of what it did last time.
• When the decision is considered, a choice is made at random and that choice is stored. The next time the decision is considered, no randomness is applied; the previous choice is maintained.
• If something in the world changes and a different decision is reached, the stored choice should be removed.
2. Timing Out: Allow the AI to “time out” after a set time, after which a random choice is made again. This gives variety and realism.
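
A possible C++ sketch combining both considerations, a stored choice plus a time-out; the class name, the frame-time parameter and the reset hook are assumptions for illustration.

// Random decision that remembers its last choice until a time-out expires
// (or until it is reset because the tree arrived here along a different path).
#include <cstdlib>

class RandomDecision {
public:
    explicit RandomDecision(float timeOutSeconds)
        : timeOut(timeOutSeconds), hasStoredChoice(false),
          storedChoice(false), timeSinceChoice(0.0f) {}

    // Called each time the decision is considered, with the elapsed frame time.
    bool decide(float deltaTime) {
        timeSinceChoice += deltaTime;
        if (!hasStoredChoice || timeSinceChoice >= timeOut) {
            storedChoice = (std::rand() % 2) == 0;   // pick a branch at random
            hasStoredChoice = true;
            timeSinceChoice = 0.0f;                  // restart the time-out
        }
        return storedChoice;                         // otherwise reuse the stored choice
    }

    // Call this when the world changes and a different decision path is taken.
    void reset() { hasStoredChoice = false; }

private:
    float timeOut;
    bool  hasStoredChoice;
    bool  storedChoice;
    float timeSinceChoice;
};

int main() {
    RandomDecision branch(3.0f);            // re-randomize every 3 seconds
    bool goLeft = branch.decide(0.016f);    // called once per frame
    (void)goLeft;
    return 0;
}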
State Machines
• Often, characters in a game act in one of a limited set of
ways
• Carry on doing the same thing until some event or
influence makes them change
• Can use decision trees, but it is easier to model this
behavior using state machines (or finite state
machines, FSM)
• State machines take into consideration
• the world around them
• their internal state
State Machines – Basics
• Each AI character occupies one state at any one time
• Actions or behaviors are associated with each state
• So long as the character remains in that state, it will
continue carrying out the same actions/behavior
• States are connected by transitions
• Each transition leads from one state to another, the target state, and each has a set of associated conditions
• Changing states: when the game determines that the conditions of a transition are met, the transition triggers and the new state is entered
State Machines – Simple Example
• State machine to model a soldier – 3 states
• Each state has its own transitions
• The solid circle (with a transition w/o trigger condition)
points to the initial state that will be entered when the
state machine is first run
State Machines vs. Decision Trees
• Now, can you name some obvious differences between making decisions using decision trees and using state machines?
Finite State Machines (FSM)
• In game AI, a state machine with this kind of structure
(as seen earlier) is usually called a finite state machine
(FSM)
• An FSM has a finite number of states and transitions
• It has finite internal memory to store its states and
transitions
FSM – Generic Implementation
• Use a generic state machine class that keeps track of the set of possible states and records the current state it is in
• With each state, a series of transitions are maintained.
Each transition is also a generic interface with
conditions
• At each iteration (game loop), an update function is
called to check if any of the transitions from the current
state is triggered.
• If a transition is triggered, then the transition will be fired
• The separation of triggering and firing of transitions
allows the transitions to have their own actions
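
A cut-down C++ sketch of this generic structure, assuming simple State and Transition interfaces; the names and details are illustrative, so refer to the textbook for the full version.

// Generic state machine: each state owns its transitions, and update()
// checks whether any transition from the current state is triggered.
#include <cstddef>
#include <vector>

class State;   // forward declaration

class Transition {
public:
    virtual ~Transition() {}
    virtual bool isTriggered() const = 0;        // condition check (triggering)
    virtual State* getTargetState() const = 0;
    virtual void onFire() {}                     // optional action owned by the transition (firing)
};

class State {
public:
    virtual ~State() {}
    virtual void runActions() {}                 // behavior while in this state
    std::vector<Transition*> transitions;
};

class StateMachine {
public:
    explicit StateMachine(State* initial) : currentState(initial) {}

    // Called once per game loop iteration.
    void update() {
        // 1. Look for a triggered transition out of the current state.
        for (std::size_t i = 0; i < currentState->transitions.size(); ++i) {
            Transition* t = currentState->transitions[i];
            if (t->isTriggered()) {
                t->onFire();                          // fire the transition (its own action)
                currentState = t->getTargetState();   // change to the target state
                break;
            }
        }
        // 2. Carry out the current state's actions.
        currentState->runActions();
    }

private:
    State* currentState;
};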
FSM – Generic Implementation
• Refer to the textbook or other references for a more in-depth code-level implementation of FSMs
FSM – Complexity
• State machines only require memory to hold a triggered transition and the current state
• O(1) in memory and O(m) in time, where m is the
(average) number of transitions per state
• The algorithm calls other supporting functions to perform actions and so on. These probably account for most of the time spent in the algorithm.
• Hard-coded FSM – inflexible, does not allow level
designers the control over building the FSM logic
Hard-coded FSM
• Hard-coded FSM:
• Consists of an enumerated value, indicating which state is currently occupied, and a function that checks whether a transition should be followed
• States are HARD-CODED, and limited to what was HARD-CODED
• Pros – Easy and quick implementation, useful for small
FSMs
• Cons:
• Inflexible, does not allow level designers the control over
building the FSM logic
• Difficult to maintain (alter) – Large FSMs, messy code
• Every character needs to have its own AI behaviors coded by hand…
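
A minimal sketch of what a hard-coded FSM for a soldier-like character could look like; the state names and transition conditions are assumptions for illustration.

// Hard-coded FSM: an enum for the current state and one function that
// checks the transition conditions and switches state directly.
enum SoldierState { PATROL, ATTACK, FLEE };

struct Soldier {
    SoldierState state;
    bool enemyVisible;
    bool lowHealth;

    Soldier() : state(PATROL), enemyVisible(false), lowHealth(false) {}

    void update() {
        switch (state) {
        case PATROL:
            if (enemyVisible) state = ATTACK;      // transition hard-coded here
            break;
        case ATTACK:
            if (lowHealth)          state = FLEE;
            else if (!enemyVisible) state = PATROL;
            break;
        case FLEE:
            if (!enemyVisible) state = PATROL;
            break;
        }
        // ...carry out the actions for the current state here...
    }
};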
Hierarchical State Machines
• A single state machine is a powerful tool, but it can still have difficulty expressing some behaviors
• The same applies if you wish to model somewhat different behaviors using more than one state machine for a single AI character
• Example: Modeling alarm behaviors with hierarchical s/m
(using a basic cleaning robot state machine)
Hierarchical State Machines
• If the robot needs to get power when it runs out & resume its original duties after recharging, these transitions must be added to ALL existing states to ensure the robot acts correctly
Hierarchical State Machines
• This is not exactly efficient. Imagine having to add many more concurrent behaviors to your primary state machine!
Hierarchical State Machines
• A hierarchical state machine for the cleaning robot
• Nested states – could be in more than one state at a time
• States are arranged in a hierarchy → the next state machine down is only considered when the higher-level state machine is not responding to its alarm
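
A very small sketch of the hierarchy idea for the cleaning robot, assuming a nested cleaning machine inside a top-level machine that handles the power alarm; the state names and the battery flag are assumptions for illustration.

// The cleaning behavior is itself a small state machine nested inside a
// top-level machine that handles the power alarm.
enum TopState   { CLEANING, GET_POWER };
enum CleanState { SEARCH, HEAD_FOR_TRASH, HEAD_FOR_COMPACTOR };

struct Robot {
    TopState   top;
    CleanState clean;     // remembered even while in GET_POWER (acts as a history state)
    bool lowPower;

    Robot() : top(CLEANING), clean(SEARCH), lowPower(false) {}

    void update() {
        // The higher-level machine is considered first (the alarm behavior).
        if (top == CLEANING && lowPower)   { top = GET_POWER; return; }
        if (top == GET_POWER && !lowPower) { top = CLEANING;  return; }  // resumes 'clean'

        // Only if the higher level did not respond do we run the nested machine.
        if (top == CLEANING) {
            switch (clean) {
            case SEARCH:             /* ...transition to HEAD_FOR_TRASH... */     break;
            case HEAD_FOR_TRASH:     /* ...transition to HEAD_FOR_COMPACTOR... */ break;
            case HEAD_FOR_COMPACTOR: /* ...transition back to SEARCH... */        break;
            }
        }
    }
};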
Hierarchical State Machines
• H* - “history state” node
• When the composite state (lower hierarchy) is first entered,
the H* node indicates which sub-state should be entered
• If the composite state has been entered before, the previous sub-state is restored using the H* node
Hierarchical State Machines
• Hierarchical state machine with cross hierarchy transition
• Most hierarchical s/m support transitions between levels of the
hierarchy
• Let’s say we want the robot to go back to refuel when it does
not find any more trash to collect…
Hierarchical State Machines
• Refer to textbook for more details on its implementation
• Performance:
• O(n) in memory (n is the number of layers in the hierarchy)
• O(nt) in time, where t is the number of transitions per state
DT + SM
• Combining decision trees and state machines
• One approach: Replace transitions from a state with a
decision tree
• Leaves of DT (rather than straightforward conditions/actions)
are now transitions to other states
DT + SM
• To implement the same state machine without decision tree transitions…
• We may need to model complex conditions that require more checking per transition
• This may be time-consuming, as the conditions need to be checked all the time
Fuzzy Logic
• Founded by Lotfi Zadeh (1965)
• “the essence of fuzzy logic is that everything is a matter
of degree”
• Imprecision in data…
• and uncertainty in solving problems
• Fuzzy logic vs. Boolean logic
• 50%–80% fewer rules than traditional rule-based systems to accomplish identical tasks
• Examples: Air-conditioner thermostat or washing
machine
Fuzzy Logic in Games
• Example of uses:
• To control movement of bots/NPCs (to smooth out
movements based on imprecise target areas)
• To assess threats posed by players (to make
further strategic decisions)
• To classify player and NPCs in terms of some
useful game information (such as combat or
defensive prowess)
Crisp data & Fuzzy data
• Crisp data (real numbers, value)
• Fuzzy data (a predicate or description, with degree
value)
• Fuzzy logic gives a predicate a degree value. Instead of belonging to a set or being excluded from it (1 or 0, Boolean logic), everything can partially belong to a set, and some things belong more than others
Fuzzy sets
• Fuzzy sets – the numeric value is called the degree of
membership (these values are NOT probability values!)
• For each set, a degree of membership of 1 is given to something completely in the set, and a degree of membership of 0 to something completely outside the fuzzy set
• It is typical to use integers in the implementation instead of floating-point values (between 0 and 1), for fast in-game computation
• Note: Anything can be a member of multiple sets at the
same time
Fuzzy Control / Inference Process
• 3 basic steps in a fuzzy control or fuzzy inference process
Step 1 - Fuzzification
• Mapping process – converts crisp data (real numbers)
to fuzzy data (degree of membership)
• E.g.: Given a person’s weight, find the degree to which
a person is underweight, overweight or at ideal weight
Membership Functions
• Membership functions map input variables to a degree of membership in a fuzzy set, between 0 and 1
• Any function can be used, and the shape usually is
governed by desired accuracy, the nature of problem, or
ease of implementation.
• Boolean logic m/f
Membership Functions
• Grade m/f
• Reverse grade m/f
• Triangular m/f
• Trapezoid m/f
Membership Functions
• Earlier example of using a set of membership functions
to represent a person’s weight
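
A small sketch of fuzzification for this weight example, using reverse-grade, triangular and grade membership functions; the breakpoint values (in kg) are assumptions chosen for illustration.

// Fuzzification: crisp weight (kg) -> degrees of membership in three fuzzy sets.
#include <iostream>

// Grade m/f: 0 below 'low', rises linearly to 1 at 'high', then stays at 1.
float grade(float x, float low, float high) {
    if (x <= low)  return 0.0f;
    if (x >= high) return 1.0f;
    return (x - low) / (high - low);
}

// Reverse grade m/f: mirror image of the grade function.
float reverseGrade(float x, float low, float high) {
    return 1.0f - grade(x, low, high);
}

// Triangular m/f: rises from 'low' to a peak at 'mid', falls back to 0 at 'high'.
float triangular(float x, float low, float mid, float high) {
    if (x <= low || x >= high) return 0.0f;
    if (x <= mid) return (x - low) / (mid - low);
    return (high - x) / (high - mid);
}

int main() {
    float weight = 80.0f;                                    // crisp input
    float underweight = reverseGrade(weight, 55.0f, 65.0f);  // 0.0
    float ideal       = triangular(weight, 60.0f, 70.0f, 85.0f);
    float overweight  = grade(weight, 75.0f, 85.0f);         // 0.5
    std::cout << underweight << " " << ideal << " " << overweight << "\n";
    return 0;
}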
Step 2 – Fuzzy rule base
• Once all inputs are expressed in fuzzy set membership,
combine them using logical fuzzy rules to determine
degree to which each rule is true
• E.g.
• Given a person’s weight and activity level as input,
define rules to make a health decision
• If overweight AND NOT active then frequent
exercise
• If overweight AND active then moderate diet
• But having a fuzzy output such as “frequent exercise” is not enough – we need to quantify the amount of exercise (e.g. 3 hours per week)
Fuzzy rules
• Usually uses IF-THEN style rules
• If A then B
• A → antecedent / premise
• B → consequent / conclusion
• To apply usual logical operators to fuzzy input, we need
the following fuzzy axioms:
• A OR B = MAX(A, B)
• A AND B = MIN(A, B)
• NOT A = 1 – A
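
These axioms translate directly into code; a minimal sketch (the function names are illustrative).

// Fuzzy logical operators over degrees of membership in [0, 1].
#include <algorithm>

float fuzzyAnd(float a, float b) { return std::min(a, b); }  // A AND B = MIN(A, B)
float fuzzyOr (float a, float b) { return std::max(a, b); }  // A OR B  = MAX(A, B)
float fuzzyNot(float a)          { return 1.0f - a; }        // NOT A   = 1 - A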
Fuzzy rules
• Earlier example on weight (and now, including height)
overweight AND tall = MIN(0.7, 0.3) = 0.3
overweight OR tall = MAX(0.7, 0.3) = 0.7
NOT overweight = 1 – 0.7 = 0.3
NOT tall = 1 – 0.3 = 0.7
NOT (overweight AND tall) = 1 – MIN(0.7, 0.3) = 0.7
• Note that these fuzzy axioms (AND, OR, NOT) are not
the only definition of the logical operators. There are
other definitions that can be used…
Complete Rule Base
• With the above m/fs for each input variable, a common requirement is to construct a complete set of all possible combinations of inputs. In this case, we need 18 rules (2 × 3 × 3)
Rule evaluation (Creature example)
• We have an AI fuzzy decision making system, which
needs to evaluate whether a creature should attack the
player. Input variables: range, health, opponent ranking
Rule evaluation (Creature example)
• Rule base:
• If (in melee range AND uninjured) AND NOT hard
then attack
• If (NOT in melee range) AND uninjured then do
nothing
• If (NOT out of range AND NOT uninjured) AND
(NOT wimp) then flee
• Given specific degrees for the input variables, we might
get outputs that are like:
• Attack degree: 0.2
• Do nothing degree: 0.4
• Flee degree: 0.7
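
A sketch of evaluating this rule base with the MIN/MAX/NOT axioms; the input membership degrees below are assumptions, so the computed outputs differ from the illustrative 0.2 / 0.4 / 0.7 figures above.

// Evaluating the creature's fuzzy rule base with MIN/MAX/NOT.
#include <algorithm>
#include <iostream>

int main() {
    // Assumed fuzzified inputs (degrees of membership in [0, 1]).
    float inMeleeRange = 0.2f;
    float outOfRange   = 0.1f;
    float uninjured    = 0.5f;
    float hardOpponent = 0.6f;
    float wimpOpponent = 0.1f;

    // If (in melee range AND uninjured) AND NOT hard then attack
    float attack = std::min(std::min(inMeleeRange, uninjured), 1.0f - hardOpponent);

    // If (NOT in melee range) AND uninjured then do nothing
    float doNothing = std::min(1.0f - inMeleeRange, uninjured);

    // If (NOT out of range AND NOT uninjured) AND (NOT wimp) then flee
    float flee = std::min(std::min(1.0f - outOfRange, 1.0f - uninjured), 1.0f - wimpOpponent);

    std::cout << "attack: "     << attack    << "\n"    // 0.2
              << "do nothing: " << doNothing << "\n"    // 0.5
              << "flee: "       << flee      << "\n";   // 0.5
    return 0;
}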
Rule evaluation (Creature example)
• So what do we do with those fuzzy membership output
values??
• Missing link: We also need to represent the output
variable as a fuzzy membership set!
Step 3 – Defuzzification
• Defuzzification process: Fuzzy output → Crisp output
• From the previous step, each rule in the rule base results in a degree of membership in some output fuzzy set
• With the numerical output we got earlier (0.2 for attack,
0.4 for do nothing, 0.7 for flee), we shall construct a
composite output membership function
Step 3 – Defuzzification
• It is not possible to be exact/accurate, but there are methods that come as close as possible
• Using Highest Membership Function
• Choose the fuzzy set with the highest degree of membership and use the output value that represents that set.
• 4 common points: min, max, average of min/max, bisector
• Very simple to implement but coarse defuzzification
Step 3 – Defuzzification
• Blending Based on Membership
• Blend each characteristic point based on its corresponding
degree of membership
• E.g. a character with 0.2 attack, 0.4 do nothing, 0.7 flee will produce a crisp output given by (0.2 * attack direction) + (0.4 * do nothing direction) + (0.7 * flee direction), as sketched in code after this list
• Make sure that the eventual result is normalized (otherwise
result may be over-the-bounds or unrealistic)
• Common normalization technique: Divide total blended sum
by the sum of fuzzy output values
• Minimum values blended (Smallest of Maximum, SoM)
• Maximum values blended (Largest of Maximum, LoM)
• Average values blended (Mean of Maximum, MoM)
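
A sketch of the normalized blend, assuming each output set is represented by a single characteristic value (here a signed movement speed towards the player, which is an assumption for illustration).

// Blending defuzzification: weighted average of each set's characteristic
// value, normalized by the sum of the fuzzy output degrees.
#include <iostream>

int main() {
    // Fuzzy output degrees from the rule base.
    float attackDeg = 0.2f, doNothingDeg = 0.4f, fleeDeg = 0.7f;

    // Assumed characteristic value per output set (movement speed towards the
    // player in m/s; positive = towards, negative = away).
    float attackValue = 3.0f, doNothingValue = 0.0f, fleeValue = -3.0f;

    float blended = attackDeg * attackValue
                  + doNothingDeg * doNothingValue
                  + fleeDeg * fleeValue;
    float crisp = blended / (attackDeg + doNothingDeg + fleeDeg);  // normalize

    std::cout << "crisp output: " << crisp << "\n";  // (0.6 + 0 - 2.1) / 1.3 ≈ -1.15
    return 0;
}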
Step 3 – Defuzzification
• Center of Gravity
• Also known as Centroid of Area method → takes into account all membership values, rather than specific ones (largest, smallest, average, etc.)
• First, each m/f is cropped at the membership value of its set
• The center of mass is found by integrating each cropped function in turn. This point is chosen as the output crisp value
• Unlike the bisector of area method, we can’t compute this offline, since we do not know in advance the fuzzy membership values and how the m/fs will be cropped
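
A sketch of one common formulation of this method, cropping assumed triangular output m/fs, combining them with max and integrating numerically; the 0..1 output axis and its m/fs are assumptions for illustration.

// Center of gravity defuzzification by numerical integration: each output
// m/f is cropped at its rule's degree, the cropped sets are combined, and
// the centroid of the combined shape becomes the crisp output.
#include <algorithm>
#include <iostream>

float triangular(float x, float low, float mid, float high) {
    if (x <= low || x >= high) return 0.0f;
    return (x <= mid) ? (x - low) / (mid - low) : (high - x) / (high - mid);
}

int main() {
    float fleeDeg = 0.7f, doNothingDeg = 0.4f, attackDeg = 0.2f;  // rule outputs

    float num = 0.0f, den = 0.0f;
    for (int i = 0; i <= 100; ++i) {
        float x = i / 100.0f;                           // sample the output axis
        // Crop each output set at its degree, then combine with max.
        float flee      = std::min(fleeDeg,      triangular(x, 0.0f,  0.25f, 0.5f));
        float doNothing = std::min(doNothingDeg, triangular(x, 0.25f, 0.5f,  0.75f));
        float attack    = std::min(attackDeg,    triangular(x, 0.5f,  0.75f, 1.0f));
        float combined  = std::max(flee, std::max(doNothing, attack));
        num += x * combined;                            // position weighted by "mass"
        den += combined;                                // total "mass"
    }
    std::cout << "crisp output: " << num / den << "\n"; // crisp value on the assumed 0..1 axis
    return 0;
}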
Misc: Dealing with Complex Rule Base
• We may have multiple rules in our rule base that result in the same output membership fuzzy set.
• E.g.
• Corner-entry AND going-slow THEN accelerate
• Corner-exit AND going-fast THEN accelerate
• Corner-exit AND going-slow THEN accelerate
• How do we deal with such situations? Which output
membership value for accelerate to choose?
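
One standard way to handle this (an assumption here, since the slide leaves the question open) is to OR the rules together, keeping the maximum degree any of them produces for the shared output set; a minimal sketch follows.

// Several rules feed the same output set ("accelerate"): OR them together,
// i.e. keep the maximum degree produced by any of the rules.
#include <algorithm>
#include <iostream>

int main() {
    // Assumed fuzzified inputs.
    float cornerEntry = 0.3f, cornerExit = 0.6f;
    float goingSlow   = 0.8f, goingFast  = 0.2f;

    float rule1 = std::min(cornerEntry, goingSlow);   // corner-entry AND going-slow
    float rule2 = std::min(cornerExit,  goingFast);   // corner-exit  AND going-fast
    float rule3 = std::min(cornerExit,  goingSlow);   // corner-exit  AND going-slow

    float accelerate = std::max(rule1, std::max(rule2, rule3));  // OR the three rules
    std::cout << "accelerate degree: " << accelerate << "\n";    // max(0.3, 0.2, 0.6) = 0.6
    return 0;
}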
Some Good Examples
• 2 examples: a control example & a threat assessment example from the “AI for Game Developers” reference book