Gaze Perception and Theory of Mind - emotion
Attention and Perception of Attentive Behaviours for Emotive Interaction
Christopher Peters
LINC
University of Paris 8
Virtual Humans
Computer models of people
Can be used as substitutes for the real thing in ergonomic evaluations, or as conversational agents
Display and Animation
Two layers: a skeletal layer and a skin layer
The skeleton is a hierarchy of positions and orientations
The skin layer provides the visual appearance
Animating Characters
Animation Methods
Low level: rotate leye01 by 0.3 degrees around axis (0.707, 0.707, 0.0) at 0.056 seconds into the animation
High level: ‘walk to the shops’
The character must know where the shop is, avoid obstacles on the way there, etc. It must also be able to walk … (the two levels are contrasted in the sketch below)
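A minimal sketch contrasting the two levels of control; the Keyframe structure is a hypothetical illustration, not the actual animation API.

```python
from dataclasses import dataclass

@dataclass
class Keyframe:
    """Low-level command: rotate one joint at one instant."""
    joint: str                        # e.g. the left-eye joint 'leye01'
    angle_deg: float                  # rotation amount in degrees
    axis: tuple[float, float, float]  # rotation axis
    time_s: float                     # time into the animation

# Low level: every joint, at every instant, specified by hand.
low_level = Keyframe("leye01", 0.3, (0.707, 0.707, 0.0), 0.056)

# High level: a single goal; the character must plan the rest itself.
high_level = "walk to the shops"
print(low_level, high_level)
```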
Autonomy
From direct animation to automatic generation
Autonomy requires the character to animate itself based on simulation models
Models should result in plausible behaviour
Our Focus
1. Attention and related behaviours
(a) Where to look
(b) How to generate gaze behaviour
2. Perception of attentive behaviours and their emotional significance
How to interpret the attention behaviours of others
Conversation initialisation in Virtual Environments…
Why VE?
Cheap! Quick!
No need for expensive equipment (facilities, robots, etc.)
Duplication at the click of a mouse
Changes to the environment can be made quickly and easily, at no extra cost
But…
Things we take for granted in real life, such as physics, need to be programmed into the virtual environment
And will only ever be approximations of reality
1. Attention and Gaze
Our character is walking down a virtual street
Where should the character look, and how should the looking behaviours be generated?
Foundation
Humans need to look around
(Image: iLab, University of Southern California)
An ecological approach to visual perception, J.J. Gibson, 1979.
Eyes in the front of our head
Poor acuity over most of the visual field
Even for places where we have been before, memory is far from perfect
Virtual humans should look around too!
Significance to Virtual Humans
Viewer perception
The impact of eye gaze on communication using humanoid avatars, Garau et al., 2001.
Plausibility: “If they don’t look around, then how do they know where things are?” (human viewer)
Significance to Virtual Humans
Functional purposes
Navigation for digital actors based on synthetic vision, memory and learning, Noser et al., 1995.
Autonomy: if they don’t look around, then they won’t know where things are
Our Focus
Gaze shifts versus saccadic eye movements
General looking behaviours
Where to Look? Automating Certain Visual Attending Behaviors of Human Characters, Chopra-Khullar, 1999.
Practical Behavioural Animation Based On Vision and Attention, Gillies, 2001.
Two problems:
1. Where to look
2. How to look
Approach
Use appropriately simplified models from areas such as psychology, neuroscience, artificial intelligence …
Appropriate = fast, allowing real-time operation
Capture the high-level salient aspects of such models without the intricate detail
Components
Where to look: Sensing, Attention, Memory
How to look: Gaze Generator
System Overview
Input the environment through the synthetic vision component, then:
1. Process the visual field using the spatial attention model
2. Modulate attended object details using the memory component
3. Generate gaze behaviours towards target locations
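A minimal sketch of this loop; the classes and method names are illustrative stand-ins for the components described on the following slides, not the actual system.

```python
import numpy as np

class SyntheticVision:
    def render(self):
        # Stand-in: a random image instead of a scene rendering.
        return np.random.rand(48, 64)

class Attention:
    def saliency(self, image):
        # Stand-in for the Itti-style bottom-up saliency computation.
        return image

class Memory:
    def uncertainty(self, shape):
        # Everything is initially unobserved, hence fully uncertain.
        return np.ones(shape)

def visual_update(vision, attention, memory):
    image = vision.render()                    # input via synthetic vision
    sal = attention.saliency(image)            # 1. spatial attention model
    att = sal * memory.uncertainty(sal.shape)  # 2. modulate with memory
    # 3. gaze is generated towards the most active location
    return np.unravel_index(np.argmax(att), att.shape)

print(visual_update(SyntheticVision(), Attention(), Memory()))
```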
Visual Sensing
Three renderings are taken per visual update
One full-scene rendering (to the attention module)
Two false-colour renderings (to the memory module)
False-colour Renderings
Approximate the acuity of the eye with two renderings from the agent’s viewpoint: one for the fovea and one for the periphery
Renderings
Renderings allow both spatial/image-based and object-based operations to take place
1(a) Where to look
Model of Visual Attention
Two-component theory of attention
Endogenous (‘top-down’): voluntary, task-driven, e.g. ‘Look for the coke can’
Exogenous (‘bottom-up’): the environment appears to ‘grab’ our attention; colour, intensity, orientation, motion, texture, sudden onset, etc.
Bottom-up Attention
Orientation, intensity and colour contrast
Bottom-up Attention Model
Cognitive engineering, biologically inspired: Itti et al. 2000, http://ilab.usc.edu/bu/
Inputs an image, outputs an encoding of attention allocation
Peters and O’Sullivan 2003
The input image is decomposed into feature channels: intensity, RG colour, BY colour
Gaussian Pyramid
Each channel acts as the first level in a Gaussian or Gabor pyramid
Each subsequent level is a blurred and decimated version of the previous level
Image-processing techniques simulate early visual processing (sketched below)
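A minimal sketch of pyramid construction for one feature channel, assuming SciPy; the number of levels and the blur width are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(channel: np.ndarray, levels: int = 5):
    """Each level is a blurred, decimated copy of the one before it."""
    pyramid = [channel]
    for _ in range(levels - 1):
        blurred = gaussian_filter(pyramid[-1], sigma=1.0)  # blur
        pyramid.append(blurred[::2, ::2])                  # decimate by 2
    return pyramid

intensity = np.random.rand(256, 256)   # stand-in for an intensity channel
for level, img in enumerate(gaussian_pyramid(intensity)):
    print(f"level {level}: {img.shape}")
```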
Center-Surround Processing
Early visual processing: ganglion cells respond to light in a center-surround pattern
Contrast a central area with its neighbours
Contrast is what matters, not amplitude (context)
Simulated by comparing different levels in the image pyramids
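A minimal sketch of that across-scale comparison; the particular center and surround levels chosen here are illustrative, not the model’s exact parameters.

```python
import numpy as np
from scipy.ndimage import zoom

def center_surround(pyramid, center_level=2, surround_level=4):
    """Contrast a fine 'center' level against a coarse 'surround' level."""
    center = pyramid[center_level]
    # Upsample the coarse surround level back to the center's resolution.
    factor = 2 ** (surround_level - center_level)
    surround = zoom(pyramid[surround_level], factor, order=1)
    surround = surround[: center.shape[0], : center.shape[1]]
    # Contrast, not amplitude: the absolute center/surround difference.
    return np.abs(center - surround)

img = np.random.rand(256, 256)
pyr = [img]
for _ in range(4):
    pyr.append(pyr[-1][::2, ::2])   # crude pyramid for demonstration
print(center_surround(pyr).shape)   # one feature map at the center scale
```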
Saliency Map
Conspicuity Maps
The result of the center-surround calculations for each feature type: intensity, colour, orientation
Define the ‘pop-out’ for each feature type
Integrated into the saliency map
Attention is directed preferentially to lighter areas
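A minimal sketch of the integration step; plain min-max normalisation stands in here for the model’s actual normalisation operator.

```python
import numpy as np

def saliency_map(conspicuity_maps):
    """Normalise each conspicuity map and sum them into one saliency map."""
    total = np.zeros_like(conspicuity_maps[0])
    for cmap in conspicuity_maps:
        rng = cmap.max() - cmap.min()
        if rng > 0:
            cmap = (cmap - cmap.min()) / rng   # normalise to [0, 1]
        total += cmap
    return total / len(conspicuity_maps)       # lighter = more salient

intensity, colour, orientation = (np.random.rand(64, 64) for _ in range(3))
sal = saliency_map([intensity, colour, orientation])
print(sal.shape, sal.max())
```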
Saliency Map
Memory
Differentiate between what an agent has and hasn’t observed
Agents should only know about objects that they have witnessed
Agents won’t have exact knowledge about the world
Used to modulate the output of the attention module (the saliency map)
Object-based, taking input from the synthetic vision module
Stage Theory
The further information goes, the longer it is retained
Attention acts as a filter
Stimulus Representations
Two levels of detail in the representation of objects
Proximal stimuli: an early representation of the stimulus; data discernible only from the retinal image
Observations: a later representation of stimuli after resolution with the world database
Stage Theory
Short-term Sensory Storage (STSS)
From distal to proximal stimuli
Objects have not yet been resolved with the world database
Stage Theory
Short-term memory (STM) and long-term memory (LTM)
Object-based: contains resolved object information
Observations store information for attended objects, from proximal stimuli to observations: an object pointer, a world-space transform and a timestamp
Virtual humans are not completely autonomous from the world database
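A minimal sketch of such an observation record; the field types are assumptions, since the slide names only the three pieces of data.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Observation:
    obj: Any            # object pointer into the world database
    transform: list     # world-space transform at observation time
    timestamp: float    # when the object was attended

short_term_memory: list[Observation] = []   # STM: recent observations
long_term_memory: list[Observation] = []    # LTM: retained observations
```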
Memory Uncertainty Map
We can now create a memory uncertainty map for any part of the scene the agent is looking at
The agent is uncertain about parts of the scene it has not looked at before
Depends on scene object ‘granularity’
Attention Map
Determines where attention will be allocated
Combines bottom-up components, top-down components (see 2), and memory
Modulate the saliency map by the uncertainty map
Here, the sky and road have low uncertainty levels
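A minimal sketch of the modulation; the map size and the regions marked as already observed are illustrative.

```python
import numpy as np

# Uncertainty is near zero for parts of the scene already examined
# (here, sky and road), so their saliency is suppressed and attention
# is drawn to the uncertain regions instead.
saliency = np.random.rand(48, 64)
uncertainty = np.ones_like(saliency)
uncertainty[:10, :] = 0.05    # sky already observed: low uncertainty
uncertainty[-10:, :] = 0.05   # road already observed: low uncertainty
attention_map = saliency * uncertainty
```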
Human Scanpaths
Eye movements and fixations
Inhibition of Return
The focus of attention must change: inhibit attended parts of the scene from being revisited too soon
Image-based IOR
Problem: a moving viewer or a dynamic scene
Solution: object-based memory
Object-based IOR
Store an uncertainty level with each object
Modulate the saliency map by the uncertainty levels (sketched below)
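A minimal sketch of object-based IOR; the recovery rate, which lets inhibited objects become interesting again over time, is an illustrative assumption.

```python
# Each object carries its own uncertainty: attending it drops the value
# (inhibiting an immediate revisit), and it recovers over time.
uncertainty = {"coke_can": 1.0, "lamp_post": 1.0}

def attend(obj_id):
    uncertainty[obj_id] = 0.0          # just examined: fully certain

def tick(dt, recovery_rate=0.05):
    for obj_id in uncertainty:         # uncertainty grows back over time
        uncertainty[obj_id] = min(1.0, uncertainty[obj_id] + recovery_rate * dt)

attend("coke_can")                     # inhibited from being revisited soon
tick(1.0)
print(uncertainty)
```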
Artificial Regions of Interest
The attention map is at a lower resolution than the visual field
Generate AROIs from the highest values of the current attention map to create a scanpath
Assume a simple one-to-one mapping from the attention map to overt attention
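A minimal sketch of AROI selection; the map size and the number of regions are illustrative.

```python
import numpy as np

def scanpath(attention_map: np.ndarray, n_regions: int = 3):
    """Repeatedly pick the map's highest value as the next AROI,
    zeroing it out so the scanpath moves on."""
    amap = attention_map.copy()
    path = []
    for _ in range(n_regions):
        loc = np.unravel_index(np.argmax(amap), amap.shape)
        path.append(loc)
        amap[loc] = 0.0
    return path

print(scanpath(np.random.rand(12, 16)))   # ordered gaze targets
```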
1(b) How to look
Generate gaze animation given a target location
Gaze shifts
Combined eye-head gaze shifts to visual and auditory targets in humans, Goldring et al., 1996.
Targets beyond the oculomotor range
Gaze Shifts
Contribution of head movements
Head Movement Propensity, J. Fuller, 1992.
‘Head movers’ vs. ‘eye movers’
±40 degree orbital threshold
Innate behavioural tendency for subthreshold head movement
Midline-attraction and resetting (see the sketch below)
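A minimal sketch of splitting a gaze shift between eyes and head; the ±40 degree threshold is from the slide, while the linear propensity split is an illustrative assumption.

```python
import math

OCULOMOTOR_RANGE = 40.0  # degrees the eyes can rotate within their orbits

def gaze_shift(target_deg: float, head_propensity: float = 0.3):
    """Return (eye, head) rotations for a horizontal gaze shift.
    head_propensity is near 0 for 'eye movers' and near 1 for 'head
    movers', who move the head even for subthreshold targets."""
    head = head_propensity * target_deg            # innate head tendency
    if abs(target_deg - head) > OCULOMOTOR_RANGE:
        # Target beyond the oculomotor range: the head must contribute.
        head = target_deg - math.copysign(OCULOMOTOR_RANGE, target_deg)
    eye = target_deg - head
    return eye, head

print(gaze_shift(20.0))   # small shift: mostly the eyes
print(gaze_shift(70.0))   # large shift: the head carries the remainder
```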
Blinking
Subtle and often overlooked
Not looking while leaping: the linkage of blinking and saccadic gaze shifts, Evinger et al., 1994.
Gaze-evoked blinking: the amplitude of the gaze shift influences blink probability and magnitude
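A minimal sketch of gaze-evoked blinking; that probability grows with amplitude is from the slide, but the saturating function and its constant are illustrative assumptions.

```python
import random

def blink_probability(amplitude_deg: float) -> float:
    # Larger gaze shifts blink more often; saturates at certainty.
    return min(1.0, amplitude_deg / 60.0)

def maybe_blink(amplitude_deg: float) -> bool:
    return random.random() < blink_probability(amplitude_deg)

print(blink_probability(10.0), blink_probability(50.0))
```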
2. Perception of Attention
Attention behaviours may elicit attention from others
Predator-prey, gaze-following, goals, intentions
Gaze in Infants
Infants notice gaze direction as early as 3 months
Gaze-following: infants are faster at looking at targets that are being looked at by a central face
They respond even to circles that look like eyes
(Image: www.mayoclinic.org)
Theory of Mind
Baron-Cohen (1994)
• Eye Direction and Intentionality Detectors
• Theory of Mind Module
Perrett and Emery (1994)
• A more general Direction of Attention Detector
• Mutual Attention Mechanism
Our Model
ToM for conversation initiation, based on attention behaviours
The key metrics in our system are Attention Levels and Level of Interest
The metrics represent the amount of attention perceived to have been paid by another
Based primarily on gaze, but also on body direction, locomotion, directed gestures and facial expressions
Emotional significance of gaze
Implementation (in progress)
Torque game engine, http://www.garagegames.com
A proven engine used for a number of AAA titles
A useful basis providing fundamental functionality: graphics exporters, in-simulation editing, basic animation, scripting, terrain rendering, special effects
Overview
Synthetic Vision
Approximated human vision for computer agents
Why? Inexpensive: no special hardware required
Bypasses many computer vision complexities (segmentation of images, recognition)
Enables characters to receive visual information in a way analogous to humans
How? Updated in a snapshot manner
Small, simplified images rendered from the agent’s perspective
Textures, lighting and special effects disabled
False-colouring
False colours provide a lookup scheme for acquiring objects from the database
A false colour is defined as (r, g, b), where red is the object type identifier, green is the object instance identifier, and blue is the sub-object identifier
Allows quick retrieval of objects (sketched below)
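A minimal sketch of the encode/decode scheme; the dictionary returned by decode is an illustrative stand-in for the database lookup.

```python
def encode(obj_type: int, instance: int, sub_object: int) -> tuple:
    """Pack the three identifiers into an (r, g, b) false colour."""
    return (obj_type, instance, sub_object)   # r, g, b channels

def decode(pixel: tuple) -> dict:
    """Unpack a rendered pixel back into identifiers."""
    r, g, b = pixel
    return {"type": r, "instance": g, "sub_object": b}

pixel = encode(obj_type=7, instance=42, sub_object=1)
print(decode(pixel))   # object retrieved without any image segmentation
```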
Intentionality Detector (ID)
Represents behaviour in terms of volitional states (goal and desire)
Based on visual, auditory and tactile cues; our version is based only on vision
Attributes an intentionality characteristic to objects based on the presence of certain cues
Implemented as a filter on objects from the visual system: only “agent” objects can pass the filter
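A minimal sketch of the ID as a filter; the object representation is an illustrative assumption.

```python
def intentionality_filter(visible_objects):
    """Only objects flagged as agents pass through to later modules."""
    return [o for o in visible_objects if o.get("is_agent", False)]

seen = [{"id": "lamp_post"}, {"id": "pedestrian", "is_agent": True}]
print(intentionality_filter(seen))   # -> only the pedestrian
```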
Direction of Attention
Direction of Attention Detector (DAD): more useful than EDD alone
Eye, head, body and locomotion directions are read from the database after false-colour lookup
Used to derive the Attention Level metric from filtered stimuli
What happens when the eyes aren’t visible? Fall back on a hierarchy of other cues: head direction > body direction > locomotion direction
Mutual Attention
Comparison between the eye direction read from the other agent and the focus of attention of this agent (see 1. Generating Attention Behaviours)
If the agents are the focus of each other’s attention, the Mutual Attention Mechanism (MAM) is activated
Attention Levels
Perception of the attention paid by another, at an instant of time
Based on the orientation of body parts: eyes, head, body, locomotion direction
The direction is weighted for each segment; the eyes provide the largest contribution, the other segments progressively less (see the sketch below)
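A minimal sketch of the instantaneous attention level; the segment weights and the dot-product alignment measure are illustrative assumptions.

```python
import numpy as np

# Eyes contribute most, locomotion direction least.
WEIGHTS = {"eyes": 0.5, "head": 0.25, "body": 0.15, "locomotion": 0.1}

def attention_level(directions: dict, towards_me: np.ndarray) -> float:
    """How closely each segment is oriented towards me, weighted."""
    towards_me = towards_me / np.linalg.norm(towards_me)
    level = 0.0
    for segment, weight in WEIGHTS.items():
        d = directions[segment] / np.linalg.norm(directions[segment])
        level += weight * max(0.0, float(np.dot(d, towards_me)))
    return level   # 1.0 = fully oriented towards me, 0.0 = facing away

dirs = {k: np.array([1.0, 0.0]) for k in WEIGHTS}   # all facing me
print(attention_level(dirs, np.array([1.0, 0.0])))  # -> 1.0
```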
Attention Levels
Also support for locomotion direction, directed gestures and directed facial expressions
Gestures and expressions convey emotion
Attention Profiles
Attention levels over a time period are stored in memory
Attention Profile: all the attention levels of another agent over a certain time frame
Analysis can provide a higher-level description of their amount of interest in me
Level of interest
A general descriptor of the amount of interest that another is perceived to be paying over a time period
An error-prone process due to its reliance on perception
Links between gaze and interest: were they actually looking at you? Human gaze perception is good, but not perfect
Gaze does not necessarily mean attention, e.g. a blank stare
Inherently probabilistic: Theories of Mind are theories
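A minimal sketch of deriving a Level of Interest from a stored attention profile; the window length is an illustrative assumption.

```python
def level_of_interest(profile, now, window=5.0):
    """Average the attention levels recorded within a recent time window."""
    recent = [lvl for t, lvl in profile if now - t <= window]
    return sum(recent) / len(recent) if recent else 0.0

profile = [(0.0, 0.2), (1.0, 0.6), (2.0, 0.9)]   # (timestamp, attention level)
print(level_of_interest(profile, now=2.0))        # sustained gaze -> high interest
```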
Application
Conversation initiation scenarios
A subtle negotiation involving gaze
Avoid the social embarrassment of engaging in discourse with an unwilling participant
Our ToMM
Stores simple, high-level theories
Useful for conversation initialisation behaviours:
– Have they seen me? (ID, DAD and MAM modules)
– Have they seen me looking? (ID, DAD and MAM modules)
– Are they interested in interacting? (Level of Interest metric)
Future Work
Finish implementation of the model
Further links between attention/memory and emotion
Hardware-based bottom-up attention implementation
Integration of facial expressions and gestures from Greta
Thank you!
Questions