
4D Modeling and Mobile Visualization
June 2002
William Ribarsky and Nickolas Faust
GVU Center and GIS Center
Georgia Institute of Technology
Research Goals
What do we do with ever-expanding collections of 3D
data that are being automatically collected and
modeled from multiple sources?
•Scalable, hierarchical, fused data organizations
•For interactive visual exploration
•For other applications
•Mobile interfaces
•Mobile applications
Accomplishments and Collaborations
• Development of mobile “situational visualization” applications
• User studies of multimodal interface for mobile and pervasive
computing
• Development of multiresolution techniques for interactive visualization
of high-detail urban scenes (with Avideh Zakhor)
• Application of transparent dynamic objects and new hierarchical,
interactive volume rendering technique to weather and uncertainty
(with Suresh Lodha)
• Transition of tools within VGIS to users and applications (e.g., Sarnoff,
NIMA)
• Presentation of mobile emergency response application to President
Bush and Governor Ridge
Urban Visualization: Our Ultimate Goal
To make this…
…look like this
Urban Visualization
To reach this goal requires:
•Hierarchical structure
•Multiresolution methods
•Geometry-based methods
•Image-based rendering
Hierarchical, Multiresolution Methods for
Highly Detailed Urban Scenes
•Data-adapted global quadtree
[Diagram: a forest of quadtrees forms the global tree structure; subdivision proceeds to a data-dependent depth, down to the level of a "block"; each quadtree-aligned block holds an LOD hierarchy over its façades 1…N and objects 1…M]
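A minimal sketch of this organization in Python (the class names and fields are illustrative assumptions, not the actual VGIS structures):

```python
# Sketch of a data-adapted global quadtree whose leaves, at a
# data-dependent "block" depth, hold LOD hierarchies for facades
# and objects. All names here are hypothetical illustrations.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LODHierarchy:
    levels: list          # coarse-to-fine payloads (geometry + texture)

@dataclass
class Block:
    facades: list         # list of LODHierarchy, facades 1..N
    objects: list         # list of LODHierarchy, objects 1..M

@dataclass
class QuadNode:
    bounds: tuple                                   # (west, south, east, north)
    children: list = field(default_factory=list)    # 0 or 4 QuadNodes
    block: Optional[Block] = None                   # set only at block-level leaves

def subdivide(node: QuadNode, depth: int, block_depth: int) -> None:
    """Recurse to a data-dependent depth; stop where a node
    corresponds to a single block and attach the block there."""
    if depth == block_depth:
        node.block = Block(facades=[], objects=[])
        return
    w, s, e, n = node.bounds
    mx, my = (w + e) / 2.0, (s + n) / 2.0
    for b in ((w, s, mx, my), (mx, s, e, my),
              (w, my, mx, n), (mx, my, e, n)):
        child = QuadNode(bounds=b)
        node.children.append(child)
        subdivide(child, depth + 1, block_depth)
```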
Façades: Bounding Sphere Hierarchy
Sphere surrounding
façade vertex
Divide along longest axis
Construct a balanced tree
hierarchy (here the branching
factor is ~4)
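A minimal sketch of this construction (the centroid-based enclosing sphere and the even four-way split are illustrative simplifications):

```python
# Sketch of the construction: leaf spheres around facade vertices,
# recursive split along the longest axis into ~4 groups, giving a
# balanced tree with branching factor ~4. The centroid-based sphere
# is a crude stand-in for a tighter bounding sphere.
import math

def bounding_sphere(points):
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    cz = sum(p[2] for p in points) / len(points)
    center = (cx, cy, cz)
    return center, max(math.dist(center, p) for p in points)

def build_sphere_tree(points, leaf_size=4):
    center, radius = bounding_sphere(points)
    node = {"center": center, "radius": radius, "children": []}
    if len(points) <= leaf_size:
        return node                       # sphere around a few vertices
    # Choose the axis with the largest extent...
    axis = max(range(3),
               key=lambda a: max(p[a] for p in points) -
                             min(p[a] for p in points))
    pts = sorted(points, key=lambda p: p[axis])
    # ...and divide along it into four groups (branching factor ~4).
    k = max(1, len(pts) // 4)
    for i in range(0, len(pts), k):
        node["children"].append(build_sphere_tree(pts[i:i + k], leaf_size))
    return node
```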
Continuous LOD Simplification
The initial combination step
This works well with the bounding sphere hierarchy (it produces
few sliver triangles and avoids attaching too many triangles to
a single point).
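One common form of such a combination step is vertex clustering; the sketch below assumes that approach for illustration and is not necessarily the exact method used:

```python
# Sketch of vertex clustering as the combination step: vertices in
# the same leaf sphere merge to one representative, and triangles
# that collapse in the process are discarded as slivers.
def combine(vertices, triangles, cluster_of):
    """cluster_of(i) -> leaf-sphere cluster id for vertex index i."""
    reps = {}                       # cluster id -> new vertex index
    new_vertices = []
    remap = {}
    for vi, v in enumerate(vertices):
        c = cluster_of(vi)
        if c not in reps:
            reps[c] = len(new_vertices)
            new_vertices.append(v)  # could average cluster members instead
        remap[vi] = reps[c]
    new_triangles = []
    for a, b, c in triangles:
        t = (remap[a], remap[b], remap[c])
        if len(set(t)) == 3:        # drop degenerate (sliver) triangles
            new_triangles.append(t)
    return new_vertices, new_triangles
```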
Application of Hierarchical Continuous LOD Method
Full Resolution (geometry plus
texture)
Factor of 50 reduction in
textured triangles
Application of Hierarchical Continuous LOD Method
Constrained to simplify to single polygon
Full Resolution (geometry plus
texture)
Factor of 50 reduction in
textured triangles
Transition from Façade-based LOD
to Block-based LOD
[Diagram: the block's LOD hierarchies over façades 1…N and objects 1…M collapse to a single block-based LOD]
Requires simplification of each façade to a few textured polygons
Hierarchical, Multiresolution Methods for
Highly Detailed Urban Scenes (cont.)
View-Dependent LOD
Goal: interactive visualization
of extended, detailed urban
scenes anywhere on earth
[Diagram: view-dependent LOD selection from the global quadtree; the eye position determines the selected LOD within each bounding box of the San Francisco urban model]
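A minimal sketch of the selection test (assuming nodes shaped like those in the sphere-tree sketch above; the same test applies to quadtree bounding boxes, and the threshold value is an illustrative assumption):

```python
# Sketch of view-dependent selection: refine a node only while its
# projected size (radius over eye distance) exceeds a screen-space
# threshold tau; otherwise render its stored LOD.
import math

def select_lods(node, eye, tau=0.01, out=None):
    if out is None:
        out = []
    d = max(1e-9, math.dist(eye, node["center"]))
    if not node["children"] or node["radius"] / d < tau:
        out.append(node)            # coarse enough for this view
    else:
        for child in node["children"]:
            select_lods(child, eye, tau, out)
    return out
```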
Automatic Identification and Placement
of Trees, Shrubs, and Foliage
This can be used with Avideh Zakhor’s results to
automatically identify, remove, and model foliage.
Application to Tree Modeling
Automated identification and
modeling of trees
Accurate placement of 3D modeled trees
Multiresolution, Embedded Vector Features
Road, river, boundary or
other vector features that
continuously change LOD
as one flies in
Fly-in to Korea
This can be
combined with
Suresh Lodha’s
feature-preserving
techniques
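The published technique is not reproduced here; as an illustrative stand-in, classic Douglas-Peucker simplification with a view-dependent tolerance shows how a vector feature can gain detail continuously as the viewer flies in:

```python
# Illustrative stand-in (not the published method): Douglas-Peucker
# polyline simplification driven by a tolerance that shrinks as the
# eye descends, so features refine continuously during a fly-in.
def simplify(line, tol):
    if len(line) < 3:
        return list(line)
    (x1, y1), (x2, y2) = line[0], line[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # Distance of each interior point from the chord between endpoints.
    dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm
             for x, y in line[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= tol:
        return [line[0], line[-1]]     # nothing prominent: keep a segment
    return simplify(line[:i + 1], tol)[:-1] + simplify(line[i:], tol)

def view_dependent_tolerance(eye_height, pixel_error=1.0, k=0.001):
    # Coarser when far away, finer close in (constants are assumptions).
    return pixel_error * k * eye_height
```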
Multiresolution, Embedded Vector Features
Terrain-following
feature
Interactive, Hierarchical Rendering of Non-Uniform
Volumes: Weather and Uncertainty
Dynamic 3D structure
Volume rendering of reflectivity of a
severe storm passing over two
Doppler radars
Doppler reflectivity over
North Georgia
Doppler velocity magnitude
Situational Visualization
• Mobile, interactive visualization with real-time inputs
based on location, orientation, and situation
• Awareness based on where you are, where you are
going, and what you are doing
• Requires mobile computing and new type of interface
Situational Visualization Applications
Based on visual computing that you can take with you
anywhere.
•Nano-Forecaster
•Mobile Surveyor (building personal worldview)
•Traffic Situator
•Situational Awareness Game
•Mobile pathfinder (e.g., entryways for disabled people)
•Venue Director
•And many other possible mobile applications
Applications: Nano-Forecaster
Events: severe storm cells,
mesocyclones, tornado signatures
[Diagram: radar stations feed a weather event server; tornadic signatures and mesocyclones are displayed over a detailed map together with the user's position]
Multimodal Interface: Motivation
• For wearable or ubiquitous computing, traditional
computer interaction often fails
• For wearable or ubiquitous computing, the user's hands are
often occupied with other tasks
• Ever smaller devices will carry ever larger capacity
for computing, storage, and networking. The interface
must be enriched to make full use of this capacity.
For these reasons...
We are investigating new interaction techniques:
•speech, gesture, and in combination for multimodal
interaction
•two-handed interaction
Implementation
• VGIS,
a 3D terrain visualization
environment
• Gesture Pendant,
a chest-mounted camera and
gesture recognition software
[Photo: the Gesture Pendant worn on the chest, with infrared lights and a camera fitted with an infrared filter]
Speech and Gesture
• Speech
– a rich channel for human-to-human
communication
– provides rich command and query interfaces
• Gestures
– complement speech with redundancy, emphasis
– provide concise spatial references and
descriptions
– can be used where speech is inappropriate or in
use for other actions
Multimodal Interfaces
• Multimodal interfaces appear to be well suited for
spatio-visual interfaces
• May be able to combine the strengths of speech and
gesture
• Provides a large repertoire of commands, allowing
users to choose their means of expression
• Mutual disambiguation may allow more robust
recognition by examining data from both interface
channels.
Implementation
• IBM ViaVoice speech
recognition software
• Software to integrate
speech and gesture
commands
Speech Interface
• The speech interface uses IBM ViaVoice
• Recognized speech utterances are time-stamped and
transferred over the network.
• Limited vocabulary
• Command synonyms
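A minimal sketch of this pipeline (the message format, vocabulary set, and synonym table are illustrative assumptions, not the actual VGIS protocol):

```python
# Sketch of the utterance path: normalize synonyms to a canonical
# command, reject out-of-vocabulary phrases, time-stamp the result,
# and send it over the network.
import json
import socket
import time
from typing import Optional

SYNONYMS = {"go forward": "move forwards", "ahead": "move forwards",
            "back": "move backwards", "halt": "stop", "hold": "stop"}

VOCABULARY = {"move forwards", "move backwards", "stop",
              "turn left", "turn right", "orbit", "fly", "walk"}

def normalize(utterance: str) -> Optional[str]:
    u = utterance.strip().lower()
    u = SYNONYMS.get(u, u)                  # map synonyms to one command
    return u if u in VOCABULARY else None   # limited vocabulary

def send_command(sock: socket.socket, utterance: str) -> None:
    cmd = normalize(utterance)
    if cmd is not None:
        msg = {"cmd": cmd, "t": time.time()}   # time-stamp for later fusion
        sock.sendall((json.dumps(msg) + "\n").encode())
```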
Types of 3D Interaction
Selection
Navigation
Manipulation
•Bowman (Ph.D. Thesis, Georgia Tech, 1999) has studied these
techniques for 3D interaction in virtual environments
Navigation
•Complex due to the large range of scales involved
– Methods must work at all scales
•Including scale, seven degrees of freedom must be
managed
•Constrained navigation in 3 modes (orbital, fly, walk)
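A minimal sketch of the seven managed degrees of freedom and per-mode constraints (the specific constraint choices, such as eye height and pitch clamping, are assumptions):

```python
# Sketch of a seven-DOF navigation state (3 position + 3 orientation
# + scale) with per-mode constraints for orbital, fly, and walk modes.
from dataclasses import dataclass

@dataclass
class NavState:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    yaw: float = 0.0
    pitch: float = 0.0
    roll: float = 0.0
    scale: float = 1.0          # the seventh degree of freedom
    mode: str = "orbit"         # "orbit", "fly", or "walk"

def constrain(s: NavState, terrain_height: float = 0.0) -> NavState:
    if s.mode == "walk":
        s.z = terrain_height + 1.8                  # pin viewer to eye height
        s.roll = 0.0
    elif s.mode == "orbit":
        s.pitch = max(-90.0, min(-10.0, s.pitch))   # keep looking down
        s.roll = 0.0
    # "fly" leaves all seven degrees of freedom unconstrained
    return s
```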
Speech Interface
• Continuous Movement
– Move {In, Out, Forwards, Backwards}
– Move {Left, Right, Up, Down}
– Move {Higher, Lower}
• Discrete Movement
– Jump {Forwards, Backwards}
– Jump {Left, Right, Up, Down}
– Jump {Higher, Lower}
Speech Interface
• Speed
– Slower, Faster, Stop
• Direction
– Turn {Left, Right}
– Pitch {Up, Down}
• Modes of Navigation
– Orbit, Fly, Walk
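A minimal sketch of dispatching this vocabulary (the handler structure and step sizes are illustrative assumptions): continuous commands persist across frames, while discrete jumps apply once.

```python
# Sketch of routing recognized commands into continuous motion
# (persists until changed) versus discrete jumps (applied once).
def dispatch(state: dict, cmd: str) -> None:
    verb, _, arg = cmd.lower().partition(" ")
    if verb in ("move", "turn", "pitch"):
        state["continuous"] = (verb, arg)   # re-applied every frame
    elif verb == "jump":
        apply_jump(state, arg)              # one-shot discrete move
    elif verb == "stop":
        state["continuous"] = None
    elif verb in ("slower", "faster"):
        factor = 0.5 if verb == "slower" else 2.0
        state["speed"] = state.get("speed", 1.0) * factor
    elif verb in ("orbit", "fly", "walk"):
        state["mode"] = verb                # navigation mode change

def apply_jump(state: dict, direction: str, step: float = 100.0) -> None:
    moves = {"left": ("x", -step), "right": ("x", step),
             "forwards": ("y", step), "backwards": ("y", -step),
             "up": ("z", step), "down": ("z", -step),
             "higher": ("z", step), "lower": ("z", -step)}
    if direction in moves:
        axis, delta = moves[direction]
        state[axis] = state.get(axis, 0.0) + delta
```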
Gesture Interface
• The Gesture Pendant
– Chest-mounted B&W
video camera
– A series of IR LEDs
illuminate the user's
hand
– An IR pass filter prevents
other light sources from
interfering with
segmentation
– Gestures are made with
respect to the body
Gesture Interface
• Gesture recognition
software segments the
video image into blobs,
based on preset
thresholds
• If the blob conforms to
trained height, width,
and motion parameters,
particular gestures are
recognized.
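A minimal sketch of that loop (the thresholds and size limits are illustrative assumptions, not the Gesture Pendant's trained parameters):

```python
# Sketch of the recognition loop: threshold the IR-lit image into a
# bright blob, check the blob against trained height/width limits,
# then classify motion between frames as a pan gesture.
def find_blob(image, threshold=200):
    """image: 2D list of grayscale pixels; returns the bounding box
    (x0, y0, x1, y1) of pixels above threshold, or None."""
    pts = [(x, y) for y, row in enumerate(image)
           for x, v in enumerate(row) if v >= threshold]
    if not pts:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)

def classify(prev_box, box, min_size=10, max_size=80):
    """Returns a pan gesture if the blob fits the trained size range
    and moved since the previous frame, else None."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    if not (min_size <= w <= max_size and min_size <= h <= max_size):
        return None                 # fails height/width parameters
    if prev_box is None:
        return None
    dx = (x0 + x1) / 2 - (prev_box[0] + prev_box[2]) / 2
    dy = (y0 + y1) / 2 - (prev_box[1] + prev_box[3]) / 2
    if abs(dx) >= abs(dy):
        return "pan right" if dx > 0 else "pan left"
    return "pan down" if dy > 0 else "pan up"
```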
Gesture Interface
Pan Left/Right
Pan Up/Down
Multimodal Interface
• Users first give a
speech command.
• The Gesture Pendant
tracks the fingertip,
allowing gestures to
describe the speed of
the command, i.e., how
fast to turn or move.
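A minimal sketch of this fusion (the gain constant and track format are assumptions):

```python
# Sketch of speech+gesture fusion: speech selects the action, and
# fingertip displacement (from the Gesture Pendant tracker) sets its
# rate.
from typing import Optional

def fuse(active_cmd: Optional[str], fingertip_track: list,
         gain: float = 0.05) -> Optional[tuple]:
    """fingertip_track: recent (x, y) fingertip positions, oldest first.
    Returns (command, speed) or None if there is nothing to apply."""
    if active_cmd is None or len(fingertip_track) < 2:
        return None
    (x0, y0), (x1, y1) = fingertip_track[0], fingertip_track[-1]
    displacement = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    return active_cmd, gain * displacement   # farther reach = faster
```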
Metrics for Multimodal Interface
• Voice and gesture recognizability and responsiveness
• Speed: efficient task completion
• Accuracy: target proximity
• Ease of learning
• Ease of use
• User comfort
Multimodal Task
• Users trained speech recognizer (not
necessary in latest version)
• Users were shown how to position
their hands for the Gesture Pendant
• Users experimented with the multimodal
interface
Multimodal Task
• The navigation task began in high orbit
– traveled west into the Grand Canyon
– traveled east into Atlanta
– in fly mode, traveled to the Georgia Tech campus
– in walk mode, parked in front of Tech Tower
Lessons
• Users could remember both the voice and gesture
commands and some felt they were easier to learn
than keystroke commands.
• It is important that commands be mapped to some
action in every navigation mode. If users try a
command in the “wrong” mode and it does nothing,
users will conclude it does not exist.
• For better results, gesture recognition should be
improved and lag lessened.
• Alternative gestures should be evaluated.
Lessons
• There were some issues with speech recognition lag.
This has since been improved by restricting the
vocabulary and grammar and improvements in
network code.
• Users would sometimes move their hands out of the
camera's field of view. Displaying a cursor to indicate
hand position may address this.
Future Work
• Work on faster recognition, especially for gesture
interface
• Gesture recognition using a neural network
– Larger vocabulary possible
• Developing new version of the Gesture Pendant
– For outdoor use
– Stereo camera pair
– Laser grid and structured light
Future Work
• Developing other capabilities for 3D interaction
– Two-handed interface (two Twiddlers with
flywheels and orientation tracking)
Results
We published three papers on this work:
David Krum, William Ribarsky, Chris Shaw, Larry Hodges, and Nickolas Faust,
"Situational Visualization," pp. 143-150, ACM VRST 2001 (2001).
David Krum, Olugbenga Omoteso, William Ribarsky, Thad Starner, and Larry
Hodges, "Speech and Gesture Multimodal Control of a Whole Earth 3D Virtual
Environment," pp. 195-200, Eurographics-IEEE Visualization Symposium
2002. Winner of the SAIC Best Student Paper award.
David Krum, Olugbenga Omoteso, William Ribarsky, Thad Starner, and Larry
Hodges, "Evaluation of a Multimodal Interface for 3D Terrain Visualization," to
be published, IEEE Visualization 2002. This is a formal user study of the
multimodal interface.
Situational Visualization System
[Photo: laptop with wireless WaveLAN and GPS; Twiddler input device; full-color, video-resolution display]
Situational Visualization: Demonstration of
Emergency Response to Terrorist Attack
[Screenshots: sarin gas cloud and GPS positions of first responders; overview and fly-in to the attack point on the Georgia Tech campus]
Demonstrated to President Bush and Governor Ridge
on March 27, 2002.
Next Phase Work
•Scalable multiresolution framework for urban landscape
geometry and textures (so that one can navigate smoothly
from a close-up of a building façade to an overview of
several city blocks).
•Initial work on image-based rendering techniques applied
to cityscapes (these will be needed for overviews when
one collects hundreds of city blocks with tens of thousands
of buildings)
•Use of the mobile geospatial database with computer
vision techniques to determine position and orientation
when sensor data are inaccurate or missing.
•Further work on uncertainty visualization in the VGIS
environment
Next Phase Work
•Further Situational Visualization applications.
•Continued development and evaluation of multimodal
interface for 3D interaction.
•Transfer of mobile visualization applications to a
networked PDA.
Additional Publications on this Work
• William Ribarsky, “Towards the Visual Earth,” Workshop on
Intersection of Geospatial Information and Information Technology,
National Research Council (October, 2001).
• William Ribarsky, Christopher Shaw, Zachary Wartell, and Nickolas
Faust, “Building the Visual Earth,” Vol. 4744B, SPIE 16th International
Conference on Aerospace/Defense Sensing, Simulation, and
Controls (2002).
• William Ribarsky, Tony Wasilewski, and Nickolas Faust, “From Urban
Terrain Models to Visible Cities,” to be published, IEEE CG&A.
• Justin Jang, William Ribarsky, Chris Shaw, and Nickolas Faust,
“View-Dependent Multiresolution Splatting of Non-Uniform Data,” pp.
125-132, Eurographics-IEEE Visualization Symposium 2002.
User Studies of Multimodal Interface (cont.)
[Images: the Gesture Pendant recognizing hand gestures]
Results
User task: objects are located, navigated to, and identified.
•The mouse interface performs best, followed by speech alone,
multimodal, and gesture alone.
•When mouse is not available or easy to use, a speech interface is a
good alternative for navigation tasks.
•Better, faster recognition of gestures could significantly improve
performance of the multimodal interface.