Head-tracking virtual 3-D display for mobile devices
Miguel Bordallo López*, Jari Hannuksela*, Olli Silvén* and Lixin Fan**
* University of Oulu, Finland
** Nokia Research Center, Tampere, Finland
MACHINE VISION GROUP
Contents
Introduction
Head-tracking 3D virtual display
• Interaction design
• Face tracking for mobile devices
• Mobile device constraints
  • Field of view
  • Energy efficiency
Implementation
Latency considerations
Performance
Summary
Introduction
3D virtual displays
• Calculate the position of the user relative to the screen
• Calculate the angle of the user's point of view
• Render an image according to that point of view
• The result is a virtual window:
  - Shows realistic 3D objects
  - Based on the parallax effect
* Video from Johnny Lee (Wiimote head-tracking project)
The position information is used to render the 3D UI/content as if the user watched it from different angles.
The technology enables users to view the content from different angles and become more immersed.
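As a rough sketch of the geometry involved (not taken from the presentation), the viewing angle can be derived from the tracked face position under a pinhole-camera assumption; the function and parameter names below are illustrative:

    #include <cmath>

    // Sketch only (not the authors' implementation): estimate the user's
    // horizontal and vertical viewing angles from the tracked face centre,
    // assuming a pinhole camera with known half-field-of-view angles.
    struct ViewAngles { float yawDeg; float pitchDeg; };

    ViewAngles estimateViewAngles(float faceCxPx, float faceCyPx,  // face centre (pixels)
                                  float imgWPx,  float imgHPx,     // frame size (pixels)
                                  float halfFovXDeg, float halfFovYDeg)
    {
        const float kPi = 3.14159265f;
        // Normalised offset of the face centre from the image centre, in [-1, 1].
        float nx = (faceCxPx - 0.5f * imgWPx) / (0.5f * imgWPx);
        float ny = (faceCyPx - 0.5f * imgHPx) / (0.5f * imgHPx);
        // Pinhole model: the normalised offset maps to an angle via the tangent.
        float yaw   = std::atan(nx * std::tan(halfFovXDeg * kPi / 180.f)) * 180.f / kPi;
        float pitch = std::atan(ny * std::tan(halfFovYDeg * kPi / 180.f)) * 180.f / kPi;
        return { yaw, pitch };
    }

The renderer would then use angles like these as an off-axis camera offset, so the scene shifts against the screen edges like a real window.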
Introduction
Mobile 3D virtual displays
• A mobile head-coupled display can take advantage of the small size
• Movement of either the user or the device
• Mobile devices have cameras and sensors integrated
• No need for external peripherals
• Can increase UI functionality
• New applications and concepts
• Realistic 3D objects can be rendered and perceived
• New interaction methods can be developed
• We know what the user is looking at and can use that information
Demo
Head-tracking mobile virtual 3D display
A simple use case
Interaction design
Introduction
Mobile face tracking
• Head-coupled displays require robust and fast face tracking
• Based on multiscale LBP features, a cascade classifier and AdaBoost (see the sketch below)
• Excellent results in face recognition and authentication, face detection, facial expression recognition, and gender classification
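For orientation only, a minimal LBP-cascade detection step in the same spirit, written here against OpenCV's cv::CascadeClassifier rather than the authors' own tracking library (an assumption made for illustration):

    #include <opencv2/objdetect.hpp>
    #include <opencv2/imgproc.hpp>
    #include <vector>

    // Illustrative LBP-cascade face detection via OpenCV (an assumption: the
    // presentation uses its own face-tracking library, not OpenCV).
    std::vector<cv::Rect> detectFaces(const cv::Mat& frameBgr,
                                      cv::CascadeClassifier& cascade)
    {
        cv::Mat gray;
        cv::cvtColor(frameBgr, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);              // normalise illumination

        std::vector<cv::Rect> faces;
        // Multiscale scan: the boosted cascade is run over an image pyramid.
        cascade.detectMultiScale(gray, faces,
                                 1.1,              // pyramid scale step
                                 3,                // min neighbours to accept a hit
                                 0,                // flags (unused)
                                 cv::Size(40, 40)); // smallest face considered
        return faces;
    }
    // Usage: cv::CascadeClassifier c("lbpcascade_frontalface.xml");
    //        auto faces = detectFaces(frame, c);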
Introduction
Evaluating the distance to the screen
• Essential for computing the relative viewing angle
• Ground truth determined with a Kinect sensor
• Two methods evaluated:
  • Face size obtained from face tracking (sketched below):
    • No extra computation needed
    • Good accuracy
    • Flickering between frames
  • Motion estimation library (Harris corners + BLUE):
    • Computes changes of scale between frames
    • About 10% better accuracy
    • Less flickering between frames
    • Needs extra computation:
      • Introduces latency, decreases frame rate
      • Worse input sequence for tracking
      • More differences between frames
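A minimal sketch of the face-size method, assuming a pinhole camera and an average real face width as calibration constants (both illustrative):

    // Sketch of the face-size method under a pinhole-camera assumption: the
    // distance is inversely proportional to the detected face width in pixels.
    // Both calibration constants below are illustrative, not from the slides.
    float distanceFromFaceWidth(float faceWidthPx,            // from the tracker
                                float focalLengthPx,          // camera calibration
                                float realFaceWidthCm = 15.f) // assumed average
    {
        // Pinhole model: faceWidthPx = focalLengthPx * realFaceWidthCm / distanceCm
        return focalLengthPx * realFaceWidthCm / faceWidthPx;
    }

The motion-estimation variant would instead update the previous estimate by the inverse of the measured inter-frame scale change, trading the extra computation for roughly 10% better accuracy and less flicker, as the slide notes.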
Mobile constraints
Field of view
• The front camera is on the device's corner and does not point at the user:
  • Reduced field of view (< 45°)
  • Asymmetric FoV
  • Even more reduced effective FoV
  • Considerable minimum distance to the screen
• The user is often outside the field of view
  • Tracking is sometimes lost
  • Need to show a viewfinder on the screen
Mobile constraints
Field of view
• Implemented solution: wide-angle lens
  • Dramatically increases the effective field of view (< 160°)
  • Requires a calibrated lens
  • Requires a de-warping routine
    • Implemented with lookup tables (sketched below)
  • Problems when several faces are in the field of view
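A minimal sketch of lookup-table de-warping, assuming the table was precomputed offline from the lens calibration (all names are illustrative):

    #include <cstdint>
    #include <vector>

    // Illustrative LUT-based de-warping: for every output pixel the table
    // stores the source pixel index in the distorted wide-angle frame. The
    // table itself would be filled once from the lens calibration (not shown).
    struct DewarpLut {
        int width, height;               // output image size
        std::vector<int32_t> srcIndex;   // width*height indices into the source
    };

    void dewarp(const uint8_t* src, uint8_t* dst, const DewarpLut& lut)
    {
        const int n = lut.width * lut.height;
        for (int i = 0; i < n; ++i)
            dst[i] = src[lut.srcIndex[i]];   // one indexed fetch per output pixel
    }

Precomputing the table once keeps the per-frame cost at a single indexed fetch per output pixel, which is presumably why the lookup-table approach suits mobile hardware.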
Mobile constraints
Energy efficiency
• The practical challenge of a camera-based UI is keeping the camera always active
  • Lower frame rate -> high UI start-up latencies
  • Higher frame rate -> low energy efficiency
• The application processor (even in mobile devices) is power hungry
  • Specific processors closer to the sensors are needed
• Current devices include HW codecs and GPUs:
  • Better energy efficiency due to small EPI (energy per instruction)
• Mobile GPUs are already programmable:
  • OpenGL ES
  • OpenCL Embedded Profile
Energy efficiency
GPU-accelerated face tracking
• The GPU can be treated as an independent entity
• Can be used concurrently with the CPU (see the sketch below)
• The GPU is used for feature extraction (format conversion + multiscaling + LBP)
• Mobile GPUs are still not very efficient for certain tasks
(Table: computational and energy costs per VGA frame of feature extraction)
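As an illustration of the CPU/GPU concurrency mentioned above, a hypothetical two-stage pipeline could overlap GPU feature extraction of the next frame with CPU classification of the current one; the stage functions below are stubs, not the presentation's code:

    #include <future>

    // Hypothetical two-stage pipeline: while the CPU classifies frame N, the
    // GPU already extracts features from frame N+1. All stages are stubs.
    struct Frame {};
    struct Features {};

    Features extractFeaturesOnGpu(const Frame&) { return {}; } // conversion + multiscale + LBP (stub)
    void     runCascadeOnCpu(const Features&)   {}             // AdaBoost cascade (stub)
    Frame    grabFrame()                        { return {}; } // camera capture (stub)

    void trackLoop(int nFrames)
    {
        std::future<Features> inFlight =
            std::async(std::launch::async, extractFeaturesOnGpu, grabFrame());
        for (int i = 0; i < nFrames; ++i) {
            Features ready = inFlight.get();       // features of frame i
            inFlight = std::async(std::launch::async,
                                  extractFeaturesOnGpu, grabFrame()); // start frame i+1
            runCascadeOnCpu(ready);                // CPU stage overlaps the GPU stage
        }
        inFlight.get();                            // drain the pipeline
    }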
Implementation
• Demo platform: N900 (Qt + GStreamer + OpenGL ES)
• Based on an external face-tracking library
• Implementation details:
  • Input image resolution: 320x240
  • Frame rate: 16-20 fps
  • Base latency: 90-100 ms
  • Accepted field of view: < 45° horizontal, < 35° vertical
  • User's distance range: 25-300 cm
Implementation
Simple block diagram
Implementation
Task distribution
(Block diagram: Camera module → Application Processor (CPU) → Graphics Processor (GPU) → Touchscreen display)
Mobile constraints
Latency
• User interface latency is a critical issue
  • Latency above 100 ms is very disturbing
• Realistic 3D rendering is even more sensitive
  • The illusion breaks if the rendered view corresponds to where the user was a moment ago
Mobile constraints
Latency hiding
• A possible solution: latency hiding
• Requires good knowledge of the system's timing
• Predict the current position from the motion vector (sketched below)
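A minimal sketch of the prediction step, assuming constant head velocity over the known pipeline latency (names are illustrative):

    // Sketch of latency hiding by linear prediction (an assumption consistent
    // with the slide: extrapolate along the motion vector over the latency).
    struct Vec2 { float x, y; };

    Vec2 predictHeadPosition(Vec2 lastPos,        // last tracked position
                             Vec2 velocityPerSec, // motion vector per second
                             float latencySec)    // camera-to-display delay
    {
        // Constant-velocity assumption over the short latency interval.
        return { lastPos.x + velocityPerSec.x * latencySec,
                 lastPos.y + velocityPerSec.y * latencySec };
    }
    // e.g. with the ~100 ms base latency reported earlier:
    // Vec2 now = predictHeadPosition(p, v, 0.1f);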
Performance
• Demo platform: Nokia N900
  • ARM Cortex-A8, 600 MHz + PowerVR SGX535 GPU
• Comparison platform: Nokia N9
  • ARM Cortex-A8, 1 GHz + PowerVR SGX535 GPU
Remaining problems
• Face-tracking based 3D user interfaces provide support for new concepts
• Face tracking can be offered as a platform-level service
• Current mobile platforms still present several shortcomings:
  • Energy efficiency compromises battery life
  • The camera is not designed for UI purposes
  • A single camera makes 3D context recognition difficult
Thank you
Any questions?
LBP fragment shader implementation
• Uses the OpenGL ES interface
• Two versions:
  – Version 1: calculates the LBP map in one grayscale channel
  – Version 2: calculates 4 LBP maps in the RGBA channels
• Shader steps (see the sketch below):
  • Access the image via texture lookup
  • Fetch the selected picture pixel
  • Fetch the neighbours' values
  • Compute the binary vector
  • Multiply by the weighting factor
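A minimal sketch of what version 1 might look like as a GLSL ES fragment shader, embedded here as a C++ string constant; it follows the listed steps but is not the presentation's actual shader:

    // Illustrative GLSL ES fragment shader for the grayscale (version 1) LBP
    // map, following the steps on the slide; not the authors' actual source.
    static const char* kLbpFragmentShader = R"(
    precision mediump float;
    uniform sampler2D u_image;   // grayscale input texture
    uniform vec2 u_texel;        // 1.0 / texture dimensions
    varying vec2 v_uv;

    void main() {
        // Fetch the selected (centre) pixel via texture lookup.
        float c = texture2D(u_image, v_uv).r;
        float code = 0.0;
        // Fetch the 8 neighbours; step(c, n) is 1.0 when n >= c, giving the
        // binary vector, accumulated with power-of-two weighting factors.
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2(-1.0, -1.0)).r) *   1.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2( 0.0, -1.0)).r) *   2.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2( 1.0, -1.0)).r) *   4.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2( 1.0,  0.0)).r) *   8.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2( 1.0,  1.0)).r) *  16.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2( 0.0,  1.0)).r) *  32.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2(-1.0,  1.0)).r) *  64.0;
        code += step(c, texture2D(u_image, v_uv + u_texel * vec2(-1.0,  0.0)).r) * 128.0;
        gl_FragColor = vec4(code / 255.0, 0.0, 0.0, 1.0);  // LBP code in one channel
    }
    )";

Version 2 would apply the same logic to four image regions packed into the RGBA channels, producing four LBP maps per pass.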
Preprocessing
(Pipeline: create quad → divide texture & convert to grayscale → render each piece in one channel; a sketch of the grayscale step follows)
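The grayscale-conversion step could, for instance, be a standard luma shader (illustrative, not the presentation's code):

    // Illustrative GLSL ES fragment shader for the grayscale conversion step:
    // a standard luma dot product; not the presentation's actual shader.
    static const char* kGrayscaleShader = R"(
    precision mediump float;
    uniform sampler2D u_image;   // RGB(A) camera texture
    varying vec2 v_uv;

    void main() {
        vec3 rgb = texture2D(u_image, v_uv).rgb;
        // Rec. 601 luma weights (an assumption; any grayscale mapping would do).
        float y = dot(rgb, vec3(0.299, 0.587, 0.114));
        gl_FragColor = vec4(vec3(y), 1.0);
    }
    )";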
GPU-assisted face analysis process