Active Range Imaging Datasets for Indoor Surveillance

C. Distante, G. Diraco, A. Leone
Institute for Microelectronics and Microsystems – CNR, Lecce (Italy)
One day BMVA symposium “Security and surveillance: performance evaluation” – London (UK), December 12, 2007
• Introduction
• Outline of active range vision
□ Range imaging technologies
□ Properties of Time-Of-Flight range sensors
• Active range vision vs passive vision
□ Comparison between TOF camera and stereo vision
□ Advantages and drawbacks in surveillance contexts
• Datasets for indoor surveillance
• Case study
□ TOF sensor-based fall detection
• Conclusions
Introduction
In recent years several active range sensors have been introduced
(Canesta Inc., Mesa Imaging AG, 3DV Systems Ltd, …).
The ability to describe scenes in three dimensions opens new
scenarios and opportunities in many applications, including
visual monitoring (object detection, tracking, recognition,
image understanding), security, biometrics, automotive, robotics,
medical imaging, …
Active range sensors provide depth information, which allows much
simpler algorithms to be used and problems to be approached
in a new, robust and cost-efficient way.
Datasets are presented in order to suggest a common basis for the
comparative analysis of vision algorithms.
Range Imaging (RIM) is the fusion of two different technologies,
integrating depth measurement and imaging aspects.
It’s a new measurement technique, not yet well-known and investigated.
Depth Measurement Techniques Taxonomy
Contactless depth measurement
• Triangulation (stereoscopy, structured light - sub-millimeter resolution)
• Interferometry (light source scanning - sub-micrometer resolution)
• Time-Of-Flight (millimeter resolution)
□ Pulse (direct measure)
□ Continuous wave modulation (indirect measure, eye-safe)
Principles of phase shift modulation-based TOF sensors
Depth is estimated by measuring the phase shift φ that the modulated
signal accumulates over its round trip from the device to the target and back.
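As a rough illustration of the phase-shift principle, the sketch below converts a phase shift into a depth using the usual continuous-wave relation d = c·φ / (4π·f_mod). The function name and the 20 MHz modulation frequency are assumptions chosen for the example; they are not taken from the slides.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def phase_to_depth(phase_rad: float, mod_freq_hz: float) -> float:
    """Depth in meters for a measured phase shift of the modulated signal.

    The light covers the device-target distance twice (round trip),
    which is why the denominator uses 4*pi instead of 2*pi.
    """
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)


# Illustrative values only: a quarter-cycle phase shift at an assumed 20 MHz.
depth = phase_to_depth(math.pi / 2, 20e6)
print(f"{depth:.3f} m")
```

With these assumed values a quarter-cycle phase shift corresponds to a little under 1.9 m.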
[Figure: intensity image and depth image; depth scale from low distance to high distance]
Main features of modulation-based TOF sensors
• standard CMOS technology
• high frame rate (up to 30 fps)
• fairly good spatial resolution (up to QCIF @ 176x144 pixels)
• fairly good field of view (up to 80x80 degrees)
• aliasing effects (non-ambiguity range up to 30 meters)
• low depth measurement error (< 1% in non-ambiguity range)
• direct Cartesian coordinate output (x, y, z) for 3D reconstruction
• built-in band-pass optics for background light suppression
• illumination power less than 1W (LED array, Class 1 for eye-safe)
Two critical parameters affect performance:
• modulation frequency (a design parameter that mainly
affects the non-ambiguity range)
• integration time (it can be adjusted, and it affects depth
resolution and frame rate)
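To make the link between modulation frequency and non-ambiguity range concrete, here is a minimal sketch based on the standard relation range = c / (2·f_mod). The frequencies in the loop are illustrative assumptions, not values quoted in the slides.

```python
C = 299_792_458.0  # speed of light in m/s


def non_ambiguity_range(mod_freq_hz: float) -> float:
    """Largest depth (meters) measurable before the phase wraps around."""
    return C / (2.0 * mod_freq_hz)


# Assumed example frequencies; a lower frequency gives a longer range.
for f_mhz in (5, 10, 20):
    print(f"{f_mhz} MHz -> {non_ambiguity_range(f_mhz * 1e6):.2f} m")
```

Under this relation a modulation frequency of about 5 MHz gives roughly the 30 meters mentioned above, while about 20 MHz gives roughly 7.5 meters.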
Comparison of the most important characteristics
of TOF cameras and stereo vision systems
• Depth resolution
□ TOF sensor: sub-centimeter (if chromaticity conditions are satisfied)
□ Stereo (passive) vision: sub-millimeter (if images are highly textured)
• Spatial resolution
□ TOF sensor: medium (up to QCIF, 176x144 pixels)
□ Stereo (passive) vision: high (over 4CIF, 704x576 pixels)
• Portability
□ TOF sensor: same dimensions as a normal camera
□ Stereo (passive) vision: two video cameras are needed, plus an external light source
• Computational efforts
□ TOF sensor: on-board FPGA for phase and intensity measurement
□ Stereo (passive) vision: high workload (the calibration step and the correspondence search are hard)
• Cost
□ TOF sensor: high for a customizable prototype (1500€ - 5000€)
□ Stereo (passive) vision: depends on the quality of the stereo vision system
Advantages in the use of TOF
sensors in surveillance contexts
• Illumination conditions
□ TOF sensor: accurate depth measurement in all illumination conditions
□ Passive vision: sensitive to illumination variations and artificial lights; unable to operate in dark environments
• Shadows presence
□ TOF sensor: shadows do not affect the principal steps of monitoring applications
□ Passive vision: reduced performance in segmentation, recognition, …
• Partial occlusions
□ TOF sensor: partially occluded objects are detected as separate (if they are at different depths)
□ Passive vision: due to projective ambiguity, occlusions are difficult to detect (merging blobs) and to handle (predictive tracking strategies)
• Objects appearance
□ TOF sensor: camouflage is avoided, but appearance could affect depth precision (chromaticity dependence)
□ Passive vision: camouflage effects occur when foreground and background have the same appearance properties
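The partial-occlusion point in the comparison above can be sketched as follows: pixels that merge into one blob in the image plane are split again when their depths differ. This is only an illustrative one-dimensional grouping, not the method used by the authors; the function name, the 0.3 m gap threshold and the sample depths are all assumptions.

```python
def split_by_depth(depths_m, gap_m=0.3):
    """Group depth samples into clusters separated by gaps larger than gap_m."""
    clusters = []
    for d in sorted(depths_m):
        # Extend the current cluster while consecutive depths stay close.
        if clusters and d - clusters[-1][-1] <= gap_m:
            clusters[-1].append(d)
        else:
            clusters.append([d])
    return clusters


# Hypothetical depths (meters) of pixels from a single merged blob:
# one person at about 2.1 m partially occluding another at about 3.4 m.
blob = [2.10, 2.15, 2.05, 3.40, 3.45, 2.12, 3.50]
groups = split_by_depth(blob)
print(len(groups), [len(g) for g in groups])
```

A real system would group pixels jointly in image position and depth; sorting depths alone is the simplest way to show why a depth gap separates objects that overlap in the image.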
Drawbacks in the use of TOF
sensors in surveillance contexts
• Aliasing: it limits the non-ambiguity range, i.e. the maximum measurable depth (up to 30 meters)
• Multi-path effects: depth measurement is strongly corrupted when the target surface presents corners
• Objects reflection properties: materials having different colors exhibit dissimilar reflection properties that affect reflected light intensity and, therefore, depth resolution
• Field of view: usually it is limited, so the sensor must be positioned accurately; a pan-tilt architecture could be used
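The aliasing drawback can be illustrated with an idealized model in which a phase-based sensor reports the true depth modulo its non-ambiguity range. The sketch below assumes a noise-free sensor and an illustrative 20 MHz modulation frequency; neither is specified in the slides.

```python
C = 299_792_458.0  # speed of light in m/s


def measured_depth(true_depth_m: float, mod_freq_hz: float) -> float:
    """Depth reported by an ideal phase-based sensor (meters).

    Targets beyond the non-ambiguity range wrap around and appear closer.
    """
    ambiguity_range = C / (2.0 * mod_freq_hz)
    return true_depth_m % ambiguity_range


# Assumed 20 MHz modulation: the non-ambiguity range is about 7.5 m.
for true_depth in (3.0, 9.0):
    print(true_depth, "->", round(measured_depth(true_depth, 20e6), 3))
```

In this model a target at 3 m is reported correctly, while a target at 9 m wraps around and is reported at about 1.5 m.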
Datasets description
• Each sequence is acquired at QCIF resolution by a state-of-the-art
TOF sensor (MESA SR-3000) and is composed of 1800 frames
captured at a variable frame rate (by varying the integration time)
• Sequences have been acquired in wall/ceiling-mounting
configurations, at different subject orientations and in the
presence/absence of occlusions, in order to cover a large number of events
• Extrinsic parameters are available for
calibration purposes (camera-floor distance,
camera orientation, scene depth)
• Sequences present people (one or more persons
in the scene) in different postures (standing,
sitting, lying down, bent, squatting)
• In the sequences, persons show different behaviours (walking, falling
down, moving objects, picking up objects, limping)
• Datasets are at http://siplab.le.imm.cnr.it
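One possible use of the extrinsic parameters listed above is to convert a point from camera coordinates into a height above the floor. The sketch below is hypothetical: it assumes the z axis lies along the optical axis, the y axis points downward in the image, the tilt is measured downward from the horizontal, and there is no roll. The actual axis convention of the sensor and of the datasets may differ.

```python
import math


def height_above_floor(y_cam, z_cam, camera_height_m, tilt_deg):
    """Height (meters) of a point above the floor.

    Assumed convention (not from the slides): z along the optical axis,
    y pointing down in the image, tilt measured downward from horizontal.
    """
    tilt = math.radians(tilt_deg)
    # Vertical drop of the point below the camera centre.
    drop = y_cam * math.cos(tilt) + z_cam * math.sin(tilt)
    return camera_height_m - drop


# Ceiling-mounted camera looking straight down from 2.5 m:
# a point 2.5 m along the optical axis lies on the floor.
print(height_above_floor(0.0, 2.5, 2.5, 90.0))

# Wall-mounted camera at 2.0 m looking horizontally:
# a point 0.5 m below the optical axis is 1.5 m above the floor.
print(height_above_floor(0.5, 3.0, 2.0, 0.0))
```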
Datasets description
A generic frame (176x144 pixels) of each sequence has the following
structure:
Raw data
1. Depth image (2Bytes/pixel – unsigned integer)
2. Intensity image (2Bytes/pixel – unsigned integer)
FPGA processed data
3. Depth image with noise reduction (2Bytes/pixel – unsigned integer)
4. Intensity image with noise reduction (2Bytes/pixel – unsigned integer)
5. Cartesian x coordinate (4Bytes/pixel – signed float)
6. Cartesian y coordinate (4Bytes/pixel – signed float)
7. Cartesian z coordinate (4Bytes/pixel – signed float)
Each frame therefore carries a large amount of data (495KBytes).
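The 495 KBytes figure can be checked from the frame structure above. The sketch below also computes per-plane byte offsets, but only under the assumption that the seven planes are stored back to back in the listed order; the plane names are hypothetical and the slides do not describe the actual file layout.

```python
WIDTH, HEIGHT = 176, 144  # QCIF resolution stated in the slides

# (hypothetical name, bytes per pixel) for the seven planes of one frame
PLANES = [
    ("depth_raw", 2),
    ("intensity_raw", 2),
    ("depth_filtered", 2),
    ("intensity_filtered", 2),
    ("x", 4),
    ("y", 4),
    ("z", 4),
]


def frame_size_bytes():
    """Total size of one frame in bytes."""
    return WIDTH * HEIGHT * sum(bpp for _, bpp in PLANES)


def plane_offsets():
    """Byte offset of each plane, ASSUMING back-to-back storage in listed order."""
    offsets, pos = {}, 0
    for name, bpp in PLANES:
        offsets[name] = pos
        pos += WIDTH * HEIGHT * bpp
    return offsets


size = frame_size_bytes()
print(size, "bytes =", size / 1024, "KBytes")
```

The total is 506880 bytes, i.e. exactly 495 KBytes with 1 KByte = 1024 bytes.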
Aliasing and multipath effects
[Figure: intensity image and depth image; depth scale from 0.5 meters to 7.5 meters]
• Noise is due to highly reflective objects (LCD TV)
• Fluctuations are due to the continuously adjusted emitted light power
[Figure: intensity image and depth image]
Could the depth information help the tracking
in the presence of total occlusions?
[Figure: intensity image and depth image]
TOF sensor-based fall detection
• Mixture of Gaussians for background modelling
• Bayesian approach for segmentation
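The slides name the two building blocks without giving details, so the following is only a simplified sketch of a per-pixel mixture-of-Gaussians background model applied to depth values. It is not the authors' implementation: the class name and all parameters are assumptions, modes are matched in list order rather than ranked, and a plain weight threshold stands in for the Bayesian segmentation step.

```python
class PixelMoG:
    """Simplified per-pixel mixture of Gaussians over depth values (meters)."""

    def __init__(self, k=3, alpha=0.05, init_var=0.04,
                 match_sigmas=2.5, bg_weight=0.6):
        self.k = k                        # maximum number of modes
        self.alpha = alpha                # learning rate
        self.init_var = init_var          # variance given to a new mode
        self.match_sigmas = match_sigmas  # match radius in standard deviations
        self.bg_weight = bg_weight        # minimum weight of a background mode
        self.modes = []                   # each mode is [weight, mean, variance]

    def update(self, depth):
        """Update the model with one sample; return True if it is foreground."""
        matched = None
        for mode in self.modes:
            if abs(depth - mode[1]) <= self.match_sigmas * mode[2] ** 0.5:
                matched = mode
                break
        # Raise the weight of the matched mode and decay all the others.
        for mode in self.modes:
            hit = 1.0 if mode is matched else 0.0
            mode[0] += self.alpha * (hit - mode[0])
        if matched is not None:
            diff = depth - matched[1]
            matched[1] += self.alpha * diff
            matched[2] += self.alpha * (diff * diff - matched[2])
            matched[2] = max(matched[2], 1e-4)  # keep the variance positive
        else:
            # No mode explains the sample: replace the weakest mode if full.
            if len(self.modes) >= self.k:
                self.modes.remove(min(self.modes, key=lambda m: m[0]))
            self.modes.append([self.alpha, depth, self.init_var])
            matched = self.modes[-1]
        total = sum(m[0] for m in self.modes)
        for mode in self.modes:
            mode[0] /= total
        return matched[0] < self.bg_weight


# Hypothetical depth stream for one pixel: a wall at 3.0 m, then a person at 1.5 m.
model = PixelMoG()
for _ in range(50):
    model.update(3.0)
print(model.update(1.5), model.update(3.0))
```

In this toy stream the sample at 1.5 m is flagged as foreground because it starts a new low-weight mode, while the wall at 3.0 m remains background.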
Conclusions
In indoor surveillance applications, range images provide a better
perception of scenes in all illumination conditions, which argues
against cheap stereo systems, since these fail in dark or
low-textured environments.
If the critical parameters of the TOF sensor are properly adjusted,
reliable, computationally cheap, real-time segmentation/tracking can
be achieved using depth measurements alone, since intensity images
present unwanted fluctuations.
Depth information overcomes projective ambiguity, whereas the
intensity image provides appearance information, so that their
joint use improves critical steps (object recognition,
behavior analysis, …) and allows a better description of moving
objects.
The suggested datasets provide a common basis for investigating vision
algorithms; they can be improved by defining ground-truth data to
quantify performance.