Tracking objects across cameras by incrementally learning inter camera colour calibration and patterns of activity (ECCV 2006)
Andrew Gilbert and Richard Bowden University of Surrey CSE 252C, Fall 2006 UCSD 1
Inter Camera Tracking
As the number of cameras in a network increases, a human operator's ability to manage such vast amounts of information becomes limited.
We would like to automatically track objects across a camera network: tracking an object throughout each camera's field of view (FOV), with successful object "handover" between cameras in the network. We would like to do this without any explicit geometric or color calibration.
Proposed Algorithm
Using an incremental learning method, the inter-camera color variations and the probability distributions of spatio-temporal links between cameras are modeled.
Requires no color or spatial pre-calibration and no batch processing. We want the algorithm to work immediately, to improve its performance as more data becomes available, and to adapt to changes in the cameras' environments.
Test Environment
Four non-overlapping color cameras in an office building.
Intra-Camera Tracking
The static background color distribution is modeled.
A Gaussian mixture model on a per-pixel basis is used to form the foreground vs. background pixel segmentation, learned using an online approximation to expectation maximization. Shadows are identified and removed by relaxing a model's constraint on intensity but not chromaticity, and foreground objects are formed using connected-component analysis on the resulting binary segmentation. Objects are linked temporally with a Kalman filter to provide movement trajectories within each camera.
(Kalman Filter overview later!)
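The per-pixel segmentation step can be sketched with a simplified single-Gaussian background model (the paper uses a Gaussian mixture per pixel learned with an online EM approximation; the learning rate and threshold below are illustrative assumptions, not the paper's values):

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, k=2.5):
    """Online per-pixel background model: a simplified single-Gaussian
    variant of the paper's per-pixel Gaussian mixture.
    mean, var: per-pixel background mean and variance (float arrays)
    frame:     current grayscale frame (float array, same shape)
    Returns (foreground_mask, new_mean, new_var)."""
    dist = np.abs(frame - mean)
    fg = dist > k * np.sqrt(var)          # pixel poorly explained -> foreground
    rho = alpha * (~fg)                   # update only background pixels
    new_mean = (1 - rho) * mean + rho * frame
    new_var = (1 - rho) * var + rho * (frame - new_mean) ** 2
    return fg, new_mean, new_var
```

Feeding a static background yields an empty mask; a new object appearing produces foreground pixels, which connected-component analysis would then group into objects.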
Object Description
Once foreground objects have been identified, an object descriptor is formed.
A color histogram is used to describe the color fingerprint of the object: it is spatially invariant, simple and efficient to compute, and through quantization it also provides some invariance to changes in color appearance. Similarity between objects is given by intersection of their histogram descriptors.
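The intersection measure is simple to sketch; for histograms normalized to sum to one, the similarity lies in [0, 1]:

```python
import numpy as np

def hist_intersection(h1, h2):
    """Histogram intersection similarity: sum of bin-wise minima.
    For normalized histograms the result lies in [0, 1],
    with 1 meaning identical color fingerprints."""
    return np.minimum(h1, h2).sum()
```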
CCCM
Consensus-Colour Conversion of Munsell colour space (CCCM) breaks RGB color into 11 basic colors. Each basic color represents a perceptual color category established through a physiological study of how humans categorize color.
CCCM works best without color calibration of the cameras in the network: it provides consistent inter-camera descriptors without calibration by relying on the perceptual consistency of color. If an object is perceived as red in both images, then CCCM will provide a consistent result. With calibration, quantized RGB performs best (we'll use this later).
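A rough sketch of CCCM-style quantization as nearest-prototype matching. The RGB prototypes below are assumed for illustration only; the actual CCCM quantization uses category boundaries derived from the Munsell colour space, not prototype distances:

```python
import numpy as np

# Illustrative RGB prototypes for the 11 basic color categories
# (assumed values -- NOT the actual CCCM category boundaries).
PROTOTYPES = {
    "black":  (0, 0, 0),       "white":  (255, 255, 255),
    "grey":   (128, 128, 128), "red":    (255, 0, 0),
    "green":  (0, 128, 0),     "blue":   (0, 0, 255),
    "yellow": (255, 255, 0),   "orange": (255, 165, 0),
    "brown":  (139, 69, 19),   "pink":   (255, 192, 203),
    "purple": (128, 0, 128),
}
NAMES = list(PROTOTYPES)
PROTO = np.array([PROTOTYPES[n] for n in NAMES], dtype=float)

def cccm_label(rgb):
    """Map one RGB triple to the nearest of the 11 basic colors."""
    d = np.linalg.norm(PROTO - np.asarray(rgb, dtype=float), axis=1)
    return NAMES[int(np.argmin(d))]
```

Counting labels over an object's pixels then gives the 11-bin CCCM histogram used as the descriptor.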
Color Descriptor
Building Temporal links between Cameras
Assume that objects follow similar routes between cameras, and that these repetitions form consistent trends in the data. The temporal inter-camera links can be used to link camera regions together, yielding a probability distribution of object movement between cameras.
As the number of cameras in the network increases, the number of possible links increases exponentially, and the majority of these links are invalid because they correspond to impossible routes. We want a method that distinguishes valid from invalid links.
Building Temporal Links between Cameras
Within each camera's field of view, the tracking algorithm forms a color descriptor for each object: the median histogram B = (b1, b2, ..., bn) recorded over the object's entire trajectory within a single camera. Each new object is compared to previous objects within a given time window, T.
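The windowed comparison can be sketched as accumulating a delay histogram per region pair. This is a simplified version under assumed data structures: every cross-region reappearance within T is counted, and genuine links emerge as peaks above the noise floor of coincidental pairings:

```python
from collections import defaultdict

def build_links(sightings, T, bin_width=1.0):
    """sightings: list of (region, time) tuples sorted by time.
    Every pair of sightings in different regions separated by at most
    T contributes one count to that region pair's delay histogram;
    consistent routes show up as peaks above the noise floor."""
    links = defaultdict(lambda: defaultdict(int))
    for i, (r1, t1) in enumerate(sightings):
        for r2, t2 in sightings[i + 1:]:
            dt = t2 - t1
            if dt > T:
                break                      # sightings are time-sorted
            if r1 != r2:
                links[(r1, r2)][int(dt // bin_width)] += 1
    return links
```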
Region Links
An example probability distribution showing a distinct link between two regions.
Incremental Block subdivision and recombination
The system is based on rectangular subdivision. Initially there is one block for each of the four cameras. If the maximum peak of a link's distribution exceeds the noise floor, this indicates a possible correlation between the blocks.
If a correlation exists, the block is subdivided into four equal-sized blocks, and previous data together with new data are used to form links between the newly formed sub-blocks. If there is no correlation, the links are discarded to minimize maintenance. Blocks found to have distributions similar to their neighbors' are recombined, which reduces the number of blocks and links maintained and increases accuracy.
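The subdivide-on-correlation step might look like this (the noise-floor test and the (x, y, w, h) block representation are assumptions for illustration):

```python
import numpy as np

def subdivide_if_correlated(block, delay_hist, noise_floor):
    """block: (x, y, w, h) rectangle.  If the delay histogram's peak
    exceeds the noise floor (evidence of a real inter-camera link),
    split the block into four equal sub-blocks, as in the incremental
    subdivision step.  Returns the four children, or the original."""
    x, y, w, h = block
    if delay_hist.size and delay_hist.max() > noise_floor:
        hw, hh = w / 2, h / 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    return [block]
```

Recombination would be the reverse test: neighboring blocks whose distributions are similar get merged back into one.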
Calculating Posterior Appearance Distributions
Given an object that disappears in region y, we can model its reappearance probability over time, with a weight attached to each time step t. This probability is used to weight the observation likelihood obtained through color similarity, giving a posterior probability of a match.
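The weighting step can be sketched as multiplying the color-similarity likelihoods by the learned temporal reappearance probabilities and normalizing (a minimal Bayesian-weighting sketch, not the paper's exact formulation, which is not shown in the slides):

```python
import numpy as np

def match_posterior(color_sims, reappearance_probs):
    """Weight color-similarity likelihoods by the learned temporal
    reappearance probabilities and normalize, giving a posterior
    over the candidate matches."""
    post = np.asarray(color_sims, dtype=float) * np.asarray(reappearance_probs, dtype=float)
    s = post.sum()
    return post / s if s > 0 else post
```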
Modeling Color Variations
The CCCM color quantization assumes a similar color response between cameras.
CCCM is used as the initial color descriptor while, in parallel, color transformation matrices are formed between cameras. The tracked people themselves are used as the calibration objects, and each transformation matrix is built incrementally to model the color changes between cameras.
Modeling Color Variations
Six transformations T12, T13, T14, T23, T24, T34 (together with their inverses) provide the twelve transformations needed to map objects between the four cameras. This allows a less coarse quantization (RGB) to be used, with improved performance.
Modeling Color Variations
There is a color transformation matrix that transforms the color space of one camera into that of another. For each tracked object, this transformation matrix is found (using SVD) and averaged into the aggregate color transformation matrix, which is initially set to the identity matrix.
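A sketch of the per-object transform estimate and the incremental average. Here np.linalg.lstsq solves the least-squares problem with an SVD-based routine, standing in for the slides' SVD step; the exact formulation in the paper is not shown:

```python
import numpy as np

def estimate_color_transform(src, dst):
    """Least-squares 3x3 matrix M with dst ~= src @ M, from one tracked
    object's color samples seen in two cameras (rows are RGB samples).
    np.linalg.lstsq uses an SVD-based LAPACK solver."""
    M, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return M

def update_aggregate(M_agg, M_new, n):
    """Running average of per-object transforms; M_agg starts as the
    identity matrix (n = number of transforms averaged so far)."""
    return (n * M_agg + M_new) / (n + 1)
```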
Results
The data used consisted of 10,000 objects tracked over a period of 72 hours of continuous operation. Evaluation was performed on an unseen, ground-truthed 20-minute sequence containing 300 instances of people tracked for more than 1 second.
Conclusions
Derived the main entry/exit areas of each camera probabilistically using incremental learning. Simultaneously, the inter-camera color variations are learned. Together these allow people to be tracked between spatially separated, uncalibrated cameras with up to 81% accuracy. No a priori information is used and the learning is unsupervised. This fulfills the three ideals of working immediately, improving performance as more data is accumulated, and adapting to changes.
Let’s break it!
Adding / taking off layers of clothing of different colors would fool the object descriptor.
Take irregular paths. The link probabilities are built following the average paths.
Move really slowly, so that you blend in with the background. Anything else? (An invisibility cloak?)
Kalman Filter
In 1960, Rudolf E. Kalman published a recursive solution to the discrete-data linear filtering problem.
The Kalman filter is a set of mathematical equations implementing a predictor-corrector estimator that is optimal in the sense that it minimizes the estimated error covariance. It is used extensively in computer graphics and vision for tracking. See Greg Welch and Gary Bishop (www.cs.unc.edu/~welch/kalman).
Kalman Filter
We are trying to estimate the state x of a discrete-time controlled process governed by the linear stochastic difference equation

x_k = A x_{k-1} + B u_{k-1} + w_{k-1}

with a measurement

z_k = H x_k + v_k

where w and v are the process and measurement noise.
Kalman Filter
The Kalman filter estimates the system state at some time and then obtains feedback in the form of (noisy) measurements. The filter equations fall into two categories:
(1) Time update equations: responsible for projecting the current state and error covariance estimates forward in time to obtain the a priori estimates for the next time step.
(2) Measurement update equations: responsible for measurement feedback, incorporating a new measurement into the a priori estimate to obtain an improved a posteriori estimate.
Kalman Filter
Time update = predictor. Measurement update = corrector.
Kalman Filter – Time Update
x_k^- = A x_{k-1} + B u_{k-1}
P_k^- = A P_{k-1} A^T + Q

P is the error covariance estimate and Q is the process noise covariance.
Kalman Filter – Measurement Update
K_k = P_k^- H^T (H P_k^- H^T + R)^(-1)
x_k = x_k^- + K_k (z_k - H x_k^-)
P_k = (I - K_k H) P_k^-

K is the Kalman gain matrix and R is the measurement noise covariance.
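The two update steps can be combined into a minimal constant-velocity filter for a 2-D centroid, the state tracked in this system (the noise covariances Q and R below are illustrative values, not taken from the paper):

```python
import numpy as np

class CentroidKalman:
    """Minimal constant-velocity Kalman filter for a 2-D centroid.
    State x = [px, py, vx, vy]; Q and R values are illustrative."""
    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.A = np.eye(4)
        self.A[0, 2] = self.A[1, 3] = dt           # position += velocity * dt
        self.H = np.zeros((2, 4))
        self.H[0, 0] = self.H[1, 1] = 1.0          # we measure position only
        self.Q = q * np.eye(4)
        self.R = r * np.eye(2)
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):                             # time update (predictor)
        self.x = self.A @ self.x
        self.P = self.A @ self.P @ self.A.T + self.Q
        return self.x[:2]

    def correct(self, z):                          # measurement update (corrector)
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

Fed a centroid moving at constant velocity, the estimates converge to the true position and velocity.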
Extended Kalman Filter (EKF)
The Kalman filter estimates the state of a stochastic process that is linear, but many applications are non-linear! Nothing to fear: we have the EKF. The EKF is a Kalman filter that linearizes about the current mean and covariance.
Kalman Filter
Because the predicted state is based only on the previous state, the filter is efficient in both computation and memory usage, and many implementations are available; OpenCV has one.
Oh, and the state being tracked is the centroid of each object…
Questions
Examples didn’t show occlusion during inter-camera tracking. How well does this system work with occlusion and tracking multiple people simultaneously?
What other descriptors in addition to color histograms can be used to describe the tracked objects?
Are there better performing intra-camera tracking techniques besides Kalman filter?