Multiple-shot Person Re-identification by HPE signature
Loris Bazzani*, Marco Cristani*†, Alessandro Perina*,
Michela Farenzena*, Vittorio Murino*†
*Computer Science Department, University of Verona, Italy
†Istituto Italiano di Tecnologia (IIT), Genova, Italy
This research is funded by the EU project FP7 SAMURAI, grant FP7-SEC-2007-01 No. 217899
Analysis of the problem (1)
• Person Re-identification: Recognizing an individual in
diverse locations over different (non-)overlapping camera
views
[Figure: example frames at T = 1, T = 23, T = 145 and T = 222, grouped under "Different cameras" and "Same camera"]
Analysis of the problem (2)
• We focus on the problem with non-overlapping cameras
• Problems in real scenarios:
– Very low resolution
– Severe occlusions
– Illumination variations
– Pedestrians with very similar clothes
– Pose and view-point changes
– No geometry of the environment
• Solution:
- Histogram Plus Epitome (HPE) descriptor, and
- Multiple-shot approach
Outline
Overview of the proposed method
Pre-processing: Background Subtraction
“Images selection” for Multiple-shot
HPE descriptor
- Global descriptor
- Local descriptors
HPEs’ Matching
Results
Conclusions
Overview of the proposed method
• Employing global and local appearance-based features
• Exploiting temporal consistency to make the descriptor robust
Background Subtraction
We employ a novel generative model: STEL [Jojic et al. 2009]
Capture the structure of an image class as a mixture of component
segmentations
Isolate meaningful parts that exhibit tight feature distributions
Learned Mixture Components
“Images selection” for Multiple-shot
Objective: discard redundant information and images with occlusions
Gaussian Mixture Models Clustering [Figueiredo and Jain 2002] of HSV
histograms
Automatic model selection employing the Bayesian Information Criterion
[Figueiredo and Jain 2002]
Discard the clusters with low number of instances
Keep a random instance for each cluster
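The last two steps (discarding small clusters, then keeping one random instance per remaining cluster) can be sketched as follows. This is a minimal illustration that assumes the cluster labels were already produced by the GMM clustering; the function name `select_key_images`, the `min_size` threshold and the fixed seed are hypothetical choices, not values given in the slides.

```python
import random
from collections import defaultdict


def select_key_images(labels, min_size=3, seed=0):
    """Return one randomly chosen image index per sufficiently large cluster.

    labels[i] is the cluster id of image i (assumed to come from the GMM
    clustering of HSV histograms). min_size is a hypothetical threshold
    for what counts as a "low number of instances".
    """
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    rng = random.Random(seed)
    selected = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) < min_size:
            continue  # small cluster: treated as occlusion or outlier
        selected.append(rng.choice(members))
    return selected


# Toy labels: cluster 1 has a single instance and is discarded.
labels = [0, 0, 0, 1, 0, 2, 2, 2, 2]
key_images = select_key_images(labels)
```

With these toy labels, one index is returned for cluster 0 and one for cluster 2, while image 3 (the only member of cluster 1) is never selected.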
Examples of discarded images:
HPE descriptor: Global feature
• Capture chromatic global information
36-dimensional HSV histogram (H=16, S=16, V=4)
[Figure annotation: "Caused by illumination changes"]
Average the histograms of the multiple instances
Robust to illumination and pose variations, keeping the predominant chromatic information only
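A minimal sketch of the global feature, assuming the 36 dimensions come from concatenating per-channel histograms (16 + 16 + 4 = 36) and that each image is given as a list of foreground (r, g, b) pixels in [0, 1]. The function names and the normalization of the whole vector to sum to 1 are illustrative assumptions, not details from the slides.

```python
import colorsys

BINS = (16, 16, 4)  # H, S, V bins from the slide: 16 + 16 + 4 = 36


def hsv_histogram(pixels):
    """36-bin histogram of an image given as (r, g, b) tuples in [0, 1]."""
    hist = [0.0] * sum(BINS)
    offsets = (0, BINS[0], BINS[0] + BINS[1])
    for r, g, b in pixels:
        channels = colorsys.rgb_to_hsv(r, g, b)
        for value, bins, offset in zip(channels, BINS, offsets):
            # Clamp so that a value of exactly 1.0 falls in the last bin.
            hist[offset + min(int(value * bins), bins - 1)] += 1.0
    total = sum(hist)
    # Assumption: normalize the concatenated vector to sum to 1.
    return [count / total for count in hist] if total else hist


def average_histogram(histograms):
    """Element-wise mean of the per-image histograms."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]


# Two toy "images" of the same person, two pixels each.
image_a = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.1)]
image_b = [(0.0, 0.0, 1.0), (0.1, 0.1, 0.8)]
global_feature = average_histogram(
    [hsv_histogram(image_a), hsv_histogram(image_b)]
)
```

Averaging normalized histograms keeps the result normalized, so the global feature can be compared directly with a histogram distance.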
HPE descriptor: Local feature (1)
Epitome [Jojic et al. 2003]: generative model that analyzes the
presence of recurrent, structured local patterns
[Figure: Generic Epitome and Local Epitome]
HPE descriptor: Local feature (2)
Generic Epitome: 36-dimensional HSV histogram of the epitome
Local Epitome:
Keep the patches with a high probability that a patch in the epitome having (i, j) as its upper-left corner represents several ingredient patches
Discard the patches with low entropy
Extract a 36-dimensional HSV histogram of the surviving patches
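The entropy-based filtering step can be illustrated as follows. This is a sketch under stated assumptions: each patch is a flat list of values in [0, 1], entropy is the Shannon entropy of a quantized value histogram, and `n_bins` and `min_entropy` are hypothetical parameters. The slides do not say how the entropy is computed or thresholded.

```python
import math


def patch_entropy(patch, n_bins=8):
    """Shannon entropy (in bits) of the quantized values of one patch."""
    counts = [0] * n_bins
    for value in patch:
        counts[min(int(value * n_bins), n_bins - 1)] += 1
    total = len(patch)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def keep_informative_patches(patches, min_entropy=1.0):
    """Discard patches whose entropy falls below the (assumed) threshold."""
    return [p for p in patches if patch_entropy(p) >= min_entropy]


flat = [0.5] * 16                    # uniform patch, entropy 0
textured = [0.1, 0.4, 0.6, 0.9] * 4  # four equally frequent levels, entropy 2
survivors = keep_informative_patches([flat, textured])
```

Here the uniform patch is dropped and only the textured one survives; a histogram would then be extracted from the surviving patches.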
HPEs’ Matching
Re-identification: associating each element in the probe set B
to the corresponding element in the gallery set A
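Minimize the following distance:
[Equation not preserved in the transcript]
where [symbol lost] is the Bhattacharyya distance and [remainder of the sentence lost in extraction]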
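Since only the use of the Bhattacharyya distance is stated, the following is a hedged sketch of how the matching could work, not the authors' formula. It assumes the form sqrt(1 - sum(sqrt(p_i * q_i))) for the Bhattacharyya distance (one common definition), a signature made of three histograms (global, generic epitome, local epitome), and an unweighted sum of the three distances. All function names and the 3-bin toy histograms are hypothetical.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalized histograms."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - coefficient))


def signature_distance(sig_a, sig_b):
    """Unweighted sum over corresponding histograms (an assumption)."""
    return sum(bhattacharyya(h_a, h_b) for h_a, h_b in zip(sig_a, sig_b))


def match(probe, gallery):
    """Map each probe id to the gallery id with the smallest distance."""
    return {
        probe_id: min(gallery, key=lambda g: signature_distance(sig, gallery[g]))
        for probe_id, sig in probe.items()
    }


# Toy signatures: (global, generic epitome, local epitome) histograms.
gallery = {
    "A1": ([0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.5, 0.4, 0.1]),
    "A2": ([0.1, 0.2, 0.7], [0.1, 0.3, 0.6], [0.2, 0.2, 0.6]),
}
probe = {"B1": ([0.1, 0.3, 0.6], [0.2, 0.2, 0.6], [0.1, 0.3, 0.6])}
result = match(probe, gallery)
```

In this toy example the probe `B1` is associated with gallery element `A2`, whose histograms are closest under the assumed distance.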
Results (1)
iLIDS dataset:
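- Multiple images of 119 pedestrians, 128x64 pixels each
- Comparison with the context-based method [Zheng et al. 2009]
- Cross-validation: 10 trials for SvsS (single-shot vs. single-shot), 100 trials for MvsS/MvsM (multiple-shot vs. single-shot / multiple-shot vs. multiple-shot)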
Results (2)
ETHZ dataset:
- Three datasets of 83, 35 and 28 pedestrians, 64x32 pixels each
- Comparison with the Partial Least Squares (PLS) method [Schwartz and Davis 2009]
- Cross-validation: same settings as for iLIDS
Results (3)
How many images do we need to perform a “good” person re-identification?
N = number of images for the multiple-shot approach
N = 5 seems to be the best trade-off
Conclusions
We proposed a novel descriptor for the person re-identification problem: the HPE descriptor
The descriptor is robust to low resolution, occlusions, illumination variations, pedestrians with very similar clothes, and pose changes
It is based on the accumulation of images to gain robustness
The person re-identification problem is still far from being solved
The results suggest that further improvements are achievable
References
[Jojic et al. 2009] N. Jojic, A. Perina, M. Cristani, V. Murino, and B. Frey, “Stel component analysis: Modeling spatial correlations in image class structure,” in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2044–2051.
[Figueiredo and Jain 2002] M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. PAMI, vol. 24, no. 3, pp. 381–396, 2002.
[Jojic et al. 2003] N. Jojic, B. J. Frey, and A. Kannan, “Epitomic analysis of appearance and shape,” in IEEE International Conference on Computer Vision. Washington, DC, USA: IEEE Computer Society, 2003, p. 34.
[Schwartz and Davis 2009] W. Schwartz and L. Davis, “Learning discriminative appearance-based models using partial least squares,” in XXII SIBGRAPI, 2009.
[Zheng et al. 2009] W. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in BMVC, 2009.