Transcript Slide 1

Personal Memory Assistant
Abstract
Facial recognition and speaker verification systems are widely used in the security field, where they must be highly accurate to prevent unauthorized users from accessing classified information. The many possible commercial applications of these systems, however, remain largely untapped.
It is often difficult to remember the name of a person encountered out of context or only infrequently. The situation can be very embarrassing for the forgetful person and insulting to the person who is not remembered. The Personal Memory Assistant uses facial recognition and speaker verification to help avoid this situation.
Overall Flow Chart
[Flow chart: blocks labeled Mic, Audio Database, Camera, Image Database, Speaker Identification, Facial Recognition, Compare, Profile Database, Entry Form, Search, Update, Delete, and Display.]
Facial Recognition System
The facial recognition system is divided into the following two components:
•Detection
This isolates the face using a trained cascade of boosted classifiers based on spatial contrasts. Once isolated, the face is aligned, normalized for lighting, and masked to obscure the background. Fig. 1 shows an image of this process.
•Recognition
Covariance matrices are calculated from the difference between the captured face and the mean of a set of one subject's stored faces. The eigenface algorithm uses Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to compare the collected faces with those already stored for each subject (brief code sketches of both steps follow this list).
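The poster does not include the detection code itself. As a minimal sketch of the detection step described above, the snippet below uses OpenCV's pretrained frontal-face Haar cascade as a stand-in for the trained cascade of boosted classifiers; the output size and the elliptical background mask are assumptions.

```python
# Sketch of the detection step (stand-in implementation, not the PMA code).
import cv2
import numpy as np

# Pretrained cascade of boosted classifiers shipped with OpenCV (assumed stand-in).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_and_normalize(image_bgr, size=(92, 112)):
    """Isolate the largest face, normalize lighting, and mask the background."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # Keep the largest detection (assumes one subject per capture).
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    face = cv2.resize(gray[y:y + h, x:x + w], size)   # crop and align to a fixed size
    face = cv2.equalizeHist(face)                     # lighting normalization
    # Elliptical mask to obscure the background (mask shape is an assumption).
    mask = np.zeros(size[::-1], dtype=np.uint8)
    cv2.ellipse(mask, (size[0] // 2, size[1] // 2),
                (size[0] // 2, size[1] // 2), 0, 0, 360, 255, -1)
    return cv2.bitwise_and(face, face, mask=mask)
```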
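The recognition step can likewise be sketched with PCA alone; the LDA refinement and the exact covariance computation used by the PMA are not reproduced here, and the gallery format, component count, and Euclidean distance are assumptions.

```python
# Eigenface-style comparison sketch (PCA only; the LDA step is omitted).
import numpy as np

def train_eigenfaces(gallery, num_components=20):
    """gallery: dict of subject ID -> list of equal-size face images (2-D arrays)."""
    ids, vectors = [], []
    for subject, faces in gallery.items():
        for face in faces:
            ids.append(subject)
            vectors.append(face.astype(np.float64).ravel())
    X = np.vstack(vectors)                    # one flattened face per row
    mean_face = X.mean(axis=0)
    centered = X - mean_face
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = Vt[:num_components]          # principal components of the face set
    projections = centered @ eigenfaces.T     # stored faces in eigenspace
    return ids, mean_face, eigenfaces, projections

def top_matches(face, ids, mean_face, eigenfaces, projections, k=3):
    """Return the k closest subject IDs for a captured, normalized face."""
    query = (face.astype(np.float64).ravel() - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(projections - query, axis=1)
    best = {}                                 # smallest distance seen per subject
    for subject, d in zip(ids, dists):
        best[subject] = min(d, best.get(subject, np.inf))
    return sorted(best, key=best.get)[:k]
```

With the detection sketch above, top_matches(detect_and_normalize(frame), ...) would produce the three closest facial matches referred to in the system overview.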
A user discreetly collects images and voice samples of the person to be identified. The facial recognition component analyzes the image to identify the three closest facial matches in the system. The speaker identification component does the same to identify the top two voice matches. The guesses are compared using an algorithm that was developed through testing (sketched below). If the guesses match, a picture of the person and a personal profile are displayed to the user. If no match is made, the user has the option to add the subject to the database. In addition to the identification process, the system also gives the user the option of searching for and updating entries in the database.
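The exact comparison algorithm and its weights are not given on the poster; the sketch below only illustrates the general idea of intersecting the top facial and voice guesses, with the preference for the facial ranking as an assumption.

```python
# Hedged illustration of the guess-comparison step; the real weighting developed
# through testing is not reproduced here.
def compare_guesses(face_top3, voice_top2):
    """face_top3, voice_top2: lists of subject IDs, best match first.

    Returns the agreed-upon subject ID, or None to signal "No Match"
    (at which point the user may add the subject to the database)."""
    for subject in face_top3:          # walk the facial ranking first (assumption)
        if subject in voice_top2:
            return subject
    return None

# Example: face guesses [7, 2, 5] and voice guesses [2, 9] agree on subject 2.
print(compare_guesses([7, 2, 5], [2, 9]))    # -> 2
```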
Group 7
Authors
Scott Kyle CTE ’08
Erika Sanchez EE ’08
Meredith Skolnick CTE ’08
Advisor
Dr. Kenneth Laker
University of Pennsylvania
Dept. of Electrical and Systems Engineering
Speaker Identification System
The speech wave goes through the following three major
processing steps:
•Preprocessing
This normalizes the sample to an amplitude range of [-1, 1].
Fig. 1 – Face Detection
Fig. 2 – Test results: correct ID weight vs. number of WAV files in the database
•Feature Extraction
Spectral values are saved from a Fast Fourier
Transform of the signal. The cluster of these values
for all samples of an ID is called the voiceprint.
•Pattern Matching
A Nearest Neighbor algorithm with Euclidean distances is used to compare each new sample to all existing voiceprints. The PMA system then decides whether this sample should be saved as the voiceprint of a new subject or added to an existing subject's voiceprint (a sketch of all three steps follows this list).
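As a rough illustration of these three steps (not the PMA implementation), the sketch below normalizes a WAV sample, keeps low-frequency FFT magnitudes as the spectral features, and runs a Euclidean nearest-neighbour search over stored voiceprints; the SciPy WAV reader, the 64-bin feature length, and the mono conversion are assumptions.

```python
# Sketch of the speaker-identification pipeline described above (assumptions inline).
import numpy as np
from scipy.io import wavfile

def preprocess(path):
    """Load a WAV file and normalize its amplitude to the range [-1, 1]."""
    _, samples = wavfile.read(path)
    samples = samples.astype(np.float64)
    if samples.ndim > 1:                      # assume mono is wanted; average channels
        samples = samples.mean(axis=1)
    peak = np.max(np.abs(samples)) or 1.0
    return samples / peak

def extract_features(samples, num_bins=64):
    """Keep low-frequency spectral magnitudes from an FFT of the signal."""
    spectrum = np.abs(np.fft.rfft(samples))[:num_bins]
    return spectrum / (np.linalg.norm(spectrum) or 1.0)

def nearest_speaker(sample_features, voiceprints):
    """voiceprints: dict of subject ID -> list of stored feature vectors.

    Nearest-neighbour search with Euclidean distance over all stored samples;
    the caller then decides whether to create a new subject or extend a voiceprint."""
    best_id, best_dist = None, np.inf
    for subject, vectors in voiceprints.items():
        for v in vectors:
            d = np.linalg.norm(sample_features - v)
            if d < best_dist:
                best_id, best_dist = subject, d
    return best_id, best_dist
```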
System Testing
•Overall Testing
-A subject pool was gathered with demographics similar to those of the US population
-Each subject was added to the database and then tested three times
-All identification results were stored for later analysis.
•WAV number testing
-The same testing sample was used to test the Speaker Identification accuracy with an increasing number of samples in the database.
-The results are shown in Fig. 2. Weights range from 0 to 1.0.
-Vertical lines indicate thresholds for manipulating speaker or voice weights.
System Performance
•Comparison formula
83.0% Correct Identification
4.3% Incorrect Identification
12.7% “No Match” when in system