Dublin City University
Centre for Digital Video Processing
SenseCam Work at
Dublin City University
Alan F. Smeaton, Gareth J.F. Jones and Noel E. O’Connor (PIs)
Georgina Gaughan, Cathal Gurrin, Hyowon Lee, Hervé Le Borgne
(PostDocs)
Aiden Doherty, Michael Blighe, Ciarán Ó’Conaire, Michael McHugh,
Saman Cooray (PhD students)
Barry Lavelle, Paul Reynolds (Masters students)
Sandrine Áime (Summer student)
… 15 people working on SenseCams in some way at DCU
Centre for Digital Video Processing,
Dublin City University, Ireland
Overview
• Our contribution to developing SenseCam work;
• Automatic event segmentation - 3 approaches;
• Application: generation of rolling weekly
summary based on Addenbrooke’s
• Face detection and body patch matching
– Arizona data
• Using Bluetooth (BT) and other sensors for context
• An alternative way of presenting SenseCam images
Our (DCU) Contribution
• We do image/video analysis, indexing,
summarisation, etc. and we apply this to
SenseCam data;
• We have no particular SenseCam application,
we will develop underlying technology;
• We’re keen to hear about the real problems of
SenseCams in practice, and to offer …
• We take a typical full day of SenseCam
images and do event segmentation and
summarisation;
A day’s SenseCam images (3,000 – 4,000)
=> Event Segmentation => Multiple Events => Summarisation
Example events:
• Finishing work in the lab
• At the bus stop
• Chatting at Skylon Hotel lobby
• Moving to a room
• Tea time
• On the way back home
Automatic Event Segmentation
• Task: automatically determine events from a
collection of SenseCam image data;
• Based around image-image similarity using
MPEG-7 features where differences may
indicate events;
• Similar problem to shot boundary detection in video
but more challenging given the fish-eye view
and the lower similarity within an event than within a shot;
• Several approaches can be taken:
Similarity Calculation between 2 Images
For each of the two images, extract MPEG-7 descriptors:
• Scalable Colour
• Colour Structure
• Colour Layout
• Colour Moments
• Edge Histogram
• Homogeneous Texture
• ...
Compare the descriptors of the two images => Similarity Score
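The comparison of two images can be sketched as follows. This is only an illustration: the slides do not say how descriptor distances are turned into a score, so the distance measure, the `1 / (1 + distance)` mapping, the equal default weights and the toy vectors are all assumptions, and real MPEG-7 extraction is not shown.

```python
from math import sqrt


def descriptor_similarity(a, b):
    """Map the Euclidean distance between two equal-length vectors to (0, 1]."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    distance = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def image_similarity(desc_a, desc_b, weights=None):
    """Weighted mean of per-descriptor similarities (assumed fusion rule)."""
    names = sorted(set(desc_a) & set(desc_b))
    if not names:
        raise ValueError("no descriptors in common")
    weights = weights or {}
    total = sum(weights.get(n, 1.0) for n in names)
    return sum(
        weights.get(n, 1.0) * descriptor_similarity(desc_a[n], desc_b[n])
        for n in names
    ) / total


# Toy vectors standing in for extracted descriptors (not real MPEG-7 output).
img1 = {"edge_histogram": [0.2, 0.5, 0.3], "colour_layout": [0.9, 0.1]}
img2 = {"edge_histogram": [0.2, 0.5, 0.3], "colour_layout": [0.9, 0.1]}
img3 = {"edge_histogram": [0.9, 0.0, 0.1], "colour_layout": [0.1, 0.8]}
```

Identical descriptor sets score 1.0, and the score falls towards 0 as the descriptors diverge.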
Event Segmentation: Approach I
One Day’s Images
For each image, extract MPEG-7 descriptors:
• Scalable Colour
• Colour Structure
• Colour Moments
• Edge Histogram
... to compare similarity between:
• adjacent images
• adjacent blocks of 10 images
• images pairwise
Example similarity scores from the figure: 0.8, 0.65, 0.7, 0.15; 0.91, 0.7, 0.74, 0.15; 0.65, 0.82, 0.92
=> Event-segmented images of a day
• Stage 1:
– Comparison of adjacent images
• Stage 2:
– Comparison of every 2nd image
• Stage 3:
– Comparison of blocks of images
– Incorporation of a face detector
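A minimal sketch of the three comparison stages, assuming each image has already been reduced to a numeric feature vector. The toy similarity function, the threshold of 0.5 and the function names are hypothetical; the face detector of Stage 3 is not included.

```python
def similarity(a, b):
    """Toy similarity in (0, 1]: 1 / (1 + L1 distance)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def pairwise_scores(features, gap=1):
    """Compare image i with image i + gap (gap=1: adjacent, gap=2: every 2nd)."""
    return [
        (i + gap, similarity(features[i], features[i + gap]))
        for i in range(len(features) - gap)
    ]


def block_scores(features, block=10):
    """Compare the averaged block before index i with the averaged block from i on."""
    return [
        (i, similarity(mean_vector(features[i - block:i]),
                       mean_vector(features[i:i + block])))
        for i in range(block, len(features) - block + 1)
    ]


def boundaries(scored, threshold=0.5):
    """Indices whose similarity falls below the (assumed) threshold."""
    return [index for index, score in scored if score < threshold]


# Ten toy images: the scene changes between image 4 and image 5.
features = [[0.0]] * 5 + [[5.0]] * 5
```

With the toy data, `boundaries(pairwise_scores(features))` flags index 5 only, while block comparison flags a short run of indices around the change, with the lowest score at index 5.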
Preliminary Results: Images from 1 day
Number of pictures: 2685
Manually detected events: 27
Descriptor         Correct events automatically identified    Precision
Color Moment       14                                         0.07
Edge Histogram     15                                         0.11
Color Structure    17                                         0.07
Scalable Color     18                                         0.04
Lots more to do, including fusion of descriptors and optimising windowing
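One way to read these figures, assuming precision means correct events divided by the number of boundaries proposed (the slide does not define it). Under that assumption the low precision suggests heavy over-segmentation. The implied counts are rough, since the precision values are rounded to two decimals.

```python
MANUAL_EVENTS = 27  # manually detected events for this day (from the slide)

# (correct events, precision) per descriptor, as listed on the slide
RESULTS = {
    "Color Moment": (14, 0.07),
    "Edge Histogram": (15, 0.11),
    "Color Structure": (17, 0.07),
    "Scalable Color": (18, 0.04),
}


def recall(correct, relevant):
    """Fraction of the manually marked events that were found."""
    return correct / relevant


def implied_detections(correct, precision):
    """Rough number of boundaries proposed, assuming precision = correct / proposed."""
    return round(correct / precision)


for name, (correct, precision) in RESULTS.items():
    print(name, round(recall(correct, MANUAL_EVENTS), 2),
          implied_detections(correct, precision))
```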
Event Segmentation II
• Use similarity clustering, and time
– Combine low-level content analysis and
context information (i.e. metadata provided by
the SenseCam and temporal data)
– Generate a similarity matrix by fusing low-level and metadata information
– Implement time constraints to constrain
clustering
– Simple hierarchical clustering of images into
events
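A minimal sketch of these steps, under stated assumptions: the fusion is a weighted sum of two similarity matrices, and the time constraint is modelled by letting only temporally adjacent segments merge. The slides give neither the fusion rule nor the exact constraint, so both are illustrative choices, and the function names are hypothetical.

```python
def fuse(visual, context, w_visual=0.5):
    """Weighted sum of two same-shaped similarity matrices (lists of lists)."""
    w_context = 1.0 - w_visual
    return [
        [w_visual * v + w_context * c for v, c in zip(vrow, crow)]
        for vrow, crow in zip(visual, context)
    ]


def segment_similarity(sim, seg_a, seg_b):
    """Average similarity over all image pairs drawn from two segments."""
    total = sum(sim[i][j] for i in seg_a for j in seg_b)
    return total / (len(seg_a) * len(seg_b))


def cluster_into_events(sim, n_events):
    """Agglomerative clustering in which only neighbouring segments may merge."""
    segments = [[i] for i in range(len(sim))]  # images are in time order
    while len(segments) > n_events:
        scores = [
            segment_similarity(sim, segments[k], segments[k + 1])
            for k in range(len(segments) - 1)
        ]
        k = scores.index(max(scores))  # most similar pair of neighbours
        segments[k:k + 2] = [segments[k] + segments[k + 1]]
    return segments


# Toy 4-image matrix: images 0-1 look alike, images 2-3 look alike.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
```

Calling `cluster_into_events(sim, n)` with different values of `n` gives coarser or finer segmentations of the same day.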
Event Segmentation: Approach II
One Day’s Images
For each image, extract MPEG-7 descriptors + GPS meta-data ...
• Scalable Colour
• Colour Layout
• Edge Histogram
• Homogeneous Texture
• Light
• Temperature
• Accelerometer
... to calculate similarity among images => Similarity matrix
Then apply temporal constraints...
... to vary the number of Events:
• 1 Event (whole set as 1 Event)
• 2 Events
• 4 Events
• 8 Events
• ...
=> Event-segmented images of a day (shown for 2, 4 and 8 Events)
Approach II: Results
Approach III: Group Images into 3 Classes
• Static Person
– Person performing one activity
– E.g. at computer, meeting, eating etc.
• Moving Person
– Travelling between locations
• Static Camera
– SenseCam is put down
– User is not wearing it
Features Used
1. Block-based Cross-Correlation
2. Spatiogram image colour similarity
• Compares image colour spatial distribution
3. Accelerometer motion
• Feature-based training
• Using Bayesian approach to classification
• Viterbi algorithm used to smooth results
• Applied to 1 day of SenseCam images so far
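The smoothing step can be sketched like this. It assumes the classifier outputs a probability for each of the three classes per image, and uses those probabilities as emission scores with a simple "sticky" transition model. The `stay` value and the abbreviations SP, MP, SC used as state names are assumptions for illustration, not the parameters used in this work.

```python
from math import log

STATES = ("SP", "MP", "SC")  # Static Person, Moving Person, Static Camera


def viterbi_smooth(posteriors, stay=0.9):
    """Most likely label sequence given per-image class probabilities.

    posteriors: list of dicts mapping every state to a probability > 0.
    stay: assumed probability of keeping the same label between images.
    """
    switch = (1.0 - stay) / (len(STATES) - 1)

    def trans(a, b):
        return log(stay if a == b else switch)

    score = {s: log(posteriors[0][s]) for s in STATES}
    back = []
    for obs in posteriors[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + trans(p, s))
            score[s] = prev[best] + trans(best, s) + log(obs[s])
            pointers[s] = best
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


sure_sp = {"SP": 0.8, "MP": 0.1, "SC": 0.1}
sure_mp = {"SP": 0.1, "MP": 0.8, "SC": 0.1}
noisy = {"SP": 0.4, "MP": 0.5, "SC": 0.1}
```

A single weakly classified "MP" image between confident "SP" images is relabelled as "SP", while a sustained change of class is kept.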
Event Segmentation: Approach III
One Day’s Images
For adjacent images, calculate:
• Accelerometer (motion)
• Block-based Cross-correlation
• Spatiogram Similarity
Classify each image into 3 groups (Bayesian classification):
• Static Camera (SC)
• Moving Person (MP)
• Static Person (SP)
... then Smoothing (Viterbi algorithm), e.g. SP MP SP MP SP SC
=> Event-segmented (& classified) images of a day
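Once every image carries a smoothed class label, events can be read off as runs of identical labels. A small sketch, assuming one label per image in time order; the tuple layout is a hypothetical choice.

```python
from itertools import groupby


def labels_to_events(labels):
    """Collapse runs of identical labels into (label, first_index, last_index)."""
    events = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        events.append((label, start, start + length - 1))
        start += length
    return events


labels = ["SP", "SP", "MP", "MP", "MP", "SP", "SC", "SC"]
events = labels_to_events(labels)
```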
Accelerometer Data Example
Generation of Weekly Summaries
• Assume events are already segmented;
• For each event, calculate the average value of each
low-level feature over all of its images;
• Generate a similarity matrix using the average
values from each event;
• Visually similar events can then be detected,
and the time period (week) structured
automatically into a short movie;
• Why a weekly movie … Addenbrooke’s
Cambridge application;
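These steps can be sketched as follows, assuming each image is already a numeric feature vector. The similarity function and the 0.8 threshold are illustrative assumptions; the slides do not state which measure or cut-off is used.

```python
def event_signature(image_features):
    """Average each low-level feature over all images of one event."""
    n = len(image_features)
    return [sum(col) / n for col in zip(*image_features)]


def similarity(a, b):
    """Toy similarity in (0, 1]: 1 / (1 + L1 distance)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))


def similarity_matrix(signatures):
    """Event-level similarity matrix."""
    return [[similarity(a, b) for b in signatures] for a in signatures]


def split_similar_and_unique(matrix, threshold=0.8):
    """Events with at least one other event above the threshold are 'similar'."""
    similar, unique = [], []
    for i, row in enumerate(matrix):
        has_match = any(s >= threshold for j, s in enumerate(row) if j != i)
        (similar if has_match else unique).append(i)
    return similar, unique


# Three toy events, each a list of per-image feature vectors.
week_events = [
    [[1.0, 2.0], [3.0, 2.0]],
    [[2.0, 2.0]],
    [[9.0, 9.0], [7.0, 9.0]],
]
signatures = [event_signature(e) for e in week_events]
similar, unique = split_similar_and_unique(similarity_matrix(signatures))
```

In the toy data the first two events have the same average features and are grouped as similar, while the third has no close match and is treated as unique.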
Generation of Weekly Summary
Event-segmented image sets for each day: Mon, Tue, Wed, Thu, Fri, Sat, Sun
Compare Event-Event similarity within a week => Event-level Similarity matrix
Clustering of similar Events, e.g.:
• Similar Events - Aiden working at the desk
• Similar Events - Aiden waiting for bus
• Similar Events - Aiden in the office corridor
• Unique Events (Unique Event 1 ... Unique Event 6)
Select images => 1 Week summary (on Sunday)
As each new day is added (Mon, Tue, Wed, ...), the comparison and selection are repeated:
• Select images (on Monday)
• Select images (on Tuesday)
• Select images (on Wednesday)
• ...
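The rolling behaviour can be illustrated with a sliding 7-day window. This sketch assumes the summary generated on a given day covers that day and the six days before it; the window length, the function names and the data layout are assumptions for illustration.

```python
from datetime import date, timedelta


def week_window(today, days=7):
    """Dates covered by the summary generated on `today`, oldest first."""
    return [today - timedelta(days=offset) for offset in range(days - 1, -1, -1)]


def events_for_summary(events_by_day, today, days=7):
    """Collect the event ids of every day inside the current window."""
    selected = []
    for day in week_window(today, days):
        selected.extend(events_by_day.get(day, []))
    return selected


# Hypothetical event ids keyed by day.
events_by_day = {
    date(2006, 5, 28): ["old"],
    date(2006, 5, 29): ["a"],
    date(2006, 6, 4): ["b"],
    date(2006, 6, 5): ["c"],
}
sunday = date(2006, 6, 4)
monday = sunday + timedelta(days=1)
```

Moving from Sunday to Monday drops the oldest day from the window and adds the new one, so the similarity comparison and image selection run again on the updated set of events.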
Preliminary Results
Number of similar images to a known event, from top 10 retrieved
EVENT                  COLOUR LAYOUT   SCALABLE COLOUR   HOMOGENEOUS TEXTURE   EDGE HISTOGRAM
Working in office      5 (50%)         5 (50%)           4 (40%)               10 (100%)
Walking                5 (50%)         9 (90%)           4 (40%)               9 (90%)
Meeting colleague(s)   9 (90%)         5 (50%)           8 (80%)               5 (50%)
Shopping               1 (10%)         4 (40%)           0 (0%)                7 (70%)
Meal at home           4 (40%)         4 (40%)           5 (50%)               6 (60%)
At coffee machine      6 (60%)         6 (60%)           4 (40%)               3 (30%)
On bus                 3 (30%)         3 (30%)           3 (30%)               1 (10%)
Lunch at work          0 (0%)          2 (20%)           0 (0%)                1 (10%)
In bar                 2 (20%)         2 (20%)           1 (10%)               2 (20%)
Giving lecture         1 (10%)         1 (10%)           1 (10%)               2 (20%)
Average                3.6 (36%)       4.1 (41%)         3.0 (30%)             4.6 (46%)
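The averages can be recomputed from the per-event counts, which also shows that no single descriptor is best for every event. The counts below are taken from the results above; the column order (Colour Layout, Scalable Colour, Homogeneous Texture, Edge Histogram) is assumed from the order of the headings.

```python
DESCRIPTORS = ("Colour Layout", "Scalable Colour",
               "Homogeneous Texture", "Edge Histogram")

# Similar images among the top 10 retrieved, per event and descriptor.
HITS = {
    "Working in office": (5, 5, 4, 10),
    "Walking": (5, 9, 4, 9),
    "Meeting colleague(s)": (9, 5, 8, 5),
    "Shopping": (1, 4, 0, 7),
    "Meal at home": (4, 4, 5, 6),
    "At coffee machine": (6, 6, 4, 3),
    "On bus": (3, 3, 3, 1),
    "Lunch at work": (0, 2, 0, 1),
    "In bar": (2, 2, 1, 2),
    "Giving lecture": (1, 1, 1, 2),
}


def average_hits(hits):
    """Mean number of similar images per descriptor (column-wise mean)."""
    rows = list(hits.values())
    return [sum(col) / len(rows) for col in zip(*rows)]


def best_descriptor(hits):
    """Best-scoring descriptor per event; ties go to the first column."""
    return {
        event: DESCRIPTORS[row.index(max(row))]
        for event, row in hits.items()
    }


averages = average_hits(HITS)
best = best_descriptor(HITS)
```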
Face Detection & Body Patch Matching
• Apply face detection software to detect
the presence of a face in the SenseCam
image
• Body Patch Matching
– Identify similar body patches by colour to detect
subsequent appearances within an event;
• This works well for personal photos, but
SenseCam images are lower quality;
Similarity Comparison by Person Detection
Two images compared: 5:03pm, 30 May 2006 and 8:28am, 7 June 2006
• Face Extraction from each image => Similarity Score
• Body Patch Extraction from each image => Similarity Score
• Both scores => Combined Similarity Score
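A sketch of how a body patch score and a combined score might be computed. Histogram intersection, the equal weighting and the fallback when no face is found are all assumptions made for illustration; the slides do not specify how the two scores are obtained or combined.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two colour histograms that each sum to 1."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def combined_score(face_score, body_score, face_weight=0.5):
    """Weighted mean of the two scores; uses the body score alone if no face."""
    if face_score is None:
        return body_score
    return face_weight * face_score + (1.0 - face_weight) * body_score


# Toy 3-bin colour histograms of two body patches.
patch_a = [0.5, 0.5, 0.0]
patch_b = [0.5, 0.25, 0.25]
body = histogram_intersection(patch_a, patch_b)
```

Falling back to the body patch score is one possible way to cope with images where face detection fails.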
Arizona State U. Data
• ASU gave us some SenseCam data 2 weeks ago
• Session-based rather than all-day images;
• Applied automatic event detection using 4
MPEG-7 low-level feature descriptors
– Both Color Structure and Color Moments outperform
others
• Face Detection software performs badly on this
data
– Blurred images cause “standard” face detection
software to fail
Event detection using ASU data: 28-June-2006
Number of pictures: 357
Manually detected events: 28
Descriptor         Relevant events automatically identified    Precision
Color Moment       6                                           0.25
Edge Histogram     11                                          0.28
Color Structure    14                                          0.42
Scalable Color     18                                          0.28
Event detection using ASU data: 28-June-2006
Number of pictures: 434
Manually detected events: 11
Descriptor         Relevant landmarks automatically identified    Precision
Color Moment       6                                              0.17
Edge Histogram     7                                              0.15
Color Structure    6                                              0.12
Scalable Color     8                                              0.10
Using Bluetooth (BT) to provide context
• Achieved by logging Bluetooth devices in close
proximity to the SenseCam wearer;
• May be useful in determining which individuals
are present around each picture;
• Application created on the phone to poll for and log
nearby Bluetooth devices;
• Currently developing a host application to
interface with the mobile device and retrieve the log file;
• Next step: synchronize time-stamps between
SenseCam images and Bluetooth log file
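The synchronisation step could look roughly like this. It is a sketch under assumptions: the log is a time-sorted list of scans, each image is matched to the latest scan at or before it, and a fixed clock offset between phone and camera is known. The log layout, the 60-second limit and the device names are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def devices_near(image_time, log, max_gap=timedelta(seconds=60),
                 clock_offset=timedelta(0)):
    """Devices seen in the latest Bluetooth scan at or before the image time.

    log: list of (scan_time, set_of_device_names), sorted by scan_time.
    clock_offset: added to phone times to express them on the camera clock.
    """
    times = [t + clock_offset for t, _ in log]
    pos = bisect_right(times, image_time) - 1
    if pos < 0 or image_time - times[pos] > max_gap:
        return set()  # no scan close enough to this image
    return log[pos][1]


log = [
    (datetime(2006, 5, 31, 9, 0, 0), {"phone-A"}),
    (datetime(2006, 5, 31, 9, 1, 0), {"phone-A", "laptop-B"}),
]
```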
Use of Multi-Sensor Data
• Concept: To determine whether “events” can be
identified based on multiple sensor data
• Data collected from:
– GPS Device
– BodyMedia Device
– Heart Rate Monitor
– SenseCam
• Development of a framework to extract the relevant data
from the different data sources
– CSV files, XML files, text files, Excel files
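A small sketch of what such a framework might do for two of the formats, using only the Python standard library. The column names (`time`, `value`), the XML element name (`reading`) and the record layout are hypothetical; the real device exports are not described in the slides.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv_records(text, source):
    """Parse CSV text with 'time' and 'value' columns into common records."""
    return [
        {"source": source, "time": row["time"], "value": float(row["value"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def read_xml_records(text, source):
    """Parse <reading time="..." value="..."/> elements into common records."""
    root = ET.fromstring(text)
    return [
        {"source": source, "time": r.get("time"), "value": float(r.get("value"))}
        for r in root.iter("reading")
    ]


def merge_by_time(*record_lists):
    """Merge records from all sources, ordered by ISO 8601 timestamp string."""
    merged = [r for records in record_lists for r in records]
    return sorted(merged, key=lambda r: r["time"])
```

Bringing every source into one record shape makes it possible to line the sensor readings up against the SenseCam image timestamps.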
Presenting SenseCam Images?
E.g. intelligent summary of one day (playback for 1 minute)
... watching the fast playback of image sequences is not an ideal
interaction:
• Intensive concentration required during playback
• Event boundaries cannot be clearly presented
• Sense of time is skewed (more images of an ‘important’ event,
even if it lasted only 1 minute; fewer images of ‘unimportant’ regular
events even if they last many hours during the day)
Turn sequential playback into
an interactive, spatial browsing
interaction (similar to the way
we turn video playback into
keyframe browsing) =>
31 May 2006
Approach:
• 1-page visual summary of a day
• Each image represents one event
• Size of each image represents the
‘importance’ or ‘uniqueness’ of the
event
• Timeline on top orients the user
as to when each event
happened
• Mouse-Over activated
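The layout idea can be sketched as a mapping from a per-event uniqueness score to a thumbnail size, plus the event's position on the timeline. The pixel range, the linear scaling and the score values are assumptions for illustration only.

```python
def thumbnail_sizes(uniqueness, min_px=60, max_px=240):
    """Scale each event's image between min_px and max_px by its uniqueness."""
    low, high = min(uniqueness.values()), max(uniqueness.values())
    span = (high - low) or 1.0  # avoid dividing by zero when all scores match
    return {
        event: round(min_px + (score - low) / span * (max_px - min_px))
        for event, score in uniqueness.items()
    }


def timeline_fraction(start_minute, end_minute, day_minutes=24 * 60):
    """Start and end of an event as fractions of the day's timeline."""
    return start_minute / day_minutes, end_minute / day_minutes


sizes = thumbnail_sizes({"meeting": 1.0, "desk work": 0.0, "lunch": 0.5})
```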
This is the most unique event of the day
Two unusual meetings that happened that day in the lab
Repeating Events are listed in small size at the bottom
Mouse-Over will start playback of that Event, while
highlighting the time of that Event: this event
(meeting a friend in the Skylon Hotel lobby) happened in
the evening, for about 1.2 hours
Talking with Gareth lasted only 10 minutes,
in the morning
Working for the main part of the
morning: 1.2 hours
Then my last desk-work of
the day (2 hours) just after
lunch time
My lunch break
My dinner time
Conclusion:
• More relaxed, interactive, inviting
summary of the day than fast-forwarding,
while still taking advantage of the playback
synergy effect
• Playing each of the Events in its
location might also be good (without
having to Mouse-Over)
• ‘Importance’ is conveyed not by playing more
images of that Event (this skews
time), but by a larger image size
Papers written
• “Exploiting context information to aid landmark detection in
SenseCam images”, submitted to ECHISE, the 2nd International
Workshop on Exploiting Context Histories in Smart Environments:
Infrastructures and Design, to be held at the 8th UbiComp, Sept. 2006,
Irvine, CA, USA;
• “Structuring a Visual Lifelog Diary by Automatically Linking Events”,
submitted to the 3rd ACM Workshop on Capture, Archival and Retrieval
of Personal Experiences (CARPE 2006), October 2006, Santa
Barbara, California, USA;
• “Organising a daily visual diary using multi-feature clustering”,
submitted to SPIE Electronic Imaging, San Jose, January 2007;
Future Work
EVERYTHING !