A Snapshot Of MSR: 2005 Daniel T. Ling Corporate Vice President Microsoft Research Microsoft Corporation.

A Snapshot Of MSR: 2005
Daniel T. Ling
Corporate Vice President
Microsoft Research
Microsoft Corporation
Microsoft Research 2005
Founded in 1991
Staff of 750 in over 55 areas
Internationally recognized research teams
Small part of Microsoft's overall R&D (~$6 billion)
Research locations
Redmond, Washington
San Francisco, California
Cambridge, United Kingdom
Beijing, People’s Republic of China
Mountain View, California
Bangalore, India
MSR Mission Statement
Expand the state of the art in each of the
areas in which we do research
Rapidly transfer innovative technologies
into Microsoft products
Ensure that Microsoft products have a
future
Wide Range Of Activities
Work with product groups
Program management team
TechFest in March
Participate in Research Community
Extensive publication and conference participation
Professional service - DARPA, NSF, NRC
Strong ties with universities
Joint research projects
Extensive visitor and speaker program
Students, faculty, research scientists
Post-docs, sabbaticals, interns
Community Events
Faculty Summit
DC Tech Fair
Open Day at MSR Cambridge
21st Century of Computing in Beijing
Workshops in specific areas
Email and Anti-Spam Conference (Stanford)
Social Computing Symposium (Redmond)
Conference on Converging Sciences (Trento, Italy)
UW / MSR Summer Institute
Data Mining, Invisible Computing, Software Tools, Specifications,
Security, Testing
2005: Biological and Computation Perspectives on Intelligent
Systems; Handling Imprecise Information
Inventing The Future…
Platform Elements
Networking, Distributed systems, Operating systems
Cellphone and other Devices
Sensor networks
Security, Protection against Malware
Reinventing Software Development
Languages, tools, compilers
Data and Documents
Data Solutions for a Terabyte World
Search
Fighting SPAM
UI and Collaboration
New UI – Speech, Ink, Gesture, Natural Language
Meetings and Collaboration
Modeling of People and Groups
Media
Graphics and Multimedia
Digital Photography and Video
Science
AIDS Vaccine, Quantum Computing, Astronomy
Algorithms, Cryptography
Susan Dumais
Shaz Qadeer
Ken Hinckley / Johnson Apacible
Nebojsa Jojic
Microsoft Research
Personalized Search
Susan Dumais
Senior Researcher
Adaptive Systems and Interaction
Search … Your Way
Stuff I’ve Seen (SIS)
Unified search over your content (mail, files, web, calendar,
contacts, music, notes, rss, etc.)
Try (something like) it, MSN Desktop Search
(http://toolbar.msn.com)
Memory landmarks
[Figure: Memory Landmarks view, shown in Stuff I’ve Seen and MSN-DS]
NewsJunkie
Monitoring ongoing news events
Identify stories that are novel, given what you’ve
already read
[Figure: NewsJunkie example for the "Pizza delivery man w/ bomb incident" story: Novelty Score plotted against Articles Ordered by Time, with labeled articles "Friends say Wells is innocent", "Looking for two people", "Copycat case in Missouri", "Gun disguised as cane"]
Personalized Web Search … Today’s focus
Personalized Web Search (PS)
(w/ Jaime Teevan and Eric Horvitz)
Web Search
All users get the same results,
independent of previous search history,
current context, etc.
Personalized Web Search
Personalize search results, using rich
client-side information
Personal content (e.g., MSN-DS
index), activities
No profile setup or maintenance required
All profile storage and processing client-side, for improved privacy
User control over amount of
personalization
Web Search
Personalized Web Search
Personalized Search Demo
PS: Overview
Step 1: Retrieve web search results, n >> 10
Step 2: Compute similarity (result, user) using the User Model
Step 3: Re-rank search results
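The three steps above can be sketched roughly as follows. This is a minimal illustration, not the deployed system: the `tokenize` helper, the `(title, snippet)` result format, and the `user_weights` dictionary standing in for the user model are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real tokenization."""
    return text.lower().split()


def rerank(results, user_weights):
    """Re-rank results by the summed user-model weight of their terms.

    results: list of (title, snippet) tuples in the original web order.
    user_weights: dict mapping term -> weight (hypothetical user model).
    """
    scored = []
    for rank, (title, snippet) in enumerate(results):
        tf = Counter(tokenize(title + " " + snippet))
        score = sum(count * user_weights.get(term, 0.0)
                    for term, count in tf.items())
        scored.append((score, rank, title))
    # Higher score first; ties keep the original web order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, _, title in scored]


# Hypothetical data: a user whose content is mostly about wildlife.
results = [
    ("Jaguar cars", "luxury car dealer"),
    ("Jaguar habitat", "big cat rainforest"),
]
user_weights = {"cat": 2.0, "rainforest": 1.0}
reranked = rerank(results, user_weights)
```

With an empty user model every score is zero, so the original web order is preserved.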
PS: Theoretical Framework
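$\mathrm{Score} = \sum_i tf_i \cdot w_i$
World (corpus statistics $N$, $n_i$):
$w_i = \log \frac{N}{n_i}$
Client (user statistics $R$, $r_i$):
$w_i = \log \frac{(r_i + 0.5)(N - n_i - R + r_i + 0.5)}{(n_i - r_i + 0.5)(R - r_i + 0.5)}$
$w_i = \log \frac{(r_i + 0.5)(N' - n'_i - R + r_i + 0.5)}{(n'_i - r_i + 0.5)(R - r_i + 0.5)}$
Where: $N' = N + R$, $n'_i = n_i + r_i$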
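The term weights on this slide can be computed directly. The sketch below assumes natural logarithms (the slide does not state a base) and uses hypothetical function names; it only mirrors the formulas, not any production scoring code.

```python
import math


def idf_weight(N, n_i):
    """w_i = log(N / n_i): weight from world (corpus) statistics only."""
    return math.log(N / n_i)


def feedback_weight(N, n_i, R, r_i):
    """Relevance-feedback weight using client counts R and r_i."""
    numerator = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    denominator = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)


def client_weight(N, n_i, R, r_i):
    """Same weight with N' = N + R and n_i' = n_i + r_i substituted."""
    return feedback_weight(N + R, n_i + r_i, R, r_i)


def score(tf, weights):
    """Score = sum over terms of tf_i * w_i."""
    return sum(count * weights.get(term, 0.0) for term, count in tf.items())
```

After the substitution the expression simplifies algebraically to log((r_i + 0.5)(N - n_i + 0.5) / ((n_i + 0.5)(R - r_i + 0.5))), which is a convenient check on the implementation.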
PS: Evaluation
How well does it work?
Rich space of algorithmic and UI possibilities
Experiment:
Participants judge top 50 results, 137 queries
User Model
No Profile < Query history < Web SIS < Recent SIS < All SIS
Document Model
Full document in results set < Snippets in results set
PS score + Web rank, even better
Internal deployment ongoing
Search … Your Way
Example systems
Stuff I’ve Seen -> MSN Desktop Search
NewsJunkie
Personalized Web Search
Questions / Comments ?
Contact information:
Susan Dumais
Senior Researcher
Adaptive Systems and Interaction Group
http://research.microsoft.com/~sdumais
Finding Concurrency Bugs in
Systems Software
Shaz Qadeer
Software Productivity Tools
Concurrency Is Important
Critical software
Operating systems, databases, embedded
software (e.g., flight control, handheld
devices, cell phones)
Single-chip multiprocessors will become
common
Software running on these chips will be even
more concurrent
Concurrent Systems Code Is
Complicated!
Shared memory between threads
Race conditions: some interleaving of
concurrently enabled actions causes an error
Data races: data being read by a thread might
be trashed by concurrent write of another thread
Reference counting bugs: a thread might
access a resource already freed by another
thread, memory leaks
IRP (I/O Request Packet) cancellation bugs:
unexpected cancellation of an IRP violates the
state machine of IRP
Concurrency Analysis Is Difficult (1)
Finite-data single-procedure program
n lines
m states for global data variables
1 thread
n * m states
K threads
n^K * m states
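A quick worked example of how fast this bound grows. The numbers (100 lines, 8 global states) are made up for illustration, and `state_bound` is a hypothetical helper name.

```python
def state_bound(n, m, k=1):
    """Upper bound on states: n**k program-counter combinations times m global states."""
    return n ** k * m


single_thread = state_bound(100, 8)        # 100 * 8
three_threads = state_bound(100, 8, k=3)   # 100**3 * 8
```

Going from one thread to three multiplies the bound by n squared, here a factor of 10,000.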
Concurrency analysis is difficult (2)
Finite-data program with procedures
n lines
m states for global data variables
1 thread
Infinite number of states
Can still decide assertions in O(n * m^3)
K ≥ 2 threads
Undecidable!
KISS: A Static Checker For Concurrent
Software
Has found a number of concurrency errors
in NT device drivers
Key new ideas
Technique to use any sequential checker to
perform concurrency analysis
Current implementation on top of Static Driver
Verifier
Find all errors that can manifest in a small
number of context-switches
[Figure: execution timeline annotated "Many steps later…", "Context switch", "Many steps later…", "A few steps later…", "Context switch"]
Data Races In DDK Drivers
Device extension shared among threads
Data races on device extension fields
two threads concurrently accessing a field
at least one access is a write
Driver              #fields without races   #fields with races
Tracedrv            3                       0
Moufiltr            7                       0
Kbfiltr             7                       0
Imca                4                       1
Startio             9                       0
Toaster/toastmon    7                       1
Diskperf            14                      0
1394diag            17                      1
1394vdev            17                      1
Fakemodem           31                      6
Toaster/bus         22                      0
Serenum             21                      2
Toaster/func        17                      5
Mouclass            32                      1
Kbdclass            33                      1
Mouser              27                      1
Fdc                 54                      9
KISS: A static checker for concurrent software
Concurrent program P -> KISS -> Sequential program Q -> SDV
SDV output: no error found, or an error in Q
Error in Q indicates error in P
KISS Insight
Many subtle concurrency errors manifest
themselves in executions with few context
switches
Analyze all executions with a small
number of context switches
KISS Strategy
Concurrent program P -> KISS -> Sequential program Q -> SDV
Q encodes executions of P with small
number of context switches
instrumentation introduces lots of extra paths
to mimic context switches
Leverage all-path analysis of sequential
checkers
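To make the "small number of context switches" idea concrete, here is a toy explorer that enumerates every schedule of a few threads with a bounded number of context switches and checks a final-state property. It is not the KISS translation (which rewrites P into a sequential program Q for a sequential checker); it is only a brute-force sketch of the same search bound. All names are hypothetical, and in this sketch moving on from a finished thread also counts as a switch.

```python
def schedules(lengths, max_switches):
    """Yield thread-id sequences that run all threads to completion
    using at most max_switches context switches."""
    def extend(prefix, remaining, current, switches):
        if all(left == 0 for left in remaining):
            yield list(prefix)
            return
        for tid, left in enumerate(remaining):
            if left == 0:
                continue
            cost = 0 if current is None or tid == current else 1
            if switches + cost > max_switches:
                continue
            remaining[tid] -= 1
            prefix.append(tid)
            yield from extend(prefix, remaining, tid, switches + cost)
            prefix.pop()
            remaining[tid] += 1
    yield from extend([], list(lengths), None, 0)


def find_violation(threads, make_state, check, max_switches):
    """Return the first schedule whose final state fails check, else None."""
    for schedule in schedules([len(t) for t in threads], max_switches):
        state = make_state()
        pcs = [0] * len(threads)
        for tid in schedule:
            threads[tid][pcs[tid]](state)  # run the thread's next atomic step
            pcs[tid] += 1
        if not check(state):
            return schedule
    return None


def make_incrementer(tid):
    """A non-atomic x += 1: read x into a local, then write local + 1."""
    def read(state):
        state["tmp"][tid] = state["x"]
    def write(state):
        state["x"] = state["tmp"][tid] + 1
    return [read, write]


threads = [make_incrementer(0), make_incrementer(1)]

def make_state():
    return {"x": 0, "tmp": [0, 0]}

def final_is_two(state):
    return state["x"] == 2

no_bug = find_violation(threads, make_state, final_is_two, max_switches=1)
bug = find_violation(threads, make_state, final_is_two, max_switches=2)
```

With one switch the two increments run back to back and no error is found; allowing two switches exposes the lost update, which matches the slide's point that few context switches are often enough.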
What does KISS stand for?
Smartphlow for Smartphone
Eric Horvitz (on sabbatical)
Ken Hinckley
Johnson Apacible
Adaptive Systems & Interaction
Smartphlow For Smartphone
Uses machine learning techniques to predict traffic flow
Predict how long until jams will appear
Predict how long before traffic jams will disappear
Prototype has ~3,000 active users
Smartphlow Fuses Multiple Sources
Traffic data
Weather
Holidays & Major Events
Incident reports
INCIDENT INFORMATION
Cleared 1637: I-405 SB
JS I-90 ACC BLK RL CCTV
1623 – WSP, FIR ON SCENE
• Event store
• Learning
• Reasoning
From Data to Predictive Models
Data store, user logs
Predictive models
system-wide status & dynamics
Incident reports
sporting events
weather
time of day
day of week
season
holidays
UAI paper to appear next week
Search over directed acyclic graph using
Bayesian information criterion
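The slide says only that structure search is scored with the Bayesian information criterion. As a rough sketch of what such a score looks like for one node and a candidate parent set over discrete data, assuming the common form "log-likelihood minus (parameters / 2) * log(sample size)": the variable names (`rain`, `jam`) and the data are invented, and the parameter count uses only the parent configurations observed in the data, which is a simplification.

```python
import math
from collections import Counter


def bic_family_score(rows, child, parents):
    """BIC contribution of one node given a candidate parent set.

    rows: list of dicts mapping variable name -> discrete value.
    """
    n = len(rows)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    # Maximum-likelihood log-likelihood of child given its parents.
    loglik = sum(count * math.log(count / parent_counts[config])
                 for (config, _), count in joint.items())
    child_states = len({r[child] for r in rows})
    # Simplification: count only parent configurations seen in the data.
    n_params = (child_states - 1) * len(parent_counts)
    return loglik - 0.5 * n_params * math.log(n)


# Hypothetical data in which jams follow rain exactly.
rows = ([{"rain": 1, "jam": 1}] * 50) + ([{"rain": 0, "jam": 0}] * 50)
with_parent = bic_family_score(rows, "jam", ["rain"])
without_parent = bic_family_score(rows, "jam", [])
```

A search over directed acyclic graphs would compare scores like these while adding, removing, or reversing edges; here the edge from `rain` to `jam` scores far higher than no edge.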
Smartphlow User Interface
UI designed for quick glances at screen
Quick overview of traffic status
Smartphlow User Interface
UI designed for quick glances at screen
Red clock: how long for jam to dissipate
(full circle = 1 hour)
Smartphlow User Interface
UI designed for quick glances at screen
Surprise (!) jam notification
Small Screen Navigation
9 keys -> zoom to 9 zones of screen
Zoom & flow animations maintain context
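One way to picture the nine-zone zoom is a 3x3 grid over the current view. The sketch below assumes a phone keypad layout (1-2-3 on the top row, 7-8-9 on the bottom) and an `(x, y, width, height)` rectangle; both are assumptions, and `zone_rect` is a hypothetical name.

```python
def zone_rect(key, rect):
    """Return the sub-rectangle (x, y, w, h) selected by keypad key 1-9."""
    if not 1 <= key <= 9:
        raise ValueError("key must be in 1..9")
    x, y, w, h = rect
    row, col = divmod(key - 1, 3)  # key 1 -> (0, 0), key 9 -> (2, 2)
    return (x + col * w / 3, y + row * h / 3, w / 3, h / 3)


screen = (0, 0, 300, 300)
center = zone_rect(5, screen)
```

Applying `zone_rect` again to the returned rectangle zooms one level further into the same scheme.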
Bayesphone: Context-sensitive communications
Caller ID, Context -> Call handling cost-benefit analysis -> Ring or Voice Mail
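A cost-benefit decision of this kind can be sketched as an expected-cost comparison. Everything below is hypothetical: the slide gives no model details, so the probability of being busy, the interruption costs, and the deferral cost are illustrative inputs only.

```python
def handle_call(p_busy, cost_interrupt_busy, cost_interrupt_free, cost_defer):
    """Ring if the expected cost of interrupting is below the cost of deferring.

    p_busy: estimated probability (from context) that the user is busy.
    cost_defer: cost of sending this caller to voice mail (from caller ID).
    """
    expected_interrupt = (p_busy * cost_interrupt_busy
                          + (1 - p_busy) * cost_interrupt_free)
    return "ring" if expected_interrupt < cost_defer else "voice mail"


# Likely in a meeting, routine caller: defer.
routine = handle_call(0.9, 10, 1, 5)
# Same context, but deferring this caller is very costly: ring.
urgent = handle_call(0.9, 10, 1, 20)
```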
The epitome of a virus:
Combating HIV with machine learning
Nebojsa Jojic
Microsoft Research
Collaborators
Vladimir Jojic, Microsoft/U Toronto
Carl Kadie, Microsoft
Jennifer Listgarten, Microsoft/U Toronto
Chris Meek, Microsoft
Brendan Frey, Microsoft/U Toronto
Bette Korber, Los Alamos National Laboratory
Christian Brander, Harvard/MGH
Nicole Frahm, Harvard/MGH
Simon Mallal, Royal Perth Hospital
Jim Mullins, University of Washington
Epitome as a model of diversity
in natural signals
A set of image patches
Input image
Epitome
Using the epitome for recognition
The smiling point
Epitome of 295 face images
Images with the
highest total
posterior at the
“smiling point”
Images with the
lowest total
posterior at the
“smiling point”
Epitomes May Also Allow Some Variability
Epitome e:
Mean 
Variances 
Epitomes Can Be Computed For Ordered Datasets
(e.g., 1-D Arrays, or 2-D, 3-D, or N-D Matrices)
With Arbitrary Measurement Types:
Intensities
R, G, B values
Gradient values
Wavelet coefficients
Spectral energies
Nucleotide or amino acid content
…
We even played with text and MIDI files
An Epitope Presented By
An MHC-I molecule
MHC-I Molecule
Peptide
Immune System Response
The Map Of HIV
From http://www.mcld.co.uk/hiv
(A simplified version of the LANL detailed map)
HIV diversity (LANL database)
HIV is encoded in an RNA sequence of about 10000 nucleotides,
divided into several genes. NEF is one of the shorter and moderately
variable ones.
The NEF length in the strain
The 73 nucleotides of the NEF gene
Note the insertions, deletions and mutations. A triplet of nucleotides encodes
one amino acid. A change in a single amino acid may lower the cellular
immunity to the virus in one patient and increase it in another.
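The triplet-to-amino-acid mapping can be illustrated with the standard genetic code. The short RNA strings below are made-up examples, not data from the slides, and `translate` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna):
    """Translate an RNA (or DNA) string codon by codon; '*' marks a stop."""
    dna = rna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = translate("AUGGGUGGCAAGUGG")
mutated = translate("AUGGGUGGCAGGUGG")  # one nucleotide changed (A -> G)
```

A single-nucleotide change in the fourth codon (AAG to AGG) swaps one amino acid, lysine (K) for arginine (R), while the rest of the peptide is unchanged.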
Known Epitopes In A Part Of HIV’s Gag Protein
Epitopes In Variable Regions
Colors signify different human immune types
A Vaccine For HIV/AIDS
Typical vaccines are near copies of the virus that is
being vaccinated against
HIV mutates at a high rate – can’t use traditional
techniques
Machine learning allows us to build a compact
“pseudo-virus” that covers the diversity of HIV
(or rather a pseudo-protein that covers the diversity of
a particular HIV protein)
This pseudo-protein, which we call the epitome, is
much shorter than the concatenation of all strains
The Epitome Of A Virus
Colors: different patients
Sequence data
VLSGGKLDKWEKIRLRPGGKKKYKLKHIVWASRELERF
LSGGKLDRWEKIRLR KKKYQLKHIVW KKKYRLKHIVW
Epitome
Machine Learning Approach to
Vaccine Design
Use sample HIV strains from multiple patients
Build models that compactly encode as many epitopes (or
likely epitopes) as possible
Learning techniques
Myopic
Split and merge
Expectation Maximization
Coverage of all 10aa blocks from 245 Gag proteins (Perth data)
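Coverage of this kind can be measured by asking what fraction of the length-k blocks found in the patient sequences also appear somewhere in the epitome. The sketch below counts distinct blocks, which is an assumption (the slide does not say whether repeats are weighted), and uses toy strings rather than the Perth data.

```python
def blocks(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def coverage(epitome, sequences, k=10):
    """Fraction of distinct length-k blocks in sequences found in epitome."""
    targets = set()
    for seq in sequences:
        targets |= blocks(seq, k)
    if not targets:
        return 0.0
    return len(targets & blocks(epitome, k)) / len(targets)


# Toy example with k=3: blocks ABC and BCD are covered, CDX is not.
toy = coverage("ABCDE", ["ABCD", "BCDX"], k=3)
```

A learning procedure such as those listed above would search for an epitome of fixed length that pushes this fraction as high as possible.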
We Are Also Working On:
Epitope prediction
Evolution and immune pressure modeling
and inference
Wet lab confirmation experiments
(Harvard)
Looking Forward
Moore’s Law, bandwidth improvements mean
continued dramatic improvements in
computing
MSR is an environment for collaboration and
excellence in computer science research
MSR works actively with the research
community
MSR researchers are building technologies
for MS products that will enable this future
© 2004 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.