Advanced Topics in Computer Systems (ACS, R01) 12th October 2010 Steven Hand Welcome and Introductions • Welcome to Cambridge, the ACS and R01! • First, everyone.

Advanced Topics in Computer
Systems (ACS, R01)
12th October 2010
Steven Hand
Welcome and Introductions
• Welcome to Cambridge, the ACS and R01!
• First, everyone should introduce themselves
– Aim for 1-2 minutes
• Things to cover:
– Your name, and educational background;
– Your general areas of research interest; and
– Your biggest or best or most interesting project
(hw or sw) or other technical achievement.
• Add other topics if you wish!
What is Systems Research?
• Typically systems research is:
– Practical (motivated by real-world issues)
– Low-level (little maths or formalism)
– Pragmatic (good enough is good enough)
– Real (systems people build stuff)
• Pretty broad area: operating systems, filesystems, databases, distributed systems,
language runtimes, system security…
ACM SOSP 2009
• Premier systems conference is SOSP; runs every
two years (last was just under a year ago...)
• Technical program has 23 papers…
• “FAWN: A Fast Array of Wimpy Nodes”
– Basic idea: investigate trade-off between power
consumption versus computing power
– Build a cluster of nodes with 500MHz AMD Geode
CPU, 256MB RAM, 4GB flash & 100Mb/s ethernet
– Show “better” (in terms of #ops per Joule) than
similar system with conventional (quad core) nodes.
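To make the “#ops per Joule” metric concrete, here is a minimal sketch. The throughput and power figures are made-up placeholders, not measurements from the FAWN paper, and ops_per_joule is a hypothetical helper named only for this illustration.

```python
def ops_per_joule(ops_per_second: float, watts: float) -> float:
    """Operations completed per Joule of energy: (ops/s) / (J/s)."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return ops_per_second / watts


# Placeholder figures for illustration only (NOT numbers from the paper).
wimpy = ops_per_joule(ops_per_second=1_000, watts=4)
brawny = ops_per_joule(ops_per_second=20_000, watts=200)

# A node can be far slower in absolute terms and still win on this metric.
print(f"wimpy: {wimpy:.0f} ops/J, brawny: {brawny:.0f} ops/J")
```

The point of the metric is that raw speed and energy efficiency are separate axes: a slower node wins whenever its power draw falls faster than its throughput.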
ACM SOSP 2009
• “RouteBricks: Exploiting Parallelism to Scale Routers”
– Investigates if commodity servers can be used to build a
high-performance router
– Demonstrate 35Gb/s throughput with four off-the-shelf
(well, Nehalem) servers
• “The Multikernel: A New OS Architecture for Scalable
Multicore Systems”
– Argue that shared-by-default is the wrong way to build
OSes for emerging multicore hardware
– Instead suggest separate kernel per-core, and explicit
message passing (RPC) to co-ordinate
– Build a prototype (“Barrelfish”) and demonstrate that it
scales well, and performs comparably with Linux
ACM SOSP 2009
• “Fast Byte-Granularity Software Fault Isolation”
– Describes a new scheme to protect the OS kernel from
buggy or malicious kernel modules (drivers)
– Compiler back-end to infer types for driver code, plus runtime system to enforce isolation
– Not a totally new idea: this is about how to make it fast!
• “Automatically Patching Errors in Deployed Software”
– Monitors Windows binaries, and “learns” invariants (i.e.
normal behavior)
– If get a crash, work out which invariant(s) were violated
– Automatically generate a patch which avoids this (e.g.
input validation, new error handling path, …)
ACM SOSP 2009
• “Better I/O Through Byte-Addressable Persistent
Memory”
– Considers how to build a file-system if you have PCM
(flash++) directly on the memory bus
– Key insight is you can do updates in place, i.e. no extra
copies in kernel memory, no cascades
– Show 2x perf compared to using a RAMdisk
• “Operating System Transactions”
– Add transactional behavior to system calls (to get e.g.
‘automatic’ error recovery, better security (TOCTTOU))
– Build ‘TxOS’ (Linux with tx support for 50% of syscalls)
– Show it works, and adds approx 10% overhead
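For readers who have not met TOCTTOU (time-of-check-to-time-of-use) before, the sketch below shows the racy check-then-use pattern that transactional system calls aim to rule out. It does not use any TxOS interface; check_then_read, the interleave hook and the file name are hypothetical, and the hook simply simulates another process running between the two system calls. A real attack would more likely swap a symlink than rewrite the file.

```python
import os
import tempfile


def check_then_read(path, interleave=lambda: None):
    """Racy pattern: the check and the use are two separate system calls."""
    if not os.access(path, os.R_OK):   # time of check
        raise PermissionError(path)
    interleave()                       # another process may run here
    with open(path) as f:              # time of use
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "report.txt")
        with open(target, "w") as f:
            f.write("harmless")

        def attacker():
            # Change what the path refers to after the check has passed.
            with open(target, "w") as f:
                f.write("swapped")

        return check_then_read(target, attacker)


print(demo())
```

If the check and the use ran inside one transaction, the attacker's write could not land between them, which is the kind of security benefit the slide alludes to.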
ACM SOSP 2009
• “seL4: Formal Verification of an OS Kernel”
– Formally verify a version of the L4 microkernel
– Abstract spec in Isabelle/HOL, “executable spec” in
Haskell, and real version in C and assembly
– Show exec ‘refines’ abstract, and real refines exec
– (Incredibly impressive work!)
• “UpRight Cluster Services”
– Argue that BFT is cool/important, but too complex
– Build a library which ‘automatically’ allows you to add
BFT functionality to your code, and show it works
Phew!
• That’s just a selection of 9 papers – others
include robust sensor networks, schedulers for
distributed computing, using machine learning
to thwart identity attacks, and much more…
• So what should we learn from this?
1. “Systems” really does cover a lot of stuff
2. It changes over time as the world changes (e.g.
lots on multi-core this year)
3. We need to read a lot to get up to speed
This Course
• Aims to start off the process of reading systems
research papers
• We will cover a lot, but still a tiny fraction of
the whole space – you’ll need to do that in
your “spare time” over the rest of your career!
• The most important thing – and the primary
goal of this course – is to understand how to
read a systems paper…
Critical Thinking
• Reading a research paper (and particularly a
systems one) is not like reading a text book
– For lots of reasons…
• But the most important one is that the paper is
not necessarily “the truth”
– There’s no right and wrong, just “good” and “bad”
– These are inherently subjective qualities… but you
can’t get away with just your opinion: must argue.
• Critical thinking is the skill of marrying subjective
and objective judgment of a piece of work.
An Example: ApacheML
• A researcher builds a web server in a type-safe language (in this case, OCaml)
• They argue that this will make the software
less vulnerable to bugs (null pointers, buffer
overruns, format violations, etc)
• They build a prototype and compare it to
Apache; the prototype adds 10% latency, and
scores only 20% less well on SpecWEB
• A “good” piece of research / a good system?
First let’s argue for…
• What’s the problem?
– Existing systems software (e.g. web servers) is buggy and insecure
• Why is it important?
– Security vulnerabilities cost billions of dollars every year!
• Why isn’t it solved by previous work?
– Many such vulnerabilities come from exploits which target code
written in unsafe languages such as C or C++; yet people continue to
write systems in these languages because of performance
– Traditional testing and code review don’t catch all the bugs
• What’s the approach?
– Use a modern type-safe language
• Why is this novel/innovative?
– Previous language run-times were super slow; but this work leverages
new compiler techniques and run-time support (which we built, and
will explain) to maintain high performance
And now against :-)
• Problem is overstated (or “oversold”)
– Security vulnerabilities do cost money, sure, but the ‘billions’ is across
the board, and includes viruses, worms, phishing, etc – buffer
overruns in the web server are a tiny part of the problem
• Problem doesn’t exist
– Apache, IIS etc have been battletested and all known bugs fixed; this is
the only way to be sure anyway (e.g. runtime bug?)
• Approach is broken
– Only 30 people in the world can write OCaml (and only 10 can read it!)
• Solution is insufficient:
– Dude, 10-20% hit on performance is not acceptable!
• Evaluation is unfair/biased:
– Apache includes support for X, Y, Z – your prototype is just a toy!
– (And if you fixed this, your performance would be worse!)
So which is the “right” answer?
• There isn’t one!
– All (or most) of these arguments are (mostly) correct…
• So it’s chiefly a question of which ones ‘feel’ more
valid to you – a complex and subjective thing
• In this course, we’ll be reviewing a selection of
21 papers (3 per week)
– Cover all sorts of topics, and span 1982 to 2010
– All of these were peer-reviewed and published, so
should assume at least some merit there…
– However you get to decide whether you like or dislike
the paper, and make arguments either way (or both!)
Hints for Computer System Design
• You’ll hopefully have seen this on the pre-module reading list (or on the web page)
• If you’ve not read it already, I strongly
recommend you do (at some stage).
• Basically a collection of “wisdom” from one of
the top systems guys in the world (Lampson)
• It’s not a typical systems research paper
– The problem identified is vague; the solutions are
general; and there’s no evaluation!
Key Insights from the Paper
• Designing a system is not like designing an
algorithm:
– Much less well specified (only have general
requirements), so huge amount of freedom
– And much more difficult to measure success
– (these are the main reasons ‘right’ and ‘wrong’ don’t
work for systems; need critical thinking)
• And often have fundamental tradeoffs:
– Simplicity versus Functionality
– Performance versus Robustness
– Throughput versus Latency
Some “Hints” are now [in]Famous
• Most famous is “the end-to-end principle”
• Often used as an argument to justify design
decisions for the Internet
• Actually is more like a “holy bludgeon”
– Used by e2e zealots to argue against anything; just
need to choose suitable ends!
• In practice, it’s still a useful principle, but cannot
be followed blindly
• (Same is true for all the other hints in the paper)
Using the Hints
• In general, these are most useful when thinking
about a system design
– Will be most useful for your future research projects
and essays, etc
• For this course, you’ll mostly be looking at other
people’s systems…
• … so another way of using the hints is as a set of
questions you can ask about the paper.
• Not always possible: often you won’t have
enough information about the implementation
So how do you read/review a paper?
• Do a first pass (5-10 mins): read title, abstract, intro
and conclusions. Aim to get a general idea of the paper.
• Next, sit down with a pen, and start reading
– make notes (‘!’, ‘huh?’, full sentences, …) as you go
• Try to identify the following key things:
– What is the problem?
– What is the solution / approach?
– How does it compare with previous work?
– (How well) Does the system work?
• Most of the above should be fairly objective (i.e. most
people should get similar answers)
Now for the fun bit
• After reading the paper, decide if you like it
– Make a judgment!
– Do this immediately after finishing reading the paper
(write a few sentences on the last page)
• Now put the paper aside, and take a break (or go
on to the next one)
• Finally: write up your review (in < 1000 words)
– You must use the form on the course web page
– The idea is to try and capture both your objective and
your subjective responses to the paper…
Parts of the Review
• 1. Paper Summary (no more than 250 words)
– Provide a brief summary of the paper (3-5 sentences)
– The aim is to prove you’ve read (and understood!) the
paper, so try to paraphrase and extract the essentials.
– At this stage you should try to be objective
• 2. The Problem
– What is the problem? Why is it important? Why is
previous work insufficient?
– (1 or 2 sentences for each answer should be sufficient)
– Once more you should try to be objective, i.e. report
what the authors say in the paper.
Parts of the Review
• 3. The Solution or Approach
– What is their approach? How does it solve the
problem? How is the solution unique and/or
innovative (if it is)? What are the details?
– Again rely on the paper itself to answer these; but
don’t just regurgitate it! Paraphrase & synopsize.
– Usually 5-10 sentences will be enough.
• 4. Evaluation
– How do they evaluate their solution? What questions
do they answer? What are the strengths / weaknesses
of (a) the system? (b) the evaluation itself?
– Aim for 3-4 sentences here.
Parts of the Review
• 5. What do you think?
– Here you finally get to explain your opinion!
– You should aim to give a ‘judgment’ on the work
(and on the paper); and you should attempt to
back it up with arguments (logical or rhetorical).
– This should be at least 3 sentences, but can be
more as required (subject to the total word limit)
• 6. Questions for the authors
– List one or two questions you’d like to ask
Reviewing Tips & Tricks
• While reading, you need to absorb what the
paper says, but try also to ask yourself:
– Is this really true?
– Does this argument make sense?
– Does this evaluation really support the claims?
• This is about being critical, not negative
– Be prepared for the paper to be wrong, but don’t
assume it is
– (Just like you shouldn’t assume it’s right!)
• This will take practice, but will get easier over
time (and for topics you’re more expert in)
Presentations
• As if reviews were not enough, each of you will
also do some presentations in this course!
– (In fact, from next week most of the time will be you
up here presenting, and not me!)
• Each presentation should be 12-15 minutes long,
and should be given using a computer
– You can use your own laptop, or bring a USB stick with
your powerpoint or PDF file
• You can revise your presentations after you’ve
given them, and then you submit the final
versions after the end of the course.
Structure of a Presentation
• You need to cover three things:
1. What is the background/context of the paper: what
motivated the authors? What else was going on in
the research community at the time? How have
things changed since?
2. What does the paper actually say? What’s the
problem they tried to solve? What are the key
ideas? What did the authors actually do? What were
the results?
3. What do you think about the paper? What’s good
and what’s bad? What are the key takeaways? What
was the impact (or what is the likely impact)?
As if this wasn’t hard enough…
• Each presentation assignment also specifies a
certain “flavor”: Advocate, Critic or Balanced
• All should follow the structure described but:
– An Advocate should emphasize the good points, and
spend less on the negatives; in essence you are trying
to take the role of the original authors, and convince
people of the paper’s merits
– A Critic should still present the work fairly, but
towards the end focus on the negative aspects:
essentially try to convince that the paper’s no good!
– A Balanced presentation should try to cover both the
good and the bad (but still arrive at some judgment).
Why do Presentations?
• The aims of you doing presentations are:
– Learning to structure an argument (even one you
don’t believe in!) - you have no choice over
whether you get to argue for or against
– Generating discussion: most papers will have
people arguing for and against.
– Getting you to go a bit more in-depth on some of
the 21 papers: becoming a bit of an expert.
– Practice for your future research career…
Presentation Guidance
• Don’t spend too long covering the basics:
remember, everyone will have read the paper
– You should of course give a brief overview, not least to
set up the rest of the talk
– But don’t just “repeat” the paper in slide form, and
don’t spend too long on the results
• The aim is to generate discussion, so you need to
“add value” over the paper itself:
– Explore the arguments they make, and the
conclusions they draw. What do you think?
– Make sure you at least try to match the specified
“flavor”, even though it may be challenging.
Doing the Presentation
• Practice beforehand to ensure it’s 12-15min
• Most of you will be nervous: that’s normal!
– Remember this is a friendly group of people in a
closed room… and everyone’s in the same boat
– Think of the presentation as a discussion/dialogue
between you and the audience
– (& practice beforehand to help settle your nerves)
• Try not to get defensive or angry at questions
– This is not your paper, it’s just a class ;-)
Being in the Audience
• You’re not just sitting there: you need to get
involved!
– To kick things off, you’ll ask the questions from your
reviews, so be sure to bring a copy with you
– Everyone should participate in discussion
• Always be respectful of the speaker
– Academics (and systems researchers in general) can
get quite passionate about arguments
– This is good, but needs to be about the arguments
and the material, not about the individual
– (and remember, you’ll be up there one week ;-)
Grading
• 84% of the marks are for your reviews
– You need to do at least one review every week, and a
total of 12 (if you do >12, we take the best)
– I recommend you read all three papers (at least in
outline), and then write reviews for 2 of them
– Aim to spend about 2 hours on each paper
• 16% of the marks are for your presentations
– Aim to spend about 8 hours on each presentation
(including more in-depth reading of the paper)
– You can revise your presentations after giving them;
the final version is due at the end of term
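The 84/16 split and the “best 12” rule can be sketched as below. This is only an illustration: the slide does not say how individual reviews or presentations are weighted, so the equal weighting, the 0-100 scale, scoring missing reviews as zero, and the name course_mark are all assumptions.

```python
def course_mark(review_marks, presentation_marks):
    """Combine marks (each 0..100) using the 84/16 split from the slide.

    Assumptions (not stated on the slide): the best 12 reviews count
    equally, missing reviews score zero, and presentations are averaged.
    """
    best = sorted(review_marks, reverse=True)[:12]   # keep the best 12
    review_avg = sum(best) / 12
    if presentation_marks:
        pres_avg = sum(presentation_marks) / len(presentation_marks)
    else:
        pres_avg = 0.0
    return 0.84 * review_avg + 0.16 * pres_avg


# Fourteen reviews submitted: the two weakest are dropped.
print(course_mark([70] * 12 + [40, 30], [60, 80]))
```

Under these assumptions each counted review is worth 7% of the total, which is why skipping a week's review costs more than a weak presentation slide.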
Final Matters
• You can find the papers on the web page; your reviews
need to be submitted by 12 noon on Monday
– Remember to bring a copy of at least your questions with you to
the next class.
• (You can work on presentations up to the last minute)
• The SRG Seminars are on Thursdays at 4pm, in either FW26
or LT2: try to come along if you can!
• The NetOS Group Meetings are Tuesdays at 1pm in FW11;
this week includes an OSDI’10 trip report by Derek Murray
• This talk, the papers, review forms, and other resources are
on the course web page:
– http://www.cl.cam.ac.uk/teaching/1011/R01
• Good luck!
