Transcript Research in System Administration
Washed by the Very Same Rain: System Administration Research
Alva L. Couch Tufts University [email protected]
Part I
What is research?
Who am I?
• Overseer of LISA: chair of steering committee, board liaison (since 2005).
• 14 LISA papers since 1996 (+ 2 students who submitted sole-author papers).
• 2 LISA “best paper” awards and 1 “best student paper” award since 1996.
• 2003 SAGE Professional Service Award (with Mark Burgess and Paul Anderson).
What is research?
• We all think we know, but
• popular accounts of the nature of research are misleading, and
• remain misleading throughout recorded history!
A popular misconception
• Einstein created the theory of relativity out of thin air.
• No one else could have done it but Einstein.
• Not!
• Einstein’s work began in a context of what was already known, and
• Several mathematicians (notably Minkowski) were working in the same context and concurrently trying to come up with their own explanations!
A song and a woven cloth
• This presentation is structured like the folksong “same rain” by American folk singer Pat Humphries, which has been considered by some as a paradigm for research and exploration.
• It is a cloth woven from threads inspired by my wife’s mentor Prof. Philip Morrison (of MIT) and his TV series “The Ring of Truth”, which discusses how scientists develop their ideas.
“We're all living in a great big dipper…”
• All research occurs in a context that includes
  – What work has been done before.
  – What community is interested.
  – What problems remain to be solved.
• Context is a moving target that can change rapidly over time.
“We’re all washed by the very same rain…”
• By definition, research doesn’t occur in a vacuum.
• If you see something important, chances are that a number of other people have seen the same thing.
• The difference is whether you do something about understanding what you see!
• Edison: 1% inspiration, 99% perspiration.
“We are swimming in the stream together…”
• Research is not about working alone, but rather about communicating ideas to a community that is exploring similar directions.
• The most important step is to identify your community (or communities).
• “Who are you swimming with?”
“Some in power and some in pain…”
• Failure is a crucial part of research.
• One’s hypothesis can be invalid.
• Even after one has believed it for years.
• Only by failing can one learn.
• Only by being open to failure can one become objective.
• I have more wrong ideas than right ones!
The usual formula for how to do research
• Determine context of the problem.
• Survey proposed solutions.
• Determine new directions to explore.
• Choose one direction to explore.
• Develop a hypothesis about the direction.
• Test that hypothesis.
• Evaluate the results of the test.
• Refine the hypothesis, and repeat.
Key elements of the formula
• Context: maintaining an idea of what you know and don’t know about a problem.
• History: keeping track of what you learn over time.
• Evidence: how what you see supports or refutes what you might think.
• Conversation: the ability to explain what you see to others.
An alternative formula
• Get excited about something.
• Commit to learning all that can be understood about it.
• Choose some small part of it to understand better.
• Write down your specific ideas about the nature of this part. This is your “hypothesis”.
• Test your understanding with observation. This is your “experiment”.
• Remain doubtful of unconvincing evidence, and curious about contradictory evidence.
• Refine yourself and then repeat!
Research versus learning
• Too often, research is mischaracterized as a discovery product, like finding a piece of gold in a gold mine.
• Most research is instead a learning process, where you learn something new about something you already see.
• The gold is not what you see, but what you learn.
Research redefined
• An active learning process…
• In which you explore what happens, and learn from the world…
• In a continuing conversation with a community of learning…
• In a changing and evolving context of observed phenomena and human needs…
• In which one risks being wrong, but learns and evolves from one’s mistakes.
The Ring of Truth
• My wife was the researcher for the TV series “The Ring of Truth”, which discusses the nature of science.
• Each show concentrates on some aspect of the scientific method: Looking, Change, Mapping, Clues, Atoms, and Doubt.
• Let’s map these ideas into system administration terms!
Looking
• The ability to look at something familiar and see something new.
• Burch and Cheswick, Tracing Anonymous Packets to Their Approximate Source, Proc. LISA 2000.
• A denial of service (DoS) attack is not always a bad thing, and one can use a structured DoS to identify perpetrators of other DoS’s!
Change
• The ability to embrace the idea that one’s understanding of the world – and the world – changes and improves over time.
• Finke, Manage People, Not Userids, Proc. LISA 2005.
• A revisitation of the same author’s previous paper on the subject, in which he explains how his understanding and practice improved over time and reversed some prior decisions.
Mapping
• The ability to use models and abstraction to understand the world.
• Couch, Wu, and Susanto, Toward a cost model for system administration, Proc. LISA 2005.
• A model of cost for helpdesks shows through simulation that helpdesks running near the limit of staff capacity experience chaotic changes in total value.
Clues
• The ability to look for and see clues toward new and different explanations of phenomena.
• Gross and Rosson, Looking for Trouble: Understanding End-User Security Management, Proc. CHIMIT 2007.
• The Windows firewall message “do you want to allow this connection” is semantically equivalent – in the minds of most users – to “do you want to get your work done or not?”
Atoms
• The ability to come to grips with what is knowable and what is unknowable.
• Burgess, Computer Immunology, Proc. LISA 1998.
• Centralized control systems depend upon “knowing the unknowable,” whereas physical systems such as the human body depend upon distributed and “more knowable” notions.
Doubt
• The ability to face and embrace one’s lack of understanding of complex phenomena.
• Evard, An Analysis of UNIX System Configuration, Proc. LISA 1997.
• Configuration management is often conceptualized as a simple choice between tools, but involves a more complex conflict between technical methods and human needs.
Part II
Steps toward engaging in research
Parts of becoming a researcher
• Engaging in active learning.
• Being open to doubt.
• Finding and maintaining context.
Aids to effective learning
• Keeping a personal journal of ideas, directions, hypotheses, experiments, conclusions, references.
• Breadth: documenting every idea you get.
• Depth: exploring one new direction at a time.
• Documenting each hypothesis and the evidence for and against it as soon as possible.
Persistence of memory?
• Don’t rely on your memory, no matter how good it is.
• Your understanding of the problem is a moving target.
• To teach other people what you learned, you need to recall what you didn’t know before!
Example: my journal
• Dated entries describe hypotheses, tests, results, ideas.
• In electronic form (plaintext).
• Ideas often turn out to be wrong.
• It is more important to have a record than to be correct.
• I never delete or edit an entry!
• This is not a publication; it is a starting point for one.
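A journal of this kind can be kept with very little tooling. The following is a minimal sketch, not the author's actual setup: the file layout, the `append_entry` and `read_entries` helpers, and the entry kinds are all assumptions made for illustration. It shows the two properties the slide names: entries are dated plaintext, and the file is only ever appended to, never edited.

```python
from __future__ import annotations

import datetime as dt
import tempfile
from pathlib import Path

# Hypothetical entry kinds, taken from the slide's list of what entries describe.
KINDS = ("hypothesis", "test", "result", "idea")


def append_entry(journal: Path, kind: str, text: str,
                 when: dt.date | None = None) -> str:
    """Append one dated line to the journal and return the line written."""
    if kind not in KINDS:
        raise ValueError(f"unknown entry kind: {kind!r}")
    when = when or dt.date.today()
    # Collapse whitespace so every entry occupies exactly one line.
    line = f"{when.isoformat()} [{kind}] {' '.join(text.split())}\n"
    # Mode "a" only ever adds to the end of the file; nothing is rewritten.
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


def read_entries(journal: Path) -> list[tuple[str, str, str]]:
    """Return (date, kind, text) tuples in the order they were written."""
    entries = []
    for line in journal.read_text(encoding="utf-8").splitlines():
        date, kind, text = line.split(" ", 2)
        entries.append((date, kind.strip("[]"), text))
    return entries


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "journal.txt"
        # Illustrative entries only; the wording is made up for this sketch.
        append_entry(path, "hypothesis", "Helpdesk load predicts ticket delay.")
        # A wrong idea is not edited; a later entry records what was learned.
        append_entry(path, "result", "Earlier hypothesis not supported by logs.")
        for entry in read_entries(path):
            print(entry)
```

Because there is deliberately no edit or delete function, an idea that turns out to be wrong stays in the record, and the correction is simply a later dated entry.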
Being open to doubt
• Doing research is about accepting that absolutely any idea you write down
  – is subject to continual validation, and
  – can turn out to be invalid at any time in the future.
• Each entry in the journal is a starting point for discussion, and not a fact.
• In mine, the “invalidated” entries outnumber the “validated” ones.
Finding context and community
• Several resources can aid you in beginning:
  – The Anderson taxonomy of system administration topics. Anderson and Patterson, “A Retrospective on Twelve Years of LISA Proceedings”, Proc. LISA 1999.
  – Book: Selected Papers in Network and System Administration (based upon the Anderson taxonomy).
  – Book: Handbook of Network and System Administration (beyond the Anderson taxonomy).
  – USENIX compendium of best papers (a testament to the “most interesting” topics and approaches).
• Google can help, but only if you already know the proper keywords!
Just as important: find community
• Your community: the people in this room.
• One often chooses a problem “for a community” rather than the other way around.
Essential skills of the researcher
• Focused reading.
• Documenting biases.
• Collecting evidence.
• Being open to surprises.
Focused reading
• A researcher doesn’t read a paper like a regular person.
• Reading occurs in a context.
• To answer specific questions.
The typical questions
• Relevance: is this work relevant to what I want to understand?
• Context: where did their understanding start (when their work began)?
• Results: where did their understanding end (when they finished this paper)?
• Doubt: what unknowns did they find?
Questions evolve!
• These are just a starting point.
• As you focus upon a topic, reading becomes more focused as well.
• E.g., “Is this relevant” becomes a question about a specific kind of relevance.
Part III
Examples
…(Ahem)…
• The original idea for this talk was to describe the whole “landscape” of system administration research and where things are today.
• I thought about this a bit and decided that it was too broad an objective.
• And it sounded a bit boring.
• So instead, I am going to show you several examples of how to build your own landscape of what’s important to you.
• And then, I’ll take requests!
How to build your own landscape
• Express your preconceptions honestly.
• Use focused reading to find evidence for or against your preconceptions.
• Weigh the evidence, reevaluate your preconceptions.
• When the literature fails to support or refute, it’s time to do your own experiment.
Some parts of the current landscape (some of what’s hot)
• Power-aware systems.
• Adoption of automation tools versus writing your own tools.
• Balancing security and business objectives.
• Integrated management of systems, knowledge, security, audit data.
• Dealing with various (existing and new) forms of spam.
• (and many others).
Power-aware systems
• No paper at LISA as yet.
• Two important posters at HotPower 2008:
• Srikantaiah, Kansal, and Zhao, Energy Aware Consolidation for Cloud Computing.
• Lu and Varman, Workload Decomposition for Power Efficient Storage Systems.
• Focused reading:
  – What is the problem?
  – What are the challenges?
  – How could this apply to system administration?
Adoption of automation tools
• This is a hard one.
• Let’s go digging:
  – Mentioned in my LISA 2005 talk “What is this thing called configuration management?”.
  – Lots of hallway conversations.
  – Lots of very indirect evidence.
  – Evidence scattered all over the universe, one sentence at a time.
• I didn’t say this was always easy.
Balancing security and business objectives
• Very few writings, but very controversial. One example:
• Beattie, Arnold, Cowan, Wagle, Wright, and Shostack, Timing the application of security patches for optimal uptime, Proc. LISA 2002.
• Focused reading:
  – What questions remain?
  – Are there analogies with other “best practices”?
Integrated management
• Lots of references with scattered ideas. One example:
• Wang, Verbowski, Dunagan, Chen, Wang, Yuan, and Zhang, STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support, Proc. LISA 2003.
• Focused reading:
  – What is the problem?
  – How does their approach work?
  – Can it be applied to Linux?
Spam
• A huge number of references with different strategies. One example:
• Singaraju and Kang, RepuScore: Collaborative Reputation Management Framework for Email Infrastructure, Proc. LISA 2007.
• Focused reading:
  – What kind of spam does this prevent?
  – What requirements are there?
  – What limitations are there?
And the votes are in!
• Anomaly detection and correction
• Networking and IT Infrastructure
• Configuration management (3)
• Databases and Information Storage (3)
• Heterogeneity
• IP telephony
• Managing mobile and wireless computing (3)
• Network and Information Security (3)
• Remote administration
• Scaling problems: large or high-volume (2)
• User management
• Virtualization (5)
So, the next topic is rather obvious:
• I happen to know “a bit” about virtualization:
• Alva Couch, System administration thermodynamics, ;login: magazine, Oct 2008.
Kinds of virtualization
• Whole operating system (Xen, VMware, etc.).
• I/O virtualization: virtualize access to files, devices, etc., but not the operating system.
  – Monica Lam
• Virtualization of configuration management
  – (NSDI: “Shards” system)
Requests?
(Feel free to put me on the spot)
Part IV
Epilogue
The Pat Humphries song upon which I patterned this presentation:
“We're all living in a great big dipper.
We're all washed by the very same rain.
We are swimming in the stream together,
Some in power and some in pain.
We can worship this ground we walk on,
Cherishing the dreams that lie deep inside.
Loving spirits will live forever.
We're all swimming to the other side.”
But the last verse is most relevant
“When we get there we'll discover
All the gifts we've been given to share
Have been with us since life's beginning
And we never noticed they were there.
We can balance at the brink of wisdom
Never recognizing that we've arrived.
Loving spirits will live together.
We're all swimming to the other side.”
Pat Humphries said, about “same rain”:
“This did not just come out of me. This came from a lot of different people and different places, and I just happened to be here at the right time for it to flow through my pen, my tape recorder.”
I would say the same thing about my own research.
The End