UNIVERSITY of WISCONSIN-MADISON
Computer Sciences Department
CS 739
Distributed Systems
Andrea C. Arpaci-Dusseau
LOCKSS:
Lots of Copies Keeps Stuff Safe
Preserving Peer Replicas By Rate-Limited Sampled Voting,
Maniatis, Roussopoulos, Giuli, Rosenthal, Baker, Muliadi
(Stanford) -- SOSP’03
Motivation
Librarians: Responsibility to preserve important materials
Traditional approach:
• Acquire lots of copies
• Distribute around world
• Lend or copy to provide access
Academic publishing is moving to Web
• LOCKSS: Real system used by many libraries (1999)
• How to apply techniques to digital preservation?
Strength: Real problem that people care about, real solution
being used
Design Goals and Assumptions
Must be cheap to build and maintain
• No RAID systems
Need not operate quickly
• Want to prevent change, not expedite it
Must function properly for decades
No centralized control
Handle failures
• Handle malicious attackers
• Handle catastrophic random failures
How is this different from other P2P systems?
Design Principles
Cheap storage is unreliable
No long-term secrets
• Can’t hold private keys for arbitrary time periods
Use inertia
• Rate limit the amount of activity and change
Avoid third-party reputation
• Malicious users can lie about good users
• Attackers can “cash in” history of good behavior
Reduce predictability
• Make difficult for attackers to predict behavior of victims
Make intrusion detection intrinsic
• Part of the system itself
Assume strong adversary
• May want to change, suppress, or steal content
LOCKSS Overview
Libraries run persistent web caches
• Collect by crawling journal web-sites
• Distribute by acting as limited proxy cache
• Preserve by cooperating with others to detect and
repair damage
Peers vote on large archival units (AU’s)
• AU == year’s run of a journal
• Each peer holds different AU’s
• If AU damaged, call increasingly specific partial polls
Opinion Poll Protocol
Terminology:
• Loyal, malign, healthy, damaged peers
Goal:
• High probability loyal peers are healthy
(despite attacks by malign peers and failures)
• Low probability even powerful adversary can damage significant
proportion of loyal peers without detection
Overview
• Poll initiator calls opinion polls on AU at a rate >> rate of random damage
• Invites small subset of known peers (poll participants or voters)
• Voter computes and returns digest of AU
• Vote results for poll initiator:
– Landslide win: Votes overwhelmingly agree with own version
– Landslide loss: Repair AU by fetching copy of AU from peer
– Inconclusive poll: Raise alarm for human attention
• Who can benefit from the poll? What if voter disagrees?
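The digest comparison at the heart of a poll can be sketched as follows. This is a minimal illustration, not the LOCKSS implementation: the choice of SHA-256, the `nonce` parameter, and the function names are all assumptions made for the example.

```python
import hashlib

def au_digest(au_content: bytes, nonce: bytes = b"") -> str:
    """Digest of an AU's content. The optional per-poll nonce is an assumption
    of this sketch; it keeps a voter from replaying an old digest."""
    h = hashlib.sha256()  # hash choice is illustrative only
    h.update(nonce)
    h.update(au_content)
    return h.hexdigest()

def vote_agrees(initiator_copy: bytes, voter_digest: str, nonce: bytes = b"") -> bool:
    """A vote agrees when the voter's digest matches the initiator's own copy."""
    return au_digest(initiator_copy, nonce) == voter_digest
```

A voter holding an identical copy produces an agreeing vote; a single changed byte produces a disagreeing one.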
Peer Lists per AU
Lists for every AU
• Friends list: Peers have outside relationships with friends
• Reference list: Peers encountered recently
– Bootstrap: Init with friends list
– Inner circle: Those invited to influence poll results
– Outer circle: Nominated by inner circle
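One way to picture the per-AU lists is the small data structure below. It is a sketch under stated assumptions: the class and field names are hypothetical, peers are represented as plain strings, and the inner circle is drawn as a uniform random sample of the reference list.

```python
import random
from dataclasses import dataclass, field

@dataclass
class PeerLists:
    """Peer lists one peer keeps for a single AU (names are illustrative)."""
    friends: set[str]
    reference: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Bootstrap: an empty reference list starts as a copy of the friends list.
        if not self.reference:
            self.reference = set(self.friends)

    def sample_inner_circle(self, n: int, rng: random.Random) -> list[str]:
        # The inner circle is a random sample of the reference list.
        pool = sorted(self.reference)
        return rng.sample(pool, min(n, len(pool)))
```

Passing in the random generator keeps the sketch testable; a real peer would want an unpredictable source so that attackers cannot guess who will be invited.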
Poll Initiation
Poll initiation: (about every 3 months per AU)
• Choose N random peers from ref list: Inner circle
• Send Poll [Poll ID, Diffie-Hellman Public Key]
• Wait for responses...
Voter from inner circle: Decide if want to participate
• Why might a peer not participate?
• Pick new DH public key, compute symmetric session key
• How does Diffie-Hellman work?
– A chooses secret a, sends g^a mod p
– B chooses secret b, sends g^b mod p
– Each computes secret (g^b mod p)^a mod p = (g^a mod p)^b mod p
• Why encrypt messages??
• Send back encrypted YES or NO to participate
– Send PollChallenge [Poll ID, DH public key, {challenge, YES}]
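The Diffie-Hellman exchange above can be tried out with a few lines of Python. This is a toy sketch: the modulus `P` (the Mersenne prime 2^127 - 1) and base `G` are far too small for real use, and deriving the session key by hashing the shared secret with SHA-256 is an assumption of the example, not something the slides specify.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus, for illustration only
G = 3           # toy base

def dh_keypair(p: int, g: int) -> tuple[int, int]:
    """Return (secret exponent, public value g^secret mod p)."""
    secret = secrets.randbelow(p - 3) + 2  # uniform in [2, p - 2]
    return secret, pow(g, secret, p)

def dh_shared(their_public: int, my_secret: int, p: int) -> int:
    """Compute (g^b mod p)^a mod p, which equals (g^a mod p)^b mod p."""
    return pow(their_public, my_secret, p)

def session_key(shared: int, p: int) -> bytes:
    """Hash the shared secret into a 32-byte symmetric key (assumed derivation)."""
    width = (p.bit_length() + 7) // 8
    return hashlib.sha256(shared.to_bytes(width, "big")).digest()
```

The initiator and the voter each call `dh_keypair`, exchange only the public values, and arrive at the same `session_key` without ever sending a secret.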
Poll Effort
Initiator: Produce computational effort for voter
• Why proof of computation by initiator needed?
• Use memory-bound functions (MBF) with poll id and challenge as
input
– Why are MBF good?
• Send back PollProof [Poll Id, poll effort proof]
– Even send this to voters who responded NO. Why?
Voter: Verifies result
• Less computation needed to verify result than compute
• Nominate outer circle peers (more later)
– Randomly selected from reference list
• Send Vote messages for AU
– Also send proof of computational effort in rounds
– Why proof of computation by voter needed? Why in rounds?
Vote Tabulation
Initiator: Tabulates valid votes from inner circle
Three cases:
• Landslide loss: Agreeing votes <= D
– Repair AU
• Landslide win: Agreeing votes > V-D
– Opinion poll concludes successfully; reschedule poll
• Inconclusive: Raise alarm
Repair
• Initiator picks disagreeing voter and requests repair
• When is voter willing to supply content?
• Retabulate results with new content
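The three tabulation cases can be written down directly from the thresholds on this slide, with V the number of valid votes and D the disagreement threshold. The function and enum names are hypothetical, and the sketch assumes V > 2D so that the two landslide ranges cannot overlap.

```python
from enum import Enum

class Outcome(Enum):
    LANDSLIDE_WIN = "landslide win"
    LANDSLIDE_LOSS = "landslide loss"
    INCONCLUSIVE = "inconclusive"

def tabulate(agreeing: int, valid_votes: int, d: int) -> Outcome:
    """Classify a poll from the count of agreeing votes among V valid votes."""
    if not 0 <= agreeing <= valid_votes:
        raise ValueError("agreeing votes must lie between 0 and V")
    if valid_votes <= 2 * d:
        raise ValueError("this sketch assumes V > 2D")
    if agreeing <= d:
        return Outcome.LANDSLIDE_LOSS   # initiator's copy is presumed damaged: repair
    if agreeing > valid_votes - d:
        return Outcome.LANDSLIDE_WIN    # poll concludes; reschedule
    return Outcome.INCONCLUSIVE         # raise alarm for human attention
```

After a repair, the same function would be called again with the recounted agreeing votes.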
Outer Circle
What is the purpose of the outer circle?
Initiator: Picks same number from every nominator
• Repeat same steps of protocol with outer circle
– Why?
• Differences?
Update reference list
• What is a malign peer trying to do?
• Who is removed?
• Insert: Valid/agreeing outer circle peers and random friends
– Why?
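The rule that the initiator picks the same number of peers from every nominator might look like the sketch below. The function name, the dictionary layout of the nominations, and the `per_nominator` parameter are assumptions for illustration.

```python
import random

def pick_outer_circle(nominations: dict[str, list[str]],
                      per_nominator: int,
                      rng: random.Random) -> set[str]:
    """Take at most `per_nominator` random nominees from each nominator."""
    chosen: set[str] = set()
    for nominator in sorted(nominations):
        candidates = sorted(set(nominations[nominator]))
        k = min(per_nominator, len(candidates))
        chosen.update(rng.sample(candidates, k))
    return chosen
```

Capping each nominator's contribution presumably keeps a single inner-circle voter from flooding the outer circle with its own nominees, however long a list it submits.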
Adversary Attacks
Assume powerful adversary
• Total information awareness
• Perfect work balancing
• Perfect digital preservation
• Local eavesdropping
• Local spoofing
• Stealth
• Unconstrained identities
• Exploitation of common peer vulnerabilities
• Complete parameter knowledge
Adversary Attacks
Stealth modification
• Convince loyal peer has damaged AU
• Replace protected content with bad version
• Focus of paper
Nuisance
• Raise alarms
Attrition
• Make loyal peers waste computational resources so can’t repair
damage
Theft
• Acquire published content from peers without fee
• How does LOCKSS prevent?
Free-loading
• Obtain services without supplying to others
Stealth Modification Attack
1) Lurk phase
• Increase foothold: malign peers in reference list (inner circle)
– Wait until invited into circle
– Act loyal
– Nominate more malign peers
2) Attack phase
• When see poll is vulnerable (i.e., overwhelming majority of inner circle is malign), vote bad
Why is attacking successfully hard?
• Rate limiting: Must wait for vulnerable polls to occur
• Damaged loyal peers call and vote in polls using bad copy
– Can be repaired or raise alarms
(doesn’t act differently when don’t have majority)
• Must expend effort calling polls too
– Loyal peer only requests repair if voted in malign peer’s polls
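To get a feel for how rarely a vulnerable poll occurs, the chance that a randomly sampled inner circle is overwhelmingly malign can be estimated with a binomial tail. This is a rough sketch under several assumptions: every invited peer votes, inner-circle members are drawn independently with a fixed malign fraction of the reference list, and "vulnerable" means at most D voters are loyal, where D is the landslide threshold (the slides do not give its value, so it is left as a parameter).

```python
from math import comb

def p_vulnerable(n: int, d: int, malign_fraction: float) -> float:
    """P(at most d of n inner-circle voters are loyal), binomial approximation."""
    f = malign_fraction
    # k counts loyal voters; each voter is loyal with probability 1 - f.
    return sum(comb(n, k) * (1 - f) ** k * f ** (n - k) for k in range(d + 1))
```

With an inner circle of 20, the probability stays tiny until the malign fraction of the reference list is large, which is why the adversary has to lurk and build a foothold before attacking.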
Simulations
Environment
• 1000 peers
• Clusters of 30 peers; 80% for friends, 20% random
• Call polls every 3 months on average
• N (size of inner circle): 20, Q: 10
How many false alarms with no adversaries?
• 20 years, random damage at every peer: 5-10 years
Simulation: Lurking Time
How long must lurk for desired foothold ratio?
• 10% malign; how many years for 40% ratio? 50%?
• 30% malign; how many years for 50% ratio? 70%?
Simulation: Alarm Time
How long before attack is detected
(i.e., inconclusive poll alarm raised)?
Simulation: Damage to AU
How many bad replicas? How many years?
When is irrecoverable damage caused?
Simulation: Worst-case
How long should adversary lurk before attack?
Simulation: Benefit of Churn
What churn rates are best?
Conclusions
Interesting motivation
• Real problem and deployed solution
Opinion Poll Protocol has many attractive properties
Uses problem domain to guide protocol
• Inertia: Adversaries can’t influence poll timing
• Friend list: Use outside relationships to influence trust
Attacking is very costly
• Must lurk long period to increase foothold in inner circle
• Must continually pay through proofs of computation (MBF)
• Immediately removed from lists if disagree
Easy to set off alarms
• If voting results are inconclusive, human notified