Adventures in Computer Security
Formal Methods and
Computer Security
John Mitchell
Stanford University
Invitation
• I'd like to invite you to speak about the
role of formal methods in computer
security.
• This audience is … on the systems end …
• If you're interested, let me know and we
can work out the details.
Outline
What’s a “formal method”?
Java bytecode verification
Protocol analysis
• Model checking
• Protocol logic
Trust management
• Access control policy language
Big Picture
Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer
science/industry depend on our ability to
automate software design, development,
and quality control processes
Formal method
Analyze a system from its description
• Executable code
• Specification (possibly not executable)
Analysis based on correspondence between system description and properties of interest
• Semantics of code
• Semantics of specification language
Example: TCAS
[Leveson, Dill, …]
Specification
• Many pages of logical formulas specifying
how TCAS responds to sensor inputs
Analysis
• If module satisfies specification,
and aircraft proceeds as directed,
then no collisions will occur
Method
• Logical deduction, based on formal rules
Formal methods: good and bad
Strengths
• Formal rules capture years of experience
• Precise, can be automated
Weaknesses
• Some subtleties are hard to formalize
• Methods cumbersome, time consuming
Formal methods sweet spot
[Figure: vertical axis Users * Importance, horizontal axis System complexity * Property complexity. Regions: Worthwhile; Not feasible; Not worth the effort. Example points: Multiplier parity, OS verification.]
Target areas
Hardware
verification
Program verification
• Prove properties of programs
• Requirements capture and analysis
• Type checking and “semantic analysis”
Computer
security
• Mobile code security
• Protocol analysis
• Access control policy languages, analysis
Computer Security
Goal: protect computer systems and digital information
• Access control
• Network security
• OS security
• Web browser/server
• Database/application
• Crypto
• …
Current formal methods use abstract view of cryptography
Mobile code: Java Applet
Local
window
Download
• Seat map
• Airline data
Local
data
• User profile
• Credit card
Transmission
• Select seat
• Encrypted msg
Java Virtual Machine Architecture
[Figure: A.java is compiled by the Java Compiler into A.class (compile source code). Inside the Java Virtual Machine, A.class and B.class pass through the Loader, Verifier, and Linker to the Bytecode Interpreter.]
Java Sandbox
Four
complementary mechanisms
• Class loader
– Separate namespaces for separate class loaders
– Associates protection domain with each class
• Verifier and JVM run-time tests
– NO unchecked casts or other type errors, NO array overflow
– Preserves private, protected visibility levels
• Security Manager
– Called by library functions to decide if request is allowed
– Uses protection domain associated with code, user policy
– Enforcement uses stack inspection
Verifier
Bytecode may not come from standard compiler
• Evil hacker may write dangerous bytecode
Verifier checks correctness of bytecode
• Every instruction must have a valid operation code
• Every branch instruction must branch to the start of
some other instruction, not middle of instruction
• Every method must have a structurally correct
signature
• Every instruction obeys the Java type discipline
Last condition is fairly complicated.
How do we know verifier is correct?
Many attacks based on verifier errors
Formal studies prove correctness
• Abadi and Stata
• Freund and Mitchell
• Nipkow and others …
A type system for object initialization
in the
Java bytecode language
Stephen Freund John Mitchell
Stanford University
(Raymie Stata and Martín Abadi, DEC SRC)
Bytecode/Verifier Specification
Specifications from Sun/JavaSoft:
• 30 page text description [Lindholm,Yellin]
• Reference implementation (~3500 lines of C code)
These are vague and inconsistent
Difficult to reason about:
• safety and security properties
• correctness of implementation
Type system provides formal spec
JVM uses stack machine
Java
class A extends Object {
    int i;
    void f(int val) { i = val + 1; }
}
Bytecode
Method void f(int)
    aload 0 ; object ref this
    iload 1 ; int val
    iconst 1
    iadd ; add val + 1
    putfield #4 <Field int i> ; #4 refers to const pool
    return
JVM Activation Record
• local variables
• operand stack
• data area: return addr, exception info, const pool res.
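To make the stack-machine idea concrete, here is a minimal sketch in Python of how the bytecode for f could be executed against an operand stack. It is an illustration under simplifying assumptions, not the real JVM: the function `run_method`, the tuple encoding of instructions, and the dictionary used as a heap are all hypothetical, and `putfield` takes a field name instead of a constant pool index.

```python
# Toy stack machine for the bytecode of f (illustrative only, not the real JVM).
def run_method(code, local_vars, heap):
    """Execute a list of (opcode, argument) pairs using an operand stack."""
    stack = []
    for op, arg in code:
        if op in ("aload", "iload"):   # push a local variable (reference or int)
            stack.append(local_vars[arg])
        elif op == "iconst":           # push an int constant
            stack.append(arg)
        elif op == "iadd":             # pop two ints, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "putfield":         # pop value, then object ref; store the field
            value, ref = stack.pop(), stack.pop()
            heap[ref][arg] = value
        elif op == "return":
            return
        else:
            raise ValueError(f"unknown opcode {op}")

# Hypothetical heap with one object of class A; local 0 is `this`, local 1 is val.
heap = {"obj0": {"i": 0}}
code_f = [
    ("aload", 0), ("iload", 1), ("iconst", 1),
    ("iadd", None), ("putfield", "i"), ("return", None),
]
run_method(code_f, ["obj0", 41], heap)
print(heap["obj0"]["i"])  # 42
```

The point of the sketch is that every instruction finds its operands on the stack, which is why the verifier has to track the type of each stack location.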
Java Object Initialization
Point p = new Point(3);
p.print();
1: new Point
2: dup
3: iconst 3
4: invokespecial <method Point(int)>
5: invokevirtual <method print()>
No easy pattern to match
Multiple refs to same uninitialized object
JVMLi Instructions
Abstract
instructions:
• new allocate memory for object
• init initialize object
• use use initialized object
Goal
• Prove that no object can be used before
it has been initialized
Typing Rules
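For program P, compute for each $i \in \mathrm{Dom}(P)$:
• $F_i : \mathrm{Var} \to \mathrm{type}$, the type of each variable
• $S_i$ : stack of types, the type of each stack location
Example: static semantics of inc
\[
\frac{P[i] = \mathtt{inc} \qquad F_{i+1} = F_i \qquad S_{i+1} = S_i = \mathrm{Int} \cdot \alpha \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}
\]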
Typing Rules
Each
rule constrains successors of
instruction:
Well-typed
= Accepted by Verifier
Alias Analysis
Other
situations:
1: new P
2: new P
3: init P
or
new P
init P
Equivalence
classes based on line
where object was created.
The new Instruction
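Uninitialized object type placed on stack of types:
\[
\frac{\begin{array}{c} P[i] = \mathtt{new}\ \sigma \\ F_{i+1} = F_i \qquad S_{i+1} = \hat{\sigma}_i \cdot S_i \\ \hat{\sigma}_i \notin S_i \qquad \hat{\sigma}_i \notin \mathrm{Range}(F_i) \\ i+1 \in \mathrm{Dom}(P) \end{array}}{F, S, i \vdash P}
\]
$\hat{\sigma}_i$ : uninitialized object of type $\sigma$ allocated on line i.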
The init Instruction
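Substitution of initialized object type for uninitialized object type:
\[
\frac{\begin{array}{c} P[i] = \mathtt{init}\ \sigma \\ S_i = \hat{\sigma}_j \cdot \alpha, \quad j \in \mathrm{Dom}(P) \\ S_{i+1} = [\sigma / \hat{\sigma}_j]\, \alpha \\ F_{i+1} = [\sigma / \hat{\sigma}_j]\, F_i \\ i+1 \in \mathrm{Dom}(P) \end{array}}{F, S, i \vdash P}
\]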
Soundness
Theorem: A well-typed program will not
generate a run-time error when executed
Invariant:
• During program execution, there is never more than one value of type $\hat{\sigma}_i$ present.
• If this is violated, we could initialize one
object and mistakenly believe that a
different object was also initialized.
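The new, init, and use rules can be tried out on small examples. The following is a rough sketch, assuming a straight-line fragment only (no branches, loops, or subroutines), so the side conditions of the new rule hold trivially and are omitted. The function `check`, the tuple encoding, and the extra `dup`, `store`, and `load` instructions are assumptions made for illustration; this is neither the Sun verifier nor the full type system.

```python
# Sketch of a checker for straight-line JVMLi-style code (illustrative only).
def check(program):
    """Return True if the program is accepted: no object is used before init."""
    stack = []       # S_i: types on the operand stack, top at the end
    variables = {}   # F_i: types of the local variables
    for i, (op, arg) in enumerate(program, start=1):
        if op == "new":
            # Uninitialized type, tagged with the line where it was allocated.
            stack.append(("uninit", arg, i))
        elif op == "init":
            if not stack or stack[-1][:2] != ("uninit", arg):
                return False
            old, new = stack.pop(), ("init", arg)
            # Substitute the initialized type for every alias of this object.
            stack = [new if t == old else t for t in stack]
            variables = {x: (new if t == old else t) for x, t in variables.items()}
        elif op == "use":
            if not stack or stack.pop() != ("init", arg):
                return False
        elif op == "dup":
            if not stack:
                return False
            stack.append(stack[-1])
        elif op == "store":
            if not stack:
                return False
            variables[arg] = stack.pop()
        elif op == "load":
            if arg not in variables:
                return False
            stack.append(variables[arg])
        else:
            return False
    return True

ok = [("new", "P"), ("dup", None), ("init", "P"), ("use", "P")]
early_use = [("new", "P"), ("use", "P")]
# Two objects allocated on different lines: initializing one must not bless the other.
wrong_alias = [("new", "P"), ("store", 1), ("new", "P"),
               ("init", "P"), ("load", 1), ("use", "P")]
print(check(ok), check(early_use), check(wrong_alias))  # True False False
```

Tagging each uninitialized type with its allocation line is what lets the substitution in init update all aliases of one object without touching a different object of the same class.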
Extensions
Constructors
• constructor must call superclass constructor
Primitive
Types and Basic Operations
Subroutines [Stata,Abadi]
• jsr L jump to L and push return address on stack
• ret x jump to address stored in x
• polymorphic over untouched variables
Dom(F_L) restricted to variables used by subroutine
Bug in Sun JDK 1.1.4
1: jsr 10
2: store 1
3: jsr 10
4: store 2
5: load 2
6: init P
7: load 1
8: use P
9: halt
10: store 0
11: new P
12: ret 0
Variables 1 and 2 contain references to two different objects with type $\hat{P}_{11}$.
Verifier allows use of uninitialized object.
Related Work
Java type systems
• Java Language [DE 97], [Syme 97], ...
• JVML [SA 98], [Qian 98], [HT 98], ...
Other approaches
• Concurrent constraint programs [Saraswat 97]
• defensive-JVM [Cohen 97]
• data flow analysis frameworks [Goldberg 97]
• Experimental tests [SMB 97]
TIL / TAL [Harper, Morrisett, et al.]
Protocol Security
Cryptographic
Protocol
• Program distributed over network
• Use cryptography to achieve goal
Attacker
• Read, intercept, replace messages,
remember their contents
Correctness
• Attacker cannot learn protected secret
or cause incorrect protocol completion
Example Protocols
Authentication Protocols
• Clark-Jacob report >35 examples (1997)
• ISO/IEC 9798, Needham-Schroeder, Denning-Sacco, Otway-Rees, Woo-Lam, Kerberos
Handshake and data transfer
• SSL, SSH, SFTP, FTPS, …
Contract signing, funds transfer, …
Many others
Characteristics
Relatively
simple distributed programs
• 5-7 steps, 3-10 fields per message, …
Mission
critical
• Security of data, credit card numbers, …
Subtle
• Attack may combine data from many
sessions
Good target for formal methods
However: crypto is hard to model
Run of protocol
[Figure: principals A, B, C, D and the Attacker; A initiates, B responds.]
Correct if no security violation in any run
Protocol Analysis Methods
Non-formal
approaches
(useful, but no tools…)
• Some crypto-based proofs [Bellare, Rogaway]
• Communicating Turing Machines
[Canetti]
BAN and related logics
• Axiomatic semantics of protocol steps
Methods based on operational semantics
• Intruder model derived from Dolev-Yao
• Protocol gives rise to set of traces
– Denotation of protocol = set of runs involving arbitrary
number of principals plus intruder
Example projects and tools
Prove
protocol correct
• Paulson’s “Inductive method”, others in HOL, PVS,
• MITRE -- Strand spaces
• Process calculus approach: Abadi-Gordon spi-calculus
Search
using symbolic representation of states
• Meadows: NRL Analyzer, Millen: CAPSL
Exhaustive
finite-state analysis
• FDR, based on CSP
[Lowe, Roscoe, Schneider, …]
• Clarke et al. -- search with axiomatic intruder model
Protocol analysis spectrum
[Figure: vertical axis Sophistication of attacks (Low to High), horizontal axis Protocol complexity (Low to High). Methods placed on the plot: Hand proofs, Poly-time calculus, Multiset rewriting with ∃, Spi-calculus, Athena, Paulson, NRL, Bolignano, BAN logic, Protocol logic, Model checking, FDR, Murφ.]
Important Modeling Decisions
How powerful is the adversary?
• Simple replay of previous messages
• Block messages; Decompose, reassemble, resend
• Statistical analysis, traffic analysis
• Timing attacks
How much detail in underlying data types?
• Plaintext, ciphertext and keys
– atomic data or bit sequences
• Encryption and hash functions
– “perfect” cryptography
– algebraic properties: encr(x*y) = encr(x) * encr(y) for RSA, where encrypt(k,msg) = msg^k mod N
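The algebraic property of RSA mentioned above is easy to check numerically. The sketch below uses textbook RSA without padding and a toy modulus chosen only for illustration (N = 61 * 53, exponent 17); these parameters and the name `encr` are assumptions, not values from the talk.

```python
# Textbook RSA with toy parameters (far too small for real use).
N, k = 3233, 17  # N = 61 * 53, k is the public exponent

def encr(msg):
    """encrypt(k, msg) = msg^k mod N, with no padding."""
    return pow(msg, k, N)

x, y = 7, 11
lhs = encr(x * y % N)          # encrypt the product
rhs = encr(x) * encr(y) % N    # multiply the ciphertexts
print(lhs == rhs)  # True
```

A model that treats encryption as a perfect black box cannot express this identity, which is why the level of detail in the data types is a modeling decision.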
Four efforts (w/various collaborators)
Finite-state analysis
• Case studies: find errors, debug specifications
Logic based model - Multiset rewriting
• Identify basic assumptions
• Study optimizations, prove correctness
• Complexity results
Framework with probability and complexity
• More realistic intruder model
• Interaction between protocol and cryptography
• Significant mathematical issues, similar to hybrid systems (Panangaden, Jagadeesan, Alur, Henzinger, de Alfaro, …)
Protocol logic
Rest of talk
Model
checking
• Contract signing
MSR
• Overview, complexity results
PPoly
• Key definitions, concepts
Protocol
logic
• Short overview
Likely to run out of time …
Contract-signing protocols
John Mitchell, Vitaly Shmatikov
Stanford University
Subsequent work by Chadha, Kanovich, Scedrov,
Other analysis by Kremer, Raskin
Example
Immunity
deal
Both
parties want to sign the contract
Neither wants to commit first
General protocol outline
A → B: I am going to sign the contract
B → A: I am going to sign the contract
A → B: Here is my signature
B → A: Here is my signature
Trusted third party can force contract
• Third party can declare contract binding if presented with first two messages.
Assumptions
Cannot
trust communication channel
• Messages may be lost
• Attacker may insert additional messages
Cannot
trust other party in protocol
Third party is generally reliable
• Use only if something goes wrong
• Want TTP accountability
Desirable properties
Fair
• If one can get contract, so can other
Accountability
• If someone cheats, message trace shows
who cheated
Abuse free
• No party can show that they can
determine outcome of the protocol
Asokan-Shoup-Waidner protocol
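[Figure: the three subprotocols among A, B, the trusted third party T, and the network]
Agree
• A → B: m1 = sign(A, c, hash(r_A))
• B → A: m2 = sign(B, m1, hash(r_B))
• A → B: r_A
• B → A: r_B
Abort
• A → T: a1
• T → A: sigT(a1, abort), if not already resolved
Resolve
• A or B → T: m1, m2
• T replies: sigT(m1, m2)
Attack?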
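The agree subprotocol relies on each party first committing to a secret through hash(r_A) or hash(r_B) and only later revealing it. A minimal sketch of that commit-then-reveal step follows; it assumes SHA-256 as the hash (the slide does not name one), omits the signatures entirely, and uses hypothetical helper names `commit` and `opens`.

```python
import hashlib
import secrets

def commit(r: bytes) -> str:
    """Commitment to r; SHA-256 is an assumption, the slide only says hash(r)."""
    return hashlib.sha256(r).hexdigest()

def opens(commitment: str, revealed: bytes) -> bool:
    """Check that the revealed value matches the earlier commitment."""
    return commit(revealed) == commitment

r_A = secrets.token_bytes(16)      # A's secret random value
c_A = commit(r_A)                  # hash(r_A), carried inside m1
print(opens(c_A, r_A))             # True: A later reveals r_A
print(opens(c_A, b"wrong value"))  # False: any other value is rejected
```

Until r_A is revealed, m1 shows only that A is willing to sign, which is what lets T decide between abort and resolve.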
Results
Exhaustive
finite-state analysis
• Two signing parties, third party
• Attacker tries to subvert protocol
Two
attacks
• Replay attack
– Restart A’s conversation to fool B
• Inconsistent signatures
– Both get contracts, but with different ID’s
Repair
• Add data to m3, m4; prevent both attacks
Related protocol
Designed to be “abuse free” [Garay, Jakobsson, MacKenzie]
• B cannot take msg from A and show to C
• Uses special cryptographic primitive
• T converts signatures, does not use own
Finite-state
analysis
• Attack gives A both contract and abort
• T colludes weakly, not shown accountable
• Simple repair using same crypto primitive
Garay, Jakobsson, MacKenzie
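[Figure: the three subprotocols among A, B, the trusted third party T, and the network]
Agree
• A → B: m1 = PCSA(text,B,T)
• B → A: PCSB(text,A,T)
• A → B: sigA(text)
• B → A: sigB(text)
Abort
• T replies: sigT(abort)
Resolve
• PCSA(text,B,T) and PCSB(text,A,T) are presented to T, which resolves the PCS
Attack
• A ends with abort AND sigB(text); sigB(text) is leaked by T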
Modeling Abuse-Freeness
Abuse = Ability to determine the outcome + Ability to prove it
Not a trace property!
Depends on set of traces through a state
Approximation for finite-state analysis
• Nondet. challenge A to resolve or abort
• If ∃ trace s.t. outcome ≠ challenge, then A cannot determine the outcome
Conclusions
Online
contract signing is subtle
• Fairness
• Abuse-freeness
• Accountability
Several
interdependent subprotocols
• Many cases and interleavings
Finite-state
tool great for case analysis!
• Find bugs in protocols proved correct
Multiset Rewriting and Security
Protocol Analysis
John Mitchell
Stanford University
I. Cervesato, N. Durgin, P. Lincoln, A. Scedrov
A notation for inf-state systems
[Figure: multiset rewriting shown in relation to linear logic, proof search (Horn clause), finite automata, and process calculus.]
• Many previous models are buried in tools
• Define common model in tool-independent formalism
Modeling Requirements
Express
properties of protocols
• Initialization
– Principals and their private/shared data
• Nonces
– Generate fresh random data
Model
attacker
• Characterize possible messages by attacker
• Cryptography
Set
of runs of protocol under attack
Notation commonly found in literature
A → B : { A, Nonce_a }Kb
B → A : { Nonce_a, Nonce_b }Ka
A → B : { Nonce_b }Kb
• The notation describes protocol traces
• Does not
– specify initial conditions
– define response to arbitrary messages
– characterize possible behaviors of attacker
Rewriting Notation
Non-deterministic infinite-state systems
Facts (multi-sorted first-order atomic formulas)
F ::= P(t1, …, tn)
t ::= x | c | f(t1, …, tn)
States { F1, ..., Fn }
• Multiset of facts
– Includes network messages, private state
– Intruder will see messages, not private state
Rewrite rules
Transition
• F1, …, Fk → ∃x1 … ∃xm. G1, …, Gn
What this means
• If F1, …, Fk in state σ, then a next state σ’ has
– Facts F1, …, Fk removed
– G1, …, Gn added, with x1 … xm replaced by new symbols
– Other facts in state σ carry over to σ’
• Free variables in rule universally quantified
Note
• Pattern matching in F1, …, Fk can invert functions
• Linear Logic: F1 ⊗ … ⊗ Fk ⊸ ∃x1 … ∃xm (G1 ⊗ … ⊗ Gn)
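One step of multiset rewriting can be sketched in a few lines of Python. This is a simplified illustration, not the formalism's reference implementation: facts are flat tuples such as ("A1", "n0"), variables are strings starting with "?", nested function terms are not supported, and the names `match` and `apply_rule` are hypothetical.

```python
from collections import Counter
from itertools import count

_fresh = count()  # source of new symbols for existentially quantified variables

def match(pattern, fact, subst):
    """Extend subst so that pattern equals fact; return None if impossible."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    subst = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            if subst.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return subst

def apply_rule(state, lhs, exists, rhs):
    """Yield each next state for the rule lhs -> exists. rhs."""
    def search(remaining, pool, subst):
        if not remaining:
            yield pool, subst
            return
        for fact in list(pool):
            s = match(remaining[0], fact, subst)
            if s is not None:
                # Remove one copy of the matched fact from the multiset.
                yield from search(remaining[1:], pool - Counter({fact: 1}), s)
    for pool, subst in search(lhs, state, {}):
        subst = dict(subst)
        for v in exists:
            subst[v] = f"n{next(_fresh)}"   # replace by a new symbol
        new_state = pool.copy()             # other facts carry over
        for g in rhs:
            new_state[tuple(subst.get(t, t) for t in g)] += 1
        yield new_state

# Two example rules:  -> exists x. A1(x)   and   A1(x) -> N1(x), A2(x)
state = Counter()
state = next(apply_rule(state, [], ["?x"], [("A1", "?x")]))
state = next(apply_rule(state, [("A1", "?x")], [], [("N1", "?x"), ("A2", "?x")]))
print(sorted(state))
```

Because `apply_rule` is a generator, it enumerates every way the left-hand side can match, which reflects the nondeterminism of the rewriting system.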
Finite-State Example
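[Figure: automaton with states q0, q1, q2, q3 and transitions labeled a and b.]
• Predicates: State, Input
• Function: · (prefixes a symbol to the rest of the input)
• Constants: q0, q1, q2, q3, a, b, nil
• Transitions: State(q0), Input(a · x) → State(q1), Input(x)
State(q0), Input(b · x) → State(q2), Input(x)
...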
Set of rewrite transition sequences = set of runs of automaton
Simplified Needham-Schroeder
Protocol
A → B: {na, A}Kb
B → A: {na, nb}Ka
A → B: {nb}Kb
Predicates
Ai, Bi, Ni -- Alice, Bob, Network in state i
Transitions
→ ∃x. A1(x)
A1(x) → N1(x), A2(x)
N1(x) → ∃y. B1(x,y)
B1(x,y) → N2(x,y), B2(x,y)
A2(x), N2(x,y) → A3(x,y)
A3(x,y) → N3(y), A4(x,y)
B2(x,y), N3(y) → B3(x,y)
picture next slide
Authentication
A4(x,y) ∧ B3(x,y’) ⊃ y = y’
Sample Trace
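Each line gives the rule applied, then the resulting state.
→ ∃x. A1(x) ; state: A1(na)
A1(x) → A2(x), N1(x) ; state: A2(na), N1(na)
N1(x) → ∃y. B1(x,y) ; state: A2(na), B1(na, nb)
B1(x,y) → N2(x,y), B2(x,y) ; state: A2(na), N2(na, nb), B2(na, nb)
A2(x), N2(x,y) → A3(x,y) ; state: A3(na, nb), B2(na, nb)
A3(x,y) → N3(y), A4(x,y) ; state: A4(na, nb), N3(nb), B2(na, nb)
B2(x,y), N3(y) → B3(x,y) ; state: A4(na, nb), B3(na, nb)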
Common Intruder Model
Derived from Dolev-Yao model
• Adversary is nondeterministic process
• Adversary can
– Block network traffic
– Read any message, decompose into parts
– Decrypt if key is known to adversary
– Insert new message from data it has observed
• Adversary cannot
– Gain partial knowledge
– Guess part of a key
– Perform statistical tests, …
Formalize Intruder Model
Intercept, decompose and remember messages
N1(x) → M(x)
N3(x) → M(x)
N2(x,y) → M(x), M(y)
Decrypt if key is known
M(enc(k,x)), M(k) → M(x)
Compose and send messages from “known” data
M(x) → N1(x), M(x)
M(x), M(y) → N2(x,y), M(x), M(y)
M(x) → N3(x), M(x)
Generate new data as needed
→ ∃x. M(x)
Highly nondeterministic, same for any protocol
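The intruder's memory M can be viewed as a set of terms closed under decomposition and decryption. The sketch below illustrates that closure and a check of what the intruder can compose. The term encoding (strings for atoms, ("pair", a, b) and ("enc", k, m) tuples) and the function names `analyze` and `can_send` are assumptions; decryption uses the same key as encryption, following the rule M(enc(k,x)), M(k) → M(x).

```python
def analyze(observed):
    """Close a set of terms under pair decomposition and decryption with known keys."""
    known = set(observed)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            parts = []
            if isinstance(term, tuple) and term[0] == "pair":
                parts = [term[1], term[2]]
            elif isinstance(term, tuple) and term[0] == "enc" and term[1] in known:
                parts = [term[2]]          # key is known, so the body is learned
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known

def can_send(term, known):
    """Can the intruder build this term from data it already knows?"""
    if term in known:
        return True
    if isinstance(term, tuple) and term[0] in ("pair", "enc"):
        return can_send(term[1], known) and can_send(term[2], known)
    return False

observed = {
    "k1",
    ("enc", "k1", ("pair", "na", "k2")),
    ("enc", "k2", "secret"),
    ("enc", "k3", "hidden"),
}
known = analyze(observed)
print("secret" in known, "hidden" in known)              # True False
print(can_send(("enc", "k2", ("pair", "na", "secret")), known))  # True
```

The closure never guesses keys or learns partial information, matching the "adversary cannot" list of the common intruder model.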
Attack on Simplified Protocol
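Each line gives the rule applied, then the resulting state.
→ ∃x. A1(x) ; state: A1(na)
A1(x) → A2(x), N1(x) ; state: A2(na), N1(na)
N1(x) → M(x) ; state: A2(na), M(na)
→ ∃x. M(x) ; state: A2(na), M(na), M(na’)
M(x) → N1(x), M(x) ; state: A2(na), M(na), M(na’), N1(na’)
N1(x) → ∃y. B1(x,y) ; state: A2(na), M(na), M(na’), B1(na’, nb)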
Continue “man-in-the-middle” to violate specification
Protocols vs Rewrite rules
Can
axiomatize any computational system
But -- protocols are not arbitrary programs
Choose principals
Select roles
Client
Client
TGS
Server
Thesis: MSR Model is accurate
Captures “Dolev-Yao-Needham-Millen-Meadows- …”
model
• MSR defines set of traces of protocol and attacker
• Connections with approach in other formalisms
Useful
for protocol analysis
• Errors shown by model are errors in protocol
• If no error appears, then no attack can be carried
out using only the actions allowed by the model
Complexity results using MSR
[Table: complexity for a bounded # of roles, comparing the intruder with and without ∃ under bounded and unbounded use of ∃. Entries as extracted: ??, NP-complete, DExp-time, Undecidable.]
All: Finite number of different roles, each role of finite length, bounded message size
Key insight: existential quantification (∃) captures cryptographic nonce; main source of complexity
[Durgin, Lincoln, Mitchell, Scedrov]
Additional decidable cases
Bounded role instances, unbounded msg size
• Huima 99: decidable
• Amadio, Lugiez: NP w/ atomic keys
• Rusinowitch, Turuani: NP-complete, composite keys
• Other studies, e.g., Kusters: unbounded # data fields
Constraint systems
• Cortier, Comon: Limited equality test
• Millen, Shmatikov: Finite-length runs
All: bound number of role instances
Probabilistic Polynomial-Time
Process Calculus
for
Security Protocol Analysis
J. Mitchell, A. Ramanathan, A. Scedrov, V. Teague
P. Lincoln, M. Mitchell
Limitations of Standard Model
Can
find some attacks
• Successful analysis of industrial protocols
Other
attacks are outside model
• Interaction between protocol and encryption
Some
protocols cannot be modeled
• Probabilistic protocols
• Steps that require specific property of
encryption
Possible
to “OK” an erroneous protocol
Non-formal state of the art
Turing-machine-based analysis
• Canetti
• Bellare, Rogaway
• Bellare, Canetti, Krawczyk
• others …
Prove correctness of protocol transformations
• Example: secure channel -> insecure channel
Language Approach
[Abadi, Gordon]
Write protocol in process calculus
Express security using observational equivalence
• Standard relation from programming language theory
P ≈ Q iff for all contexts C[ ], same observations about C[P] and C[Q]
• Context (environment) represents adversary
Use proof rules for ≈ to prove security
• Protocol is secure if no adversary can distinguish it from some idealized version of the protocol
Probabilistic Poly-time Analysis
[Lincoln, Mitchell, Mitchell, Scedrov]
Adopt
spi-calculus approach, add probability
Probabilistic polynomial-time process calculus
• Protocols use probabilistic primitives
– Key generation, nonce, probabilistic encryption, ...
• Adversary may be probabilistic
• Modal type system guarantees complexity bounds
Express
protocol and specification in calculus
Study security using observational equivalence
• Use probabilistic form of process equivalence
Needham-Schroeder Private Key
Analyze part of the protocol P
A → B : { i } K
B → A : { f(i) } K
“Obviously” secret protocol Q (zero knowledge)
A → B : { random_number } K
B → A : { random_number } K
Analysis: P ≈ Q reduces to crypto condition related to non-malleability [Dolev, Dwork, Naor]
– Fails for RSA encryption, f(i) = 2i
Technical Challenges
Language
for prob. poly-time functions
• Extend Hofmann language with rand
Replace
nondeterminism with probability
• Otherwise adversary is too strong ...
Define
probabilistic equivalence
• Related to poly-time statistical tests ...
Develop
specification by equivalence
• Several examples carried out
Proof
systems for probabilistic equivalence
• Work in progress
Basic example
Sequence generated from random seed
Pn: let b = nk-bit sequence generated from n random bits
in PUBLIC b end
Truly random sequence
Qn: let b = sequence of nk random bits
in PUBLIC b end
P is crypto strong pseudo-random generator
P ≈ Q
Equivalence is asymptotic in security parameter n
Compositionality
Property of observational equiv
If A ≈ B and C ≈ D, then A|C ≈ B|D
similarly for other process forms
Current State of Project
New
framework for protocol analysis
• Determine crypto requirements of protocols !
• Precise definition of crypto primitives
Probabilistic
ptime language
Pi-calculus-like process framework
• replaced nondeterminism with rand
• equivalence based on ptime statistical tests
Proof
methods for establishing equivalence
Future: tool development
Protocol logic
Protocol
Honest Principals,
Attacker
Private
Data
Alice’s information
• Protocol
• Private data
• Sends and receives
Intuition
Reason about local information
• I chose a new number
• I sent it out encrypted
• I received it decrypted
• Therefore: someone decrypted it
Incorporate knowledge about protocol
• Protocol: Server only sends m if it got m’
• If server not corrupt and I receive m signed by server, then server received m’
Bidding conventions
Blackwood response to 4NT (motivation)
– 5♣ : 0 or 4 aces
– 5♦ : 1 ace
– 5♥ : 2 aces
– 5♠ : 3 aces
Reasoning
• If my partner is following Blackwood, then if she bid 5♥, she must have 2 aces
Logical assertions
Modal operator
• [ actions ]P φ - after actions, P reasons φ
Predicates in φ
• Sent(X,m) - principal X sent message m
• Created(X,m) - X assembled m from parts
• Decrypts(X,m) - X has m and key to decrypt m
• Knows(X,m) - X created m or received msg containing m and has keys to extract m from msg
• Source(m, X, S) - Y ≠ X can only learn m from set S
• Honest(X) - X follows rules of protocol
Correctness of NSL
Bob knows he’s talking to Alice
[ recv encrypt( Key(B), A,m ); (msg1)
new n;
send encrypt( Key(A), m, B, n );
recv encrypt( Key(B), n ) (msg3)
]B
Honest(A) ⊃ Csent(A, msg1) ∧ Csent(A, msg3)
where Csent(A, …) ≡ Created(A, …) ∧ Sent(A, …)
Honesty rule
(rule scheme)
∀ roles R of Q. ∀ initial segments A ⊆ R.
Q |- [ A ]X φ
------------------
Q |- Honest(X) ⊃ φ
• This is a finitary rule:
– Typical protocol has 2-3 roles
– Typical role has 1-3 receives
– Only need to consider A waiting to receive
Conclusions
Security
Protocols
• Subtle, Mission critical, Prone to error
Analysis
methods
• Model checking
– Practically useful; brute force is a good thing
– Limitation: find errors in small configurations
• Proof methods
– Time-consuming to use general logics
– Special-purpose logics can be sound, useful
Room for another 5+ years of work
Access Control / Trust Mgmt
Root CA
Conference Registration
Stanford is accred university
Regular $1000
Academic $500
Student $100
Stanford
Mitchell is regular faculty
Faculty can ident students
Mitchell
Chander is my student
Chander
Registration message
Certification
Root signs: Stanford is accred university
Stanford signs: Mitchell is regular faculty
Faculty can ident students
Mitchell signs: Chander is my student
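The certification chain can be read as a small set of rules evaluated over signed statements. The sketch below is a toy illustration of that evaluation, not an actual access control policy language: signatures are assumed to have been verified already, the tuple encoding and function names are hypothetical, and treating faculty as eligible for the Academic rate is an assumption.

```python
# Each credential is (issuer, role, subject); signature checking is assumed done.
credentials = [
    ("Root", "accredited", "Stanford"),
    ("Stanford", "faculty", "Mitchell"),
    ("Mitchell", "student", "Chander"),
]

def accredited(university):
    """Root signs: <university> is an accredited university."""
    return ("Root", "accredited", university) in credentials

def is_faculty(person):
    """An accredited university signs: <person> is regular faculty."""
    return any(role == "faculty" and subject == person and accredited(issuer)
               for issuer, role, subject in credentials)

def is_student(person):
    """Faculty can identify students: a faculty member signs the statement."""
    return any(role == "student" and subject == person and is_faculty(issuer)
               for issuer, role, subject in credentials)

def registration_fee(person):
    if is_student(person):
        return 100    # Student
    if is_faculty(person):
        return 500    # Academic (assumed to cover faculty)
    return 1000       # Regular

print(registration_fee("Chander"), registration_fee("Mitchell"),
      registration_fee("Eve"))  # 100 500 1000
```

Each answer is justified by a chain back to the Root CA, so removing any one credential in the chain changes the outcome.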
Formal methods
[Figure: vertical axis Users * Importance, horizontal axis System complexity * Property complexity. Regions: Worthwhile; Not feasible; Not worth the effort.]
System-Property Tradeoff
[Figure: vertical axis Sophistication of attacks (Low to High), horizontal axis Protocol complexity (Low to High). Methods placed on the plot: Hand proofs, Poly-time calculus, Multiset rewriting with ∃, Spi-calculus, Athena, Paulson, NRL, Bolignano, BAN logic, Protocol logic, Model checking, FDR, Murφ.]
The Wedge of Formal Verification
[Figure: vertical axis Value to Design, horizontal axis Effort Invested. Labels on the plot: Verify, Abstract, Invisible FM, Refute.]
Big Picture
Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer
science/industry depend on our ability to
automate software design, development,
and quality control processes