Concurrency Control

Download Report

Transcript Concurrency Control

CS 540
Database Management Systems
Concurrency Control
1
Concurrent access to data
Flight (fltNo, fltDate, seatNo, seatStatus)
Database: seats 22A and 22B are available.
1.
2.
3.
4.
John checks for availability and gets seat 22A
John books seat 22A
Mary checks for availability and gets seat 22B
Mary books seat 22B
Database: seats 22A and 22B are reserved.
2
Concurrent access to data: problems
Database: seats 22A and 22B are available.
1.
2.
3.
4.
John checks for availability and gets seat 22A
Mary checks for availability and gets seat 22A
John books seat 22A
Mary books seat 22A
Database: double booking on 22A!!!
• Database is not consistent.
• Solution: Run John’s before Mary’s (or vice versa)
– serial schedule
3
Serial schedules
Database: seats 22A and 22B are available.
1. Admin scans Flight to generate some reports.
2. Mary checks the flights for tomorrow.
It takes a long time!
Database: seats 22A and 22B are available.
• Nobody can use the database until Admin is done!!
How to run programs concurrently without them
messing up each other’s results?
4
Transaction
• a “program” of atomic database operations.
• atomic unit of database transformation.
START TRANSACTION
SQL-statement 1;
SQL-statement 2;
…
SQL-statement n;
COMMIT;
– rollback/ abort: undoes all transaction’s operations.
5
The ACID properties
•
•
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
• I solation:
Xacts.
•
D urability:
Execution of one Xact is isolated from that of other
Today’s paper
If a Xact commits, its effects persist.
Transaction Interleaving
• Before: A = 0, B = 0, then?
• Correct schedule? Wrong schedule? Why?
Xact T1
read(A)
A=A+1
write(A)
Xact T2
Xact T1
read(A)
A=A+1
write(A)
read(A)
A=A*2
write(A)
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
read(B)
B=B+1
write(B)
read (B)
B=B*2
write (B)
7
schedule 1
Xact T2
read (B)
B=B+1
write (B)
schedule 2
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
schedule 3
Transaction Interleaving
•
Schedule 1 (interleaved, more concurrent) = Schedule 3 (serial)
– (A=0,B=0) --- T1 --> (A=1,B=1) -- T2 --> (A=2,B=2)
– Consistency and isolation: transform DB in serial
•
•
Serializable schedule: a concurrent schedule that impacts DB like a serial schedule.
How to check if a concurrent schedule is serializable?
Xact T1
read(A)
A=A+1
write(A)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B+1
write(B)
read (B)
B=B*2
write (B)
8
schedule 1
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
Xact T2
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
schedule 3
Use conflicting operations
• We denote operations of transaction i as writei or readi.
• Conflicting operations:
– operations from different transactions such that:
• two writes of the same data item: writei(A); witej(A).
• a read and a write on the same data item: readi(A); writej(A).
• Change the relative order of conflicting operators =>
change the final state of DB.
– A serializable schedule has the same order of conflicting operators
as a serial schedule.
9
Serialization graph
• Define dependencies between transactions with conflicting operations
• Serialization (precedence) graph: Ti --> Tj for a dependency from Ti to Tj
• Serializable if serialization graph does not have any cycle
Xact T1
read(A)
A=A+1
write(A)
Xact T2
T1
Xact T1
read(A)
A=A+1
write(A)
read(B)
B=B+1
write(B)
10
schedule 1
T1
read(A)
A=A*2
write(A)
read(B)
B=B*2
write(B)
read(A)
A=A*2
write(A)
read (B)
B=B*2
write (B)
Xact T2
T2
read (B)
B=B+1
write (B)
schedule 2
T2
Guaranteeing isolation
• Scheduler guarantees serializability
– restricts the access of transactions on data items.
– enforces some order on conflicting operations.
• Two approaches:
– Pessimistic: There are many conflicting transactions.
– Optimistic: There are a few conflicting transactions.
Locking Protocol
• A “protocol” for accessing data
– well-formed transactions lock/unlock “access units”
before/after using them
– lock manager grants/manages locks
• Goals of locking protocol
– ensure serializability
– preserve high concurrency
• Parameters of a locking protocol?
12
Locking Protocol: Parameters
• What “modes” of locks to provide?
Compatibility?
– e.g., S for shared, X for exclusive
• How to “well-behave” to obtain and hold
locks?
– in what sequence?
– how long to hold?
• What “units” to lock?
– database? table? tuple? what else?
13
Lock modes and compatibility
• Shared lock = read lock= S
– multiple transactions hold a shared lock over a data item.
• Exclusive lock = write lock = X
– at most one transaction holds an exclusive lock over a data item.
• Lock manager gives locks based on compatibility matrix:
X
S
X
N
S
N
N
Y
Motivation: a “simple” protocol
• Lock modes:
– S for shared and X for exclusive access
– compatibility: (S, S) = T, otherwise F
• Behavior:
– lock (the maximum mode) before access
– release lock immediately after
• Unit:
– a relation
15
Simple Protocol: what’s wrong?
Xact T1
Xact T2
X.lock(A’s relation)
read(A)
A=A+1
write(A)
X.release-lock(A’s relation)
X.lock(A’s relation)
read (A)
A=A*2
write (A)
X.release-lock(A’s relation)
X.lock(B’s relation)
read(B)
B=B*2
write(B)
X. release-lock(B’s relation)
X.lock(B’s relation)
read(B)
B=B+1
write(B)
X.release-lock(B’s relation)
16
Solution: 2 Phase Locking (2PL)
• Each transaction has two phases:
– Getting locks (growing)
• acquire lock of the required mode (S or X)
• can only lock data items during this phase.
– may also upgrade the locks (from S-lock to X-lock).
• read/ write the locked data items.
• no release-lock in this phase
– Releasing locks (shrinking)
• can only release locks on the data items.
– may also downgrade the locks (from X-lock to S-lock).
• the phase starts with the first release-lock.
• no locking after the first release-lock.
• Rule: Transactions do not get any new lock after giving up one.
Simple protocol => 2PL protocol
• 2PL --> serializable
Xact T1
X.lock(A)
…
Xact T2
X.release-lock(A)
X.lock(A)
…
X.release-lock(A)
X.lock(B)
…
X.release-lock(B)
X.lock(B)
…
X.release-lock(B)
18
Xact T1
S.lock(A)
…
S.lock(B)
…
X.release-lock(A)
Xact T2
X.lock(A)
…
…
X.release-lock(B)
X.lock(B)
…
X. release-lock(A)
X. release-lock(B)
Why 2PL schedule is serializable?
• Locks of conflicting operations are not compatible.
• 2PL does not allow the swap of conflicting operations.
– serial order between conflicting operations
– all conflicting operations of T1 before T2.
• It is possible to swap non-conflicting operations.
– provides concurrency
• Some serializable schedules are not allowed in 2PL!
– 2PL schedules are a subset of all serializable schedules
– 2PL does not provide the ideal degree of concurrency.
Locking Protocol: Granularity?
• Unit of locking: How to increase concurrency?
– coarse units? fine units?
• Granularity:
– concurrency vs. overhead
– hierarchical lockable units:
• Database, relations/ files, pages, tuples, attributes
• Correctness problem:
– T1 S.locks a tuple, T2 X.locks the file?
20
Granularity Locking Example
T1: I
DB
T2: Grant S?
T1: I
Relation Student
Relation Enrollment
T1: X
Tuple
Tuple
Tuple
Tuple
• Transaction T1 writes on a tuple in Student => X lock on tuple
• Transaction T2 wants to scan the whole DB => S lock on DB, allowed??
• T1 has to implicitly notify other transactions
21
– puts “intentions locks” on the tuple’s parents in the lockable units hierarchy
Granularity Locking Example, contd.
T1: IS
DB
T2: Grant S?
T1: IS
Relation Student
Relation Enrollment
T1: S
Tuple
Tuple
Tuple
Tuple
• T1 reads some tuples from Student => S on tuple, I on its parents
• T2 wants to scan the whole DB => S lock on DB, allowed??
• We need different types of intention locks to increase concurrency
– IS and IX
22 – also SIX (intension to upgrade), why is it useful?
Lock Compatibility Table
NL
IS
IX
S
SIX
X
NL
YES
YES
YES
YES
YES
YES
IS
YES
YES
YES
YES
YES
NO
IX
YES
YES
YES
NO
NO
NO
S
YES
NO
YES
NO
NO
SIX YES YES
NO
NO
NO
NO
NO
NO
NO
NO
X
YES
YES
NO
X
SIX
S
IX
IS
NL
privilege ordering
23
Compatibility Example
SIX
SIX
Relation Student
X
Grant S? IS? IX?
DB
SIX
Relation Enrollment
X
Tuple
Tuple
Tuple
• Questions:
– (SIX, S) = No? (SIX, IS) = Yes? (SIX, IX) = No?
24
Tuple
Granularity Locking
• Database: as hierarchy of lockable units
• Locking: to lock a unit,
– first lock all containing units with “intension”
– intension locks: IS, IX, SIX (intension to upgrade)
• Unlocking:
– release all relevant locks at once, or leaf to root
– why this order?
25
Granularity Locking: DAG
• Generalization: DAG of units:
– S locks at least one path to the node
– X locks all paths to the node
26
How “long” to hold a lock?
• 2PL requirement:
– only “shrink” after “growing”
• End of transaction:
– unlock (to make data accessible) at xact commit
– why?
27
Cascading rollback
T1
T2
write(B)
read(A)
write(A)
read(A)
write(B)
commit
T2 Aborts
We should never
have let T1 commit
Consistency: Dirty-Data Based
T does not overwrite dirty data of other xacts
0
T does not commit any writes until EOT
1
T does not read dirty data from other xacts
2
other xacts do not dirty any data read by T
before T completes
3
• Sacrificing semantic guarantees for performance
• How to lock for each degree?
29
Examples
• Garbage reads
– T1: w(A)
A?
– T2: w(A)
– value of A is not clear => prevented by degree 0.
• Lost updates
– T1: w(A)
abort
– T2:
w(A)
– degree 1 can prevent T1.abort from masking T2.w(A).
• Dirty reads:
– T1: w(A)
abort
– T2:
r(A)
– degree 2 can prevent T2.r(A) from reading bogus A.
• Inconsistent (unrepeatable) reads
– T1:
w(A)
– T2: r(A)
r(A)
– degree 3 can ensure repeated reads will be consistent.
30
Degrees of Consistency:
Example of a Control System
• The transaction that reads a gauge and writes values in the database
– degree 0, for performance reasons.
• The transaction that reads the data and computes mean and var.
– degree 1, because mean and var. should be consistent.
• The transaction that reads the mean and prints it.
– degree 2, we do not show an “undone” mean
• The transaction that reads both mean and var.
– degree 3, insures that mean and var. are consistent
31
Degrees of Consistency:
Locking-Based Definition
• T does not overwrite dirty data of other xacts
– set write locks on dirty data (well-formed on w)
• T does not commit any writes until EOT
– set “long” write locks (2P/EOT on w)
• T does not read dirty data from other xacts
– set read locks (well-formed on r)
• Other xacts do not dirty any data read by T
before T completes
– set “long” read locks (2P/EOT on r)
32
Additional issues
• Deadlock?
• Optimistic concurrency control
– Assume rare unserializable behavior
– Timestamping followed by validation
33
What you should know
•
•
•
•
34
Why are locking protocols necessary?
What is 2PL?
What are the four degrees of consistency?
How are the degrees of consistency related to
locking?
Carry away messages
• We see a messy problem gets formulated
nicely
– Look for important, but messy problems
• Cross-field advantages (OS <=> DB)
35