Transaction Management and Concurre cy Overview R & G Chapter 16-17 There are three side effects of acid. Enhanced long term memory, decreased short term.
Download
Report
Transcript Transaction Management and Concurre cy Overview R & G Chapter 16-17 There are three side effects of acid. Enhanced long term memory, decreased short term.
Transaction Management and Concurre
cy Overview
R & G Chapter 16-17
There are three side effects of acid.
Enhanced long term memory,
decreased short term memory,
and I forget the third.
- Timothy Leary
Administrivia
• Homework 3 due today by end of class period
• Homework 4 available on class website
– Due date: April 10 (after Spring Break)
• Midterm 2 is next class! Thu 3/22
– In class, covers lectures 10-17
– Review will be held Wed 3/21 7-9 pm 306 Soda
Hall
Components of a DBMS
transaction
query
Query Compiler
Execution Engine
Data Definition
We are here
Transaction Manager
Logging/Recovery
Schema Manager
Concurrency Control
Buffer Manager
LOCK TABLE
Storage Manager
BUFFERS
BUFFER POOL
DBMS: a set of cooperating software modules
Correctness criteria: The ACID properties
•
•
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
•
I solation:
Execution of one Xact is isolated from that of other
Xacts.
•
D urability:
If a Xact commits, its effects persist.
We are here
Review - Definitions
• Transaction - a sequence of read and write operations
(read(A), write(B), …)
– DBMS’s abstract view of a user program
• Serial schedule: Schedule that does not interleave the actions
of different transactions.
• Equivalent schedules: For any database state, the effect of
executing the first schedule is identical to the effect of
executing the second schedule.
• Serializable schedule: A schedule that is equivalent to some
serial execution of the transactions.
(Note: If each transaction preserves consistency, every
serializable schedule preserves consistency. )
Anomalies with interleaved execution
• Reading Uncommitted Data (“WR”, dirty reads):
– T2 reads a value A that T1 wrote but didn’t commit
– e.g: T1 moves $100 from account B to account A
T2 adds 6% interest to account A
$1100 + $66 + 900 = $2066 total after T1 & T2
A=1000 A=A+100=1100
T1:
T2:
R(A), W(A),
B=1000
R(A), W(A), C
A=1100
->A=1166
B=1000-100 A=?
=900
B=1000
R(B), W(B), Abort
A=1100*1.06 A=1166
=1166
B=1000
1166+1000 = 2166 ->
Bank is out $106
because T2 read T1’s
A value before it was
done!
Anomalies with interleaved execution
• Unrepeatable Reads (“RW” Conflicts):
– T1 reads a value A that is then written by T2
– e.g. T1 and T2 place orders for books, and A represents
the quantity in stock (1)
A=1
T1:
T2:
R(A),
A=0
R(A), W(A), C
A=1
A=0
A=0
A=-1
R(A), W(A)
Hopefully T1 gets an error if there is
an integrity constraint!
Anomalies with interleaved execution
• Overwriting Uncommitted Data (“WW”, lost
update):
– T2 overwrites a write by T1
– e.g. T1 assigns seat A to passenger 1 and
seat B to passenger 2
T2 assigns seat A to passenger 3 and
seat B to passenger 4
B=2
A=1
T1:
T2:
W(A),
W(A), W(B), C
A=3
B=4
A=3
B=4
A=3
B=2
W(B), C
-> Passenger 1’s partner is
sitting with passenger 3!
How to prevent anomalies? Locks!
• Database allows “objects” to be locked
– “object” might be entire database, file, page, tuple
• Two kinds of locks:
– “Shared” or “Read” Lock
• No one else is allowed to write the object if you have this
– “Exclusive” or “Write” Lock
• No one else is allowed to read or write the object
Locks are not enough
• If lock/unlock objects right away, anomalies are
still possible
– e.g. The unrepeatable read example
T1 obtain
ShareLock(A)
T1:
T2:
T1 release
ShareLock(A)
R(A),
T1 obtain
ShareLock(A)
R(A), W(A), C
T2 obtain
ExclusiveLockL(A)
R(A), W(A)
T2 release
ExclusiveLock(A)
Locks are not enough
• Idea: Two Phase Locking
– In a transaction,
#locks
• only acquire locks in one phase
• only release locks in a second phase
Time
• once one lock has been released, can never acquire
another lock during transaction
T1 obtain
ExclusiveLock(A)
T1:
T2:
T1 release lock(A)
R(A),
R(A), W(A), ….
T2 waits to obtain
ExclusiveLock(A)
R(A),W(A), C
T2 obtains
ExclusiveLock(A)
Lock-Based Concurrency Control
Two-phase Locking (2PL) Protocol:
–
Each Xact must obtain:
•
•
–
–
–
a S (shared) lock on object before reading, and
an X (exclusive) lock on object before writing.
If an Xact holds an X lock on an object, no other Xact can
get a lock (S or X) on that object.
System can obtain these locks automatically
Two phases: acquiring locks, and releasing them
•
•
No lock is ever acquired after one has been released
“Growing phase” followed by “shrinking phase”.
• Lock Manager keeps track of request for locks and
grants locks on database objects when they become
available.
Strict 2PL
•
2PL allows only serializable schedules but is
subjected to cascading aborts.
•
Example: rollback of T1 requires rollback of T2!
T1 obtain
ExclusiveLock(A)
T1:
T2:
T1 release
ExclusiveLock(A)
R(A), W(A),
R(A), W(A), R(B), W(B)
T2 obtains
ExclusiveLock(A)
•
Abort
T2 read T1’s value of
A, so it must also be
aborted.
To avoid Cascading aborts, use Strict 2PL
Strict 2PL (cont)
•
Strict Two-phase Locking (Strict 2PL) Protocol:
– Same as 2PL, except:
– All locks held by a transaction are released only
when the transaction completes
#locks
vs
• Advantage: no other transaction even reads anything
you write until you commit.
– e.g a transaction will only read committed data.
• Disadvantage: transactions end up waiting.
– e.g. inserts may lock a whole table
• Why? Because of Phantom problem
The Phantom Problem
• Even Strict 2PL (on
individual rows) will not
assure serializability:
• Consider T1 – “Find
oldest sailor with Rating
= 1”
• T1 sees 2 different
answers, even though it
did no updates itself.
Time
T1
1
Obtain Share Lock on all
existing Sailors with
rating = 1
2
Compute oldest sailor
(age = 71)
T2
3
Obtain Exclusive
Lock on new
Sailor tuple
4
Insert Sailor age
96 rating 1
5
Commit
6
Compute oldest sailor
(age = 96)
Transactions in SQL
• A new transaction is started with first SQL statement
issued by a user
SELECT S.SID, R.BID
FROM SAILORS S, RESERVES R
WHERE S.SID = R.SID
• All statements from that point on appear in the same
transaction
UPDATE SAILORS SET RATING = RATING - 1
UPDATE RESERVES SET DATE = DATE+1
• A user can use commands COMMIT or ROLLBACK to
explicitly end a transaction and start a new one
COMMIT
• Any new statements that follow will occur in a new
transaction
Transactions in SQL
• Because of performance, some anomalies might be acceptable for
some applications
– Internet apps in particular!
• SQL 92 supports different “Isolation Levels” for a transaction
(Lost Update not allowed at any level)
Isolation Level
Dirty
Read
Unrepeatable
Read
Phantom
Problem
Read Uncommitted Maybe Maybe
Maybe
Read Committed
No
Maybe
Maybe
Repeatable Reads
No
No
Maybe
Serializable
No
No
No
Aborting a Transaction (i.e., Rollback)
• If an xact Ti aborted, all actions must be undone.
• To undo actions of an aborted transaction, DBMS
maintains log which records every write.
• Log is also used to recover from system crashes
(which can occur as either hardware or software failure)
– All active Xacts at time of crash are aborted when
system comes back up.
The Database Log
XID 1:
REDO TID 1 <data>
REDO TID 2 <data>
COMMIT
XID 2:
REDO TID 3 <data>
…
Buffer
Data Page
Data Page
• Updates are logged in the database
log.
• Log consists of “records” that are
written sequentially.
– Typically chained together by Xact id
– Log is often archived on stable
storage.
– And backed up to even more stable
storage (like tape!)
Log
Data Pages Log
The Database Log
• Write-Ahead Logging protocol
– Log record must go to disk before the changed page!
– All log records for a transaction (including its commit
record) must be written to disk before the transaction is
considered “Committed”.
Update TID 2 <data>
Update TID 1 <data>
Buffer
Data Page
XID 1:
REDO TID 1 <data>
REDO TID 2 <data>
COMMIT
XID 2:
REDO TID 3 <data>
…
Data Page
3
SQL: COMMIT XID1
2
Log
1
Data Pages Log
The Log
• Log records are either UNDO or REDO records
– Depends on the Buffer manager
• UNDO required if: Buffer mgr allows uncommitted data to
overwrite stable version of committed data (STEAL buffer
management).
• REDO required if: xact can commit before all its updates are on
disk (NO FORCE buffer management).
• The following actions are recorded in the log:
– if Ti writes an object, write a log record with:
• If UNDO required need “before image”
• IF REDO required need “after image”.
– Ti commits/aborts: a log record indicating this action.
How Professor Roth made some traders
very unhappy at market open…
• Log on disk can fill up quickly
for active DBMS
• DBMS duplexes between two
logs
– Writes DB log to one
– While archiving the other
– When DB log is full, reverse
• Must STOP all DBMS activity
while you reverse
XID 1:
REDO TID 1 <data>
REDO TID 2 <data>
COMMIT
XID 2:
REDO TID 3 <data>
…
Log
Log1 LogLog2
How Professor Roth made some traders
very unhappy at market open…
• Stock DB at market open is VERY
busy
• We wanted to make sure there
was an empty log just before
market open
• I wrote a script to ‘switch’ logs 5
mins before market open
If (not already writing a log)
Then switch logs
Write the old log to tape
• Except I forgot a comma, so it
was more like this:
Switch logs; write the old log to tape
...and because the DB was writing out
its log, the DB hung until it was done
writing to tape!
XID 1:
REDO TID 1 <data>
REDO TID 2 <data>
COMMIT
XID 2:
REDO TID 3 <data>
…
Log
I’m waiting…
Log1
Log2
(Review) Goal: The ACID properties
•
•
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB starts
consistent, it ends up consistent.
• I solation: Execution of one Xact is isolated from that of other
Xacts.
•
D urability:
If a Xact commits, its effects persist.
What happens if system crashes between
commit and flushing modified data to disk ?
D Durability - Recovering From a Crash
• Three phases:
– Analysis: Scan the log (forward from the most recent
checkpoint) to identify all Xacts that were active at the time
of the crash.
– Redo: Redo updates as needed to ensure that all logged
updates are in fact carried out and written to disk.
– Undo: Undo writes of all Xacts that were active at the
crash, working backwards in the log.
• At the end – all committed updates and only those
updates are reflected in the database.
• Some care must be taken to handle the case of a crash
occurring during the recovery process!
Summary
• Concurrency control and recovery are among the
most important functions provided by a DBMS.
• Concurrency control is automatic
– System automatically inserts lock/unlock requests and
schedules actions of different Xacts
– Property ensured: resulting execution is equivalent to
executing the Xacts one after the other in some order.
• Write-ahead logging (WAL) and the recovery
protocol are used to:
1. undo the actions of aborted transactions, and
2. restore the system to a consistent state after a crash.