Sunday 8 March 2015

Concurrency control with optimistic methods


One-line summary: Instead of using locks, optimistically do transaction then check at end if can commit results, and retry if not.

Overview/Main Points

  • Locking bad because:
    • overhead not present in sequential case. Even read-only transactions need to use locks.
    • lack of general-purpose deadlock free locking protocols.
    • large parts of DB on secondary storage - concurrency significantly lowered when it is necessary to leave congested node while accessing disk
    • to allow transaction to abort, need to hold on to locks until EOT
    • locking may only be necessary in worst case
  • optimistic:
    • reading a value/pointer can never cause loss of integrity; reads are completely unrestricted, although returning a value from a query is considered equivalent to a write.
    • writes are severly restricted. Have read phase, validation phase, then write phase. During read, all writes are on local copies. Validation ensures consistency across active transactions. Write phase is when local copies get written to global DB.
  • concurrency control procedures maintain sets of objects accessed by transaction. maintain read and write set for transaction.
  • serial equivalence: if have individual transactions that are executed concurrently, have serial equivalence if there exists some serial sequence of the transactions that produce the final data structure. serial equivalence is sufficient (but not necessary) for integrity of DB.
  • validation of serial equivalence: assign each transaction a transaction number (TN), and explicitly force serializability. Need one of these three conditions (for i
  • Ti completes write phase before Tj starts read phase
  • write set of Ti does not intersect read set of Tj, and Ti completes its write phase before Tj starts its write phase
  • write set of Ti does not intersect read set or write set of Tj, and Ti completes its read phase before Tj completes its read phase. (1) says Ti completes before Tj starts. (2) states writes of Ti don't affect read phase of Tj, and Ti finishes writing before Tj starts writing, hence doesn't overwrite Tj. (3) similar to (2), but does not require that Ti finish writing before Tj starts writing.Assign transactions numbers when transaction enters validation, to prevent fast transactions from blocking behind slow but earlier starting transactions.
  • serial validation: assign transaction nmber, validate, and do write phase all in critical section. Fine if one CPU and write phase can take place in main memory. Otherwise too inefficient - not enough concurrency. If multiple CPUs, be more clever about critical sections.
  • parallel validation: maintain set of active transactions (in read but not yet completed write phase); do validation against all transactions in active set. To prevent an aborted transaction from causing further aborts, do multistage validation.
  • analysis for large B-trees shows that optimistic concurrency control is a win - very rare that one insertion would cause another concurrent insertion to restart. (Assumes random access to B-tree.)

Relevance

Like restartable atomic sequences in OS world - non-blocking synchronization is a good thing, as long as conflicts are rare.

Flaws

  • I don't believe conflicts are as rare as they suggest. There is distinct non-randomness in realistic access patterns; these correlated accesses have much higher probability of conflicting.
  • possible to have starvation as opposed to deadlock with degenerate access patterns and optimistic access control. This is never mentioned.

No comments:

Post a Comment