Skip to main content

Transactions, ACID etc.

·2 mins
  • D = durability. Easiest to understand.
  • C = consistency, i.e. application-level constraints or invariants are met. Property of the application that’s using the DB and not the DB itself. So, kind of useless.
  • A = atomicity or, better explained as, abortability. Either all commands in a transaction go through or none of them.
  • I = isolation is the trickiest one.
    • Concurrent operations in a DB can lead to various race conditions.
      • Dirty reads: read something that’s not yet committed.
      • Dirty writes: transaction T1 writes on top of partial writes, which aren’t yet committed, by another transaction T2.
      • Read skew: Let’s say a long-running client reads a lot of records from a DB. If transactions keep changing records during its run, the state of the DB the client captures may not be consistent.
      • Write skew: Let’s say a transaction checks for a premise and then writes based on that. What happens if the premise is no longer valid by the time the write happens?
    • DB isolation features:
      • Read committed: doesn’t allow dirty reads or writes.
      • Snapshot isolation or repeatable reads: no read skews.
        • Need to track transaction level information per record to implement this.
      • Serializable: no write skews.
        • Strongest level but isn’t cheap to implement. So, not necessarily the default amongst all the DBs.
        • Algorithms to implement:
          • Literally serialize on the DB. Can work if all transactions are guaranteed to be fast.
          • 2PL: lock all rows that match the premise before updating them. So, pessimistic locking.
          • SSI, i.e. snapshot serializable isolation: Allow all transactions to run concurrently but don’t let them commit if they’d introduce a conflict. So, optimistic.
  • 2-phase commit and 2-phase locking, i.e. 2PC and 2PL are different things.
    • You use 2PC to support transactions across multiple systems. (I still need to learn about those in more detail.)

Open questions:

  • In 2PL, how do you handle situations where the premise was based on non-existent rows that appeared, because of some other transaction, before the write happened?