I am one of those people who function better by writing things down. One day, I realized that most of my notes don’t have to be private, so here they are - my second brain. Be warned that, if you stumble upon something here that doesn’t make sense to you, it isn’t meant to!
Transactions, ACID etc.
D = durability. Easiest to understand. C = consistency, i.e. application-level constraints or invariants are met. Property of the application that’s using the DB and not the DB itself. So, kind of useless. A = atomicity or, better explained as, abortability. Either all commands in a transaction go through or none of them. I = isolation is the trickiest one. Concurrent operations in a DB can lead to various race conditions. Dirty reads: read something that’s not yet committed. Dirty writes: transaction T1 writes on top of partial writes, which aren’t yet committed, by another transaction T2. Read skew: Let’s say a long-running client reads a lot of records from a DB. If transactions keep changing records during its run, the state of the DB the client captures may not be consistent. Write skew: Let’s say a transaction checks for a premise and then writes based on that. What happens if the premise is no longer valid by the time the write happens? DB isolation features: Read committed: doesn’t allow dirty reads or writes. Snapshot isolation or repeatable reads: no read skews. Need to track transaction level information per record to implement this. Serializable: no write skews. Strongest level but isn’t cheap to implement. So, not necessarily the default amongst all the DBs. Algorithms to implement: Literally serialize on the DB. Can work if all transactions are guaranteed to be fast. 2PL: lock all rows that match the premise before updating them. So, pessimistic locking. SSI, i.e. snapshot serializable isolation: Allow all transactions to run concurrently but don’t let them commit if they’d introduce a conflict. So, optimistic. 2-phase commit and 2-phase locking, i.e. 2PC and 2PL are different things. You use 2PC to support transactions across multiple systems. (I still need to learn about those in more detail.) Open questions: ...