aarondwi.github.io

Is Serializability Needed?

As long as devs can tune the concurrency themselves, it is not.

Most devs are already used to managing part of the concurrency themselves. There are lots of reasons:

  1. Databases implement isolation differently, with different guarantees, and most don’t even support serializability. Devs typically work around this with advisory locks, SELECT ... FOR UPDATE/SHARE, optimistic CC via a version column, etc.
  2. Most people also use a cache to speed up read queries, without checking the current state in the database. That makes the system as a whole not serializable.
  3. Big companies typically need microservices to break the physical barriers to development. This creates the need for distributed transactions, in the form of choreography or orchestration. Neither is serializable, as I argued here.
  4. Handling non-transactional systems, such as 3rd-party services, already forces users to think about concurrency semantics.
  5. As I argued here, even data that is not invariant confluent can be handled without serializability, albeit with more work.
  6. CPUs can also deliver wrong results. The effect varies, but for ACID, a mutex violation is the worst that can happen.
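
To make the first point concrete, here is a minimal sketch of optimistic CC via a version column. It uses Python's built-in sqlite3 only so that it runs anywhere. The `account` table, the column names, and the helpers `read_account`, `write_balance` and `withdraw` are all made up for illustration, and a real system would use its own schema and retry policy.

```python
import sqlite3


def make_db():
    # Hypothetical schema: one account row guarded by a version counter.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO account (id, balance, version) VALUES (1, 100, 0)")
    conn.commit()
    return conn


def read_account(conn, account_id):
    # Returns (balance, version) as seen at read time.
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def write_balance(conn, account_id, new_balance, expected_version):
    # The write only applies if nobody bumped the version since we read it.
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1


def withdraw(conn, account_id, amount, max_retries=3):
    # Read, check the invariant in application code, then try a guarded write.
    for _ in range(max_retries):
        balance, version = read_account(conn, account_id)
        if balance < amount:
            return False  # invariant would break, give up
        if write_balance(conn, account_id, balance - amount, version):
            return True
        # Lost the race: someone else wrote first, so re-read and retry.
    return False
```

A writer holding a stale version simply gets `False` back from `write_balance` and has to re-read. The database never needs to run at the serializable level for this row. The dev has encoded the conflict rule by hand.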

These big companies/systems show that even without a complex CC algorithm, or with only limited guarantees, a system can be made to work with enough domain understanding and still meet its perf/integrity requirements:

  1. Facebook only uses RA (read atomic) + RYW (read-your-writes). FlightTracker provides RYW consistency company-wide, with RAMP-TAO for atomicity. Before those two, Facebook ran entirely without any kind of constraints.
  2. Alibaba and Ant Financial run transactions across microservices with SEATA, which is at most an RC (read committed) system.
  3. Telco systems and ad networks run on NDB/RonDB, which are only RC systems. VoltDB is serializable per shard and uses 2PC across shards, but 2PC is usually avoided because it is slow. VoltDB's XDCR also makes it basically eventually consistent.
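
As a rough illustration of why orchestrated cross-service transactions are not serializable, here is a generic saga sketch with compensation. It is not SEATA's API or any specific framework. The "services" are plain in-memory dicts, and every name (`run_saga`, `demo`, `wallet`, `stock`) is hypothetical.

```python
def run_saga(steps):
    # steps: list of (action, compensation) pairs, run in order.
    # On failure, already-completed steps are undone in reverse order.
    done = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(done):
                undo()
            return False
        done.append(compensate)
    return True


def demo():
    # Two pretend services, each owning its own state.
    wallet = {"alice": 100}
    stock = {"book": 0}
    observed = []  # what an outside reader would see mid-saga

    def debit():
        wallet["alice"] -= 40

    def refund():
        wallet["alice"] += 40

    def reserve():
        # The debit is already visible here, before the saga has finished.
        observed.append(wallet["alice"])
        if stock["book"] < 1:
            raise RuntimeError("out of stock")
        stock["book"] -= 1

    def release():
        stock["book"] += 1

    ok = run_saga([(debit, refund), (reserve, release)])
    return ok, wallet, observed
```

In `demo()`, the saga fails and the wallet ends up back at 100, but in between a reader saw 60. That intermediate state is committed and visible, which is exactly the part the dev has to reason about with domain knowledge.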