Why is the CAP theorem not enough?

Vishal bhardwaj
3 min read · May 4, 2022

To understand distributed data systems better, let's first look at the CAP theorem and then discuss why it is not enough when designing such systems.

CAP theorem

In the event of a network partition in a distributed data system, it is possible to provide either availability or consistency, but not both.

Let's dig deeper and restate this in terms of the linearizability of the system. That raises the question: what is a linearizable system?

Linearizability means that modifications appear to happen instantaneously: once a value is written to a register, every subsequent read returns that same value until the register is modified again.

(1) Linearizable system

In other words, a linearizable system behaves as if there were only a single copy of the data, even when that data is actually stored on several replicas.

So, if we require a linearizable system and a network fault disconnects some replicas from the others, the disconnected replicas cannot process requests: they must either wait until the network problem is resolved or return an error. Either way, they are unavailable while the fault lasts, which is the price of choosing correctness (Consistency).
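
A minimal sketch of that behavior, assuming a toy in-memory replica. The names `LinearizableReplica`, `ReplicaUnavailable`, and the `connected` flag are hypothetical and only simulate a partition; a real system would detect it through timeouts or a consensus protocol.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot guarantee an up-to-date answer."""


class LinearizableReplica:
    """Hypothetical replica that prefers consistency over availability."""

    def __init__(self):
        self.value = None
        self.connected = True  # False simulates a network partition

    def read(self):
        # Without contact with the other replicas we cannot know whether
        # our copy is the latest one, so we refuse instead of guessing.
        if not self.connected:
            raise ReplicaUnavailable("partitioned: cannot confirm latest value")
        return self.value

    def write(self, value):
        if not self.connected:
            raise ReplicaUnavailable("partitioned: cannot replicate the write")
        self.value = value


replica = LinearizableReplica()
replica.write("v1")
print(replica.read())        # v1 (connected, so the read is served)

replica.connected = False    # simulate a partition
try:
    replica.read()
except ReplicaUnavailable as err:
    print("request rejected:", err)
```

The replica never returns a possibly stale value; the cost is that every request fails until the connection is restored.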

(2) Non-linearizable system

In this case, we don't have a single register or single source of truth. Our system uses asynchronous database replication, with a Primary node that handles both reads and writes and a Follower node that handles only reads.

Because replication is asynchronous, there is a time lag between the moment a row is modified on the Primary node and the moment the Follower applies the same update. During that window, a read served by the Follower can return a stale value.
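
The lag can be illustrated with a small simulation. This is only a sketch under assumptions: the `Primary` and `Follower` classes and the explicit `apply_pending` step are hypothetical stand-ins for a database's background replication stream.

```python
from collections import deque


class Primary:
    """Hypothetical primary: applies writes locally, ships them later."""

    def __init__(self):
        self.data = {}
        self.replication_log = deque()  # changes not yet sent to the follower

    def write(self, key, value):
        self.data[key] = value
        # Asynchronous: acknowledge now, replicate in the background.
        self.replication_log.append((key, value))

    def read(self, key):
        return self.data.get(key)


class Follower:
    """Hypothetical read-only follower."""

    def __init__(self):
        self.data = {}

    def apply_pending(self, primary):
        # Drain whatever the primary has queued up for us.
        while primary.replication_log:
            key, value = primary.replication_log.popleft()
            self.data[key] = value

    def read(self, key):
        return self.data.get(key)


primary, follower = Primary(), Follower()
primary.write("balance", 100)

print(primary.read("balance"))   # 100
print(follower.read("balance"))  # None: the change has not arrived yet

follower.apply_pending(primary)  # replication catches up
print(follower.read("balance"))  # 100
```

Between the write and the call to `apply_pending`, the two nodes disagree, which is exactly what a linearizable system would not allow.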

Here, each replica can handle requests independently, even when it is disconnected from the other replicas (for example, when several replicas serve reads). The application therefore stays usable during network problems (Availability), although its behavior is no longer linearizable. Thus, applications that don't require linearizability can be more tolerant of network failures.
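
As a rough sketch of the availability side, assume two toy replicas and a hypothetical `replicate` helper whose `partitioned` argument simulates a broken link. None of these names come from a real database.

```python
class Replica:
    """Hypothetical replica that keeps serving requests while partitioned."""

    def __init__(self, name):
        self.name = name
        self.value = None

    def read(self):
        # Always answer from the local copy, without contacting anyone else.
        return self.value


def replicate(source, target, partitioned):
    """Copy the source's value to the target unless the link is down."""
    if partitioned:
        return False
    target.value = source.value
    return True


primary, follower = Replica("primary"), Replica("follower")

primary.value = "v1"
replicate(primary, follower, partitioned=False)

primary.value = "v2"                              # written during a partition
replicate(primary, follower, partitioned=True)    # the update is not delivered

print(primary.read())    # v2
print(follower.read())   # v1: still answered, but stale

replicate(primary, follower, partitioned=False)   # the partition heals
print(follower.read())   # v2
```

Both replicas keep answering throughout; the trade-off is that they temporarily disagree.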

But network partitions are a kind of fault, so they aren't something you choose while building your system; they happen whether you like it or not. CAP only tells us what we must give up while a partition is in progress.

So the question is: what trade-off do we face when there is no network partition?

PACELC Theorem

When the system is running normally, without a network partition, a trade-off has to be made between latency and consistency.

(3) PACELC Theorem

So, if there is a partition (P), a choice must be made between availability (A) and consistency (C); else (E), during normal operation, one must choose between lower latency (L) and consistency (C). That is where the name PACELC comes from, and both choices have to be made when designing a distributed data system.
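
The "else" half of the trade-off can be sketched with a toy latency model. The constants `LOCAL_WRITE_MS` and `REPLICATION_RTT_MS` are made-up numbers, and `tradeoff` and `write_latency_ms` are hypothetical helpers written only for this illustration.

```python
LOCAL_WRITE_MS = 1        # assumed cost of writing on the primary
REPLICATION_RTT_MS = 40   # assumed round trip to the follower


def tradeoff(partitioned):
    """Return the two properties PACELC says we must choose between."""
    if partitioned:
        return ("availability", "consistency")   # the PAC part
    return ("latency", "consistency")            # the ELC part


def write_latency_ms(synchronous):
    """Time until the client gets an acknowledgement, in this toy model."""
    if synchronous:
        # Favor consistency: wait until the follower has the write too.
        return LOCAL_WRITE_MS + REPLICATION_RTT_MS
    # Favor latency: acknowledge right away; the follower may lag briefly.
    return LOCAL_WRITE_MS


print(tradeoff(partitioned=True))
print(tradeoff(partitioned=False))
print("synchronous write:", write_latency_ms(True), "ms")
print("asynchronous write:", write_latency_ms(False), "ms")
```

Even with a healthy network, waiting for the follower makes every write slower, while not waiting leaves a window in which the follower can return old data.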

(4) PACELC Theorem example

Thanks ❤

