This content originally appeared on DEV Community and was authored by Manas Tole
The CAP theorem, introduced by Eric Brewer in 2000, helps us understand the limitations of distributed systems.
It says:
In a distributed system, you can only choose 2 out of 3 properties at the same time:
Consistency (C)
Availability (A)
Partition Tolerance (P)
You cannot have all three at once.
The 3 Pillars of CAP
1. Consistency (C)
Meaning: Every read gets the latest written data (or an error if not possible).
Analogy: Imagine you update your WhatsApp profile picture. If your friend checks it, they should see the latest photo immediately.
-
Real-World Example:
- In banking systems, if you transfer $100 from your account, the system must instantly update the balance everywhere. Otherwise, you could withdraw money twice.
- Here, Consistency is critical.
2. Availability (A)
Meaning: The system always responds to requests (read/write), even if the data is not the most recent.
Analogy: Think of Amazon.com during a big sale. Even if one server is slow to update stock, the website should still show you some product information instead of crashing.
-
Real-World Example:
- In e-commerce websites, users should be able to browse and place orders even if one server hasn’t updated the latest inventory count.
- Slight data delay is acceptable, but downtime is not.
3. Partition Tolerance (P)
Meaning: The system continues to run even if some parts of the network fail or cannot talk to each other.
Analogy: Imagine a company with two office branches. If the internet line between them goes down, both offices should still keep working separately.
-
Real-World Example:
- In global apps like Netflix or Hotstar, servers are spread across countries. If network issues prevent servers in the US from talking to servers in India, both regions should still serve users.
- This is Partition Tolerance.
In short:
C = Correct data
A = Always available
P = Survives network failures
But in distributed systems, you can’t have all three.
The CAP Trade-Off: Choosing 2 out of 3
The CAP theorem says that in a distributed system, if there’s a network partition (nodes can’t talk to each other), you must choose between:
Consistency (C) → Everyone sees the same data.
Availability (A) → The system always responds.
Partition Tolerance (P) → The system keeps working even if the network is broken.
You cannot have all three at once.
Scenarios
1. CP (Consistency + Partition Tolerance, sacrifice Availability)
Data is always correct and consistent across nodes.
Works even if the network splits.
May reject or block requests during network issues.
Real-world examples:
**Traditional relational databases, such as MySQL and PostgreSQL, when configured for strong consistency, prioritize consistency over availability during network partitions.
-
Banking/ATM systems:
- Imagine you are withdrawing $200 from an ATM. The banking system must immediately deduct the amount from your account and update the balance across all servers before allowing any other transactions. If the system chose availability over consistency, another ATM might still show your old balance, allowing you to withdraw additional money you don’t actually have. To avoid such errors and maintain trust, banks sacrifice temporary availability (e.g., ATM being “out of service” during synchronization) in favor of strict consistency.
Accuracy is more important than 100% uptime.
2. AP (Availability + Partition Tolerance, sacrifice Consistency)
The system always responds (no downtime).
Works during network partitions.
Different nodes may show slightly outdated or inconsistent data.
Real-world examples:
NoSQL databases like Cassandra and DynamoDB are designed to be highly available and partition-tolerant, potentially at the cost of strong consistency.
-
Amazon shopping cart:
- Suppose you add a laptop to your Amazon shopping cart during a Black Friday sale. Even if some servers are temporarily down or experiencing delays, the system ensures that your action succeeds. The cart might not instantly reflect the most up-to-date stock count (slightly relaxing consistency), but availability is prioritized so customers can keep shopping without interruptions. Any conflicts or stock adjustments are resolved later during the checkout process.
Users prefer the site working, even if data takes a bit to catch up.
3. CA (Consistency + Availability, sacrifice Partition Tolerance)
Always consistent and available… but only when there’s no network failure.
Fails if the network breaks.
Single-node databases can provide both consistency and availability but aren’t partition-tolerant. In a distributed setting, this combination is theoretically impossible.).
Real-world example:
A simple database on your laptop → always consistent and available because there are no partitions.
But in global distributed systems, partitions are inevitable, so CA is not realistic.
Practical Design Strategies
Since you can’t have all three (Consistency, Availability, and Partition Tolerance) simultaneously, system designers make trade-offs based on application needs. Here are the most common approaches with real-world examples:
1. Eventual Consistency
Definition:
Updates do not need to be instantly visible everywhere. Instead, changes are propagated across servers gradually, and eventually, all replicas converge to the same state.When it’s used:
Useful when temporary inaccuracies are tolerable, but high availability and responsiveness are critical.-
Benefits:
- High availability (operations rarely fail).
- Low latency for reads and writes.
- Scales well across globally distributed systems.
-
Drawbacks:
- Users may temporarily see stale or outdated data.
- Not suitable for financial or mission-critical operations where correctness is essential.
-
Examples:
- DNS servers: Updating a domain’s IP may take hours to propagate, but eventually, all users see the correct record.
- CDNs (like Cloudflare, Akamai): Cached content may lag behind the origin server, but ensures faster delivery worldwide.
- Social media likes/comments: One user may see 99 likes while another sees 100, but eventually the counts match.
2. Strong Consistency
Definition:
Once a write is confirmed, every subsequent read reflects the latest value—regardless of which server is queried.When it’s used:
Needed in domains where accuracy and correctness are more important than uptime or speed.-
Benefits:
- Guarantees correctness and prevents anomalies like double-spending or overbooking.
- Simplifies reasoning about system state since data is always up-to-date.
-
Drawbacks:
- Lower availability during network partitions (the system may reject requests instead of risking inconsistency).
- Higher latency due to synchronization overhead.
-
Examples:
- Banking systems (ATMs, transfers): Ensures account balances are always accurate.
- Stock trading platforms: Prevents selling more shares than exist.
- Airline seat booking: Prevents two passengers from booking the same seat.
3. Tunable Consistency
Definition:
Some distributed databases let you configure the consistency level per operation—allowing you to balance between speed (availability) and correctness (consistency).When it’s used:
Ideal for large-scale applications where some operations require strict correctness, while others can tolerate temporary inconsistency.-
Benefits:
- Flexibility: choose the right balance for each type of query.
- Optimizes resource usage depending on operation criticality.
-
Drawbacks:
- More complex for developers and architects to design properly.
- Risk of misconfiguration leading to errors.
-
Examples:
- Cassandra: Lets developers specify whether queries need responses from one replica (faster, less consistent) or from the majority (slower, but more consistent).
-
E-commerce platforms:
- Order placement: Strong consistency to prevent duplicate or missing orders.
- Product recommendations: Eventual consistency is fine if updates take seconds to appear.
-
Social platforms:
- Messaging: Strong consistency ensures no message loss or misordering.
- Likes/follows: Eventual consistency works since small delays don’t hurt user experience.
4. Quorum-Based Approaches
Definition:
Quorum systems rely on a majority voting mechanism among nodes to determine the outcome of reads and writes. A quorum is reached when a sufficient number of nodes agree, ensuring consistency even in the presence of faults.When it’s used:
Often used when a balance between consistency and availability is required, especially in distributed consensus protocols.-
Benefits:
- Provides fault tolerance while still maintaining strong guarantees.
- Prevents split-brain scenarios in distributed systems.
- Scales well with replication and can tolerate node failures.
-
Drawbacks:
- Increased latency since multiple nodes must coordinate before confirming operations.
- Requires careful configuration of quorum sizes (read quorum + write quorum > total replicas).
-
Examples:
- Consensus algorithms: Paxos, Raft, Zab (used in ZooKeeper).
- Distributed databases: Dynamo-style systems, MongoDB with replica sets, Cassandra (when configured with quorum reads/writes).
- Use cases: Leader election, replicated logs, transaction coordination.
Beyond CAP: PACELC Theorem
CAP explains trade-offs during partitions, but what about when the system is running normally?
Daniel Abadi extended CAP with PACELC:
If Partition (P) → trade-off is Availability (A) vs Consistency (C) (same as CAP).
Else (E) → trade-off is Latency (L) vs Consistency (C).
Example: Even without network issues, if you want strong consistency, you’ll pay with higher latency (slower responses).
Conclusion
Banking systems → Prefer CP (accuracy over uptime).
E-commerce / Streaming apps (Amazon, Netflix, Hotstar) → Prefer AP (uptime over strict accuracy).
Single-node apps → Can have CA, but not when distributed.
The key lesson:
CAP is not about finding the “best” property.
It’s about making informed trade-offs based on your app’s needs.
This content originally appeared on DEV Community and was authored by Manas Tole