This content originally appeared on DEV Community and was authored by ZeeshanAli-0704
Capacity Estimation in System Design
Capacity estimation is essential to ensure that a system can handle its expected load and perform efficiently. It involves calculating the resources needed for processing (traffic handling), storage, and network bandwidth.
Key Rules for System Design Estimation Calculations
- Rounding Approximations
  - Simplify calculations by rounding to more manageable numbers.
  - Example: Instead of calculating with 86,400 seconds in a day, round to 100,000 (10^5) seconds to simplify.
- Powers of 2 and 10
  - Familiarize yourself with powers of 2 and 10 for quick estimations.
  - Example values for powers of 2: 2, 4, 8, 16, 32, 64, etc.
  - Example values for powers of 10:
    - 10^1 = 10
    - 10^2 = 100
    - 10^3 = 1,000
    - 10^6 = 1,000,000 (1 million)
    - 10^9 = 1,000,000,000 (1 billion)
    - 10^12 = 1,000,000,000,000 (1 trillion)
- Metric System
  - Use metric system units for large numbers:
    - 1 million = 10^6
    - 1 billion = 10^9
    - 1 trillion = 10^12
- Storage Capacity
  - Understand common storage units:
    - 1 KB = 10^3 bytes
    - 1 MB = 10^6 bytes
    - 1 GB = 10^9 bytes
    - 1 TB = 10^12 bytes
    - 1 PB = 10^15 bytes
- Key Metrics to Memorize
  - 1 million requests per day ≈ 12 requests/second
  - 1 million requests per hour ≈ 280 requests/second
  - 1 million requests per minute ≈ 16,700 requests/second
- Latency Numbers
  - Familiarize yourself with common latency benchmarks to make informed decisions during system design.
  - Note: Tables of typical latency numbers are easy to find with a web search.
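The request-rate figures under Key Metrics to Memorize can be checked with a few lines of Python. This is a minimal sketch; the helper name `per_second` is made up for illustration and simply divides a request count by the length of the window in seconds.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def per_second(requests: float, window_seconds: int) -> float:
    """Average requests per second for a count spread evenly over a window."""
    return requests / window_seconds


one_million = 1_000_000
print(round(per_second(one_million, SECONDS_PER_DAY)))     # 12
print(round(per_second(one_million, SECONDS_PER_HOUR)))    # 278
print(round(per_second(one_million, SECONDS_PER_MINUTE)))  # 16667
```

These are averages over the whole window; real traffic is rarely spread this evenly.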
Table for Powers of 2 and 10
Power of 2 | Value | Power of 10 | Value |
---|---|---|---|
(2^1) | 2 | (10^1) | 10 |
(2^2) | 4 | (10^2) | 100 |
(2^3) | 8 | (10^3) | 1,000 |
(2^4) | 16 | (10^6) | 1,000,000 |
(2^5) | 32 | (10^9) | 1,000,000,000 |
(2^6) | 64 | (10^12) | 1,000,000,000,000 |
Let’s go through the capacity estimation process step-by-step using a hypothetical Twitter-like application as an example.
1. Traffic Estimation
Monthly Active Users (MAU): The number of unique users who use the application in a month.
Daily Active Users (DAU): The number of unique users who use the application in a day.
Example:
MAU: 300 million
DAU: 100 million (assume 1/3 of MAUs are active daily)
2. Read and Write Requests
To estimate read and write requests, we need to make some assumptions about user behavior.
Average Tweets per Day per User:
Let’s assume each active user tweets 2 times per day.
Read Requests:
Each user reads 100 tweets per day (including their feed, replies, and notifications).
Write Requests:
Each tweet is a write request.
Additional write requests for likes, retweets, and replies. Assume 2 additional write requests per user per day.
Calculations:
Daily Write Requests:
- Daily Write Requests = DAU × (Average Tweets per User + Additional Writes per User)
- Daily Write Requests = 100 million × (2 + 2) = 400 million
Daily Read Requests:
Daily Read Requests = DAU × Average Reads per User
Daily Read Requests = 100 million × 100 = 10 billion
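The daily request volumes can be reproduced with a short script. This is a sketch under the assumptions stated above; the function and variable names are illustrative, and the additional writes are treated as a per-user daily count, matching how the formula adds them to the tweets per user.

```python
DAU = 100_000_000          # daily active users
TWEETS_PER_USER = 2        # tweets per active user per day
ADDITIONAL_WRITES = 2      # extra writes (likes, retweets, replies) per user per day
READS_PER_USER = 100       # tweets read per active user per day


def daily_write_requests(dau: int, tweets: int, additional: int) -> int:
    """Total writes per day: every tweet plus the additional write actions."""
    return dau * (tweets + additional)


def daily_read_requests(dau: int, reads: int) -> int:
    """Total reads per day across all active users."""
    return dau * reads


writes = daily_write_requests(DAU, TWEETS_PER_USER, ADDITIONAL_WRITES)
reads = daily_read_requests(DAU, READS_PER_USER)
print(f"{writes:,} writes/day")  # 400,000,000 writes/day
print(f"{reads:,} reads/day")    # 10,000,000,000 reads/day
```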
Requests per Second (RPS):
There are 86,400 seconds in a day.
Write RPS = Daily Write Requests / 86,400
Write RPS = 400 million / 86,400 ≈ 4,630 writes per second
Read RPS = Daily Read Requests / 86,400
Read RPS = 10 billion / 86,400 ≈ 115,740 reads per second
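As a rough check, the RPS figures can be computed both with the exact 86,400 seconds per day and with a rounded day length. The sketch below assumes a rounded value of 100,000 seconds, chosen here for illustration; the variable names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000  # assumption: a round number for mental math

daily_writes = 400_000_000
daily_reads = 10_000_000_000

write_rps = daily_writes / SECONDS_PER_DAY
read_rps = daily_reads / SECONDS_PER_DAY
print(f"Write RPS: {write_rps:,.0f}")  # 4,630
print(f"Read RPS:  {read_rps:,.0f}")   # 115,741

# The rounded divisor gives a quick estimate in the same order of magnitude.
print(f"Quick write RPS: {daily_writes / ROUNDED_SECONDS_PER_DAY:,.0f}")  # 4,000
print(f"Quick read RPS:  {daily_reads / ROUNDED_SECONDS_PER_DAY:,.0f}")   # 100,000
```

The rounded estimate undershoots by about 14%, which is usually acceptable for back-of-the-envelope sizing.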
3. Storage Requirements
Assume the following for storage calculations:
Average size of a tweet: 280 bytes
Retention period: 1 year (365 days)
Additional storage for metadata (likes, retweets, etc.): 3 times the tweet size
Calculations:
Daily Storage for Tweets:
Daily Storage = Daily Write Requests × Average Tweet Size × 4
The factor of 4 covers the tweet itself plus metadata at 3 times the tweet size.
Daily Storage = 400 million × 280 bytes × 4 = 448 billion bytes ≈ 448 GB
Annual Storage:
Annual Storage = Daily Storage × 365
Annual Storage = 448 GB × 365 ≈ 163.5 TB
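The storage arithmetic is easy to get wrong by a factor of 1,000, so it helps to carry the calculation in bytes and convert only at the end. The following sketch uses decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes); the variable names are illustrative.

```python
GB = 10**9
TB = 10**12

daily_write_requests = 400_000_000
tweet_size_bytes = 280
storage_multiplier = 4   # 1x for the tweet plus 3x for metadata
retention_days = 365

daily_storage_bytes = daily_write_requests * tweet_size_bytes * storage_multiplier
annual_storage_bytes = daily_storage_bytes * retention_days

print(f"Daily storage:  {daily_storage_bytes / GB:,.0f} GB")   # 448 GB
print(f"Annual storage: {annual_storage_bytes / TB:,.1f} TB")  # 163.5 TB
```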
4. Bandwidth Requirements
Assume each read and write request has the following average sizes:
Write request: 1 KB (including metadata)
Read request: 10 KB (average size of tweets fetched in a read)
Calculations:
Daily Bandwidth for Writes:
Daily Write Bandwidth = Daily Write Requests × 1 KB
Daily Write Bandwidth = 400 million × 1 KB = 400 GB
Daily Bandwidth for Reads:
Daily Read Bandwidth = Daily Read Requests × 10 KB
Daily Read Bandwidth = 10 billion × 10 KB = 100 TB
Total Bandwidth per Day:
Total Daily Bandwidth = Daily Write Bandwidth + Daily Read Bandwidth
Total Daily Bandwidth = 400 GB + 100 TB ≈ 100.4 TB
Bandwidth per Second:
Bandwidth per Second = Total Daily Bandwidth / 86,400
Bandwidth per Second = 100.4 TB / 86,400 ≈ 1.16 GB/s
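The bandwidth figures follow the same pattern: multiply request counts by request sizes in bytes, sum, and divide by the seconds in a day. This is a minimal sketch in decimal units with illustrative variable names, and it yields an average rate rather than a peak.

```python
KB = 10**3
GB = 10**9
TB = 10**12
SECONDS_PER_DAY = 86_400

daily_write_requests = 400_000_000
daily_read_requests = 10_000_000_000
write_request_bytes = 1 * KB    # including metadata
read_request_bytes = 10 * KB    # average size of tweets fetched in a read

daily_write_bytes = daily_write_requests * write_request_bytes
daily_read_bytes = daily_read_requests * read_request_bytes
total_daily_bytes = daily_write_bytes + daily_read_bytes
bytes_per_second = total_daily_bytes / SECONDS_PER_DAY

print(f"Writes:  {daily_write_bytes / GB:,.0f} GB/day")  # 400 GB/day
print(f"Reads:   {daily_read_bytes / TB:,.0f} TB/day")   # 100 TB/day
print(f"Total:   {total_daily_bytes / TB:,.1f} TB/day")  # 100.4 TB/day
print(f"Average: {bytes_per_second / GB:.2f} GB/s")      # 1.16 GB/s
```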
Summary
For a Twitter-like application with 100 million daily active users:
Daily Write Requests: 400 million
Daily Read Requests: 10 billion
Write RPS: ~4,630 writes/second
Read RPS: ~115,740 reads/second
Annual Storage Requirement: ~163.5 TB
Bandwidth Requirement: ~1.16 GB/s