System Design
"Design is not just what it looks like and feels like. Design is how it works." — Steve Jobs
System design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements.
- Scalability: Vertical vs. horizontal scaling, load balancing, caching, and sharding strategies.
- API Design: REST, GraphQL, and gRPC patterns and best practices for service interfaces.
The Three Pillars
Every production system must balance these concerns:
| Pillar | Definition | Key Practices |
|---|---|---|
| Reliability | System works correctly despite faults | Redundancy, failover, testing, monitoring |
| Scalability | System handles increased load | Load balancing, caching, sharding, CDN |
| Maintainability | System is easy to operate and evolve | Clean code, documentation, observability |
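As one illustration of the sharding practice listed under Scalability, here is a minimal sketch of hash-based shard routing. The function name `shard_for`, the `user:<id>` key format, and the shard count of 4 are assumptions made for the example, not part of any particular system.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    # hashlib gives the same digest on every machine and every run,
    # unlike Python's built-in hash(), which is salted per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4  # assumed shard count for the example
counts = [0] * NUM_SHARDS
for user_id in range(1000):
    counts[shard_for(f"user:{user_id}", NUM_SHARDS)] += 1

# Each shard should receive a roughly equal share of the 1,000 keys.
print(counts)
```

Modulo-based routing like this remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.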
Core Concepts
CAP Theorem
In a distributed data store, you can guarantee at most two of the following three properties at the same time:
| Property | Description |
|---|---|
| Consistency | Every read receives the most recent write or an error |
| Availability | Every request receives a non-error response, though not necessarily the most recent write |
| Partition Tolerance | System keeps operating despite network failures that drop or delay messages between nodes |
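Reality Check
Network partitions cannot be ruled out in a distributed system, so partition tolerance is effectively mandatory. The real choice arises when a partition occurs: a CP system (Consistency + Partition Tolerance) rejects or delays requests it cannot serve consistently, while an AP system (Availability + Partition Tolerance) keeps responding with data that may be stale.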
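The CP versus AP trade-off can be sketched with a toy model. The `Replica` class below is hypothetical and holds a single value; it only illustrates how the two modes respond while a replica is cut off from its peers, and is not how any specific database implements this.

```python
class Replica:
    """Hypothetical single-value replica used to illustrate CP vs AP."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = "v1"         # last value this replica knows about
        self.partitioned = False  # True when cut off from the other replicas

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Cannot confirm this is the latest write, so refuse to answer.
            raise RuntimeError("unavailable: cannot confirm latest write")
        # AP mode (or a healthy network): answer from the local copy,
        # which may be stale during a partition.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True  # simulate a network partition

print("AP read:", ap.read())
try:
    cp.read()
except RuntimeError as err:
    print("CP read failed:", err)
```

The AP replica stays available at the cost of possibly stale reads, while the CP replica sacrifices availability to avoid returning data it cannot verify.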
ACID vs. BASE
| ACID (RDBMS) | BASE (NoSQL) |
|---|---|
| Atomicity | Basically Available |
| Consistency | Soft state |
| Isolation | Eventual consistency |
| Durability | |
| Prioritizes strong consistency | Prioritizes high availability |
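The atomicity guarantee on the ACID side can be demonstrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, the account names, and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


try:
    # The debit would make alice's balance negative and violate the CHECK
    # constraint, so the earlier credit to bob is rolled back as well.
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Because both updates run inside one transaction, the failed transfer leaves the balances exactly as they were before it started.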
Back-of-the-Envelope Math
Essential numbers for capacity planning and system design interviews.
Powers of Two
| Power | Approximate Value | Storage Unit (bytes) |
|---|---|---|
| $2^{10}$ | ~1,000 | 1 KB |
| $2^{20}$ | ~1,000,000 | 1 MB |
| $2^{30}$ | ~1,000,000,000 | 1 GB |
| $2^{40}$ | ~1,000,000,000,000 | 1 TB |
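These units feed directly into capacity estimates. The sketch below works through one with made-up inputs: the user count, writes per user, and object size are all assumptions chosen for illustration, not figures from a real system.

```python
KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumed workload (hypothetical numbers for the example).
daily_active_users = 10_000_000
writes_per_user_per_day = 2
bytes_per_write = 1 * MB

writes_per_day = daily_active_users * writes_per_user_per_day
avg_write_qps = writes_per_day / SECONDS_PER_DAY
storage_per_day_tb = writes_per_day * bytes_per_write / TB
storage_per_year_tb = storage_per_day_tb * 365

print(f"average write QPS: {avg_write_qps:.0f}")
print(f"storage per day:   {storage_per_day_tb:.1f} TB")
print(f"storage per year:  {storage_per_year_tb:.0f} TB")
```

Estimates like this usually aim for the right order of magnitude, so rounding 86,400 seconds to roughly 100,000 is a common shortcut when working by hand.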
Latency Numbers
| Operation | Latency |
|---|---|
| L1 Cache Reference | 0.5 ns |
| Mutex Lock/Unlock | 100 ns |
| Main Memory Reference | 100 ns |
| Send 2 KB over 1 Gbps Network | 20,000 ns (20 µs) |
| Read 1 MB Sequentially from Memory | 250,000 ns (250 µs) |
| Datacenter Round Trip | 500,000 ns (500 µs) |
| Disk Seek | 10,000,000 ns (10 ms) |
| California to Netherlands Round Trip | 150,000,000 ns (150 ms) |
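A short sketch of how the figures in the table can be combined for rough estimates. The dictionary keys and the example request (five sequential datacenter calls plus two disk seeks) are assumptions for illustration; the latency values are copied from the table and are order-of-magnitude figures, not measurements.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "l1_cache_reference": 0.5,
    "main_memory_reference": 100,
    "read_1mb_from_memory": 250_000,
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
    "ca_netherlands_round_trip": 150_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


# Reading 1 GB (1,024 MB) sequentially from memory at the table's rate.
read_1gb_ms = 1024 * LATENCY_NS["read_1mb_from_memory"] / 1e6

# Hypothetical request: 5 sequential datacenter calls plus 2 disk seeks.
request_ms = (
    5 * LATENCY_NS["datacenter_round_trip"] + 2 * LATENCY_NS["disk_seek"]
) / 1e6

print(f"disk seek vs memory reference: {ratio('disk_seek', 'main_memory_reference'):,.0f}x")
print(f"intercontinental vs datacenter round trip: "
      f"{ratio('ca_netherlands_round_trip', 'datacenter_round_trip'):,.0f}x")
print(f"read 1 GB from memory: {read_1gb_ms:.0f} ms")
print(f"example request: {request_ms:.1f} ms")
```

In the example request the two disk seeks cost far more than the five network calls, which is the kind of comparison these numbers make easy.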
System Design Interview Framework
[Diagram: System Design Interview Framework]
Deep Dives
- Scalability Strategies: Vertical vs Horizontal, Load Balancing, Caching, Sharding
- API Design: REST, GraphQL, gRPC, and best practices