infrastructure January 20, 2026 • 2 min read

Load Balancing

Understanding load balancing strategies, algorithms, and implementation patterns for distributed systems.

#system-design #load-balancing #distributed-systems #scalability

Load balancing distributes incoming network traffic across multiple servers to ensure no single server bears too much load.

Why Load Balancing?

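High Availability: Traffic is routed away from failed servers, so one server going down does not take the service down
Scalability: Handle more traffic by adding servers to the pool
Performance: Spread work so no server sits idle while another is saturated
Flexibility: Take servers out of rotation for maintenance without downtime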

Types of Load Balancers

Layer 4 (Transport Layer)

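Distributes traffic based on IP address and TCP/UDP port information. This is the fastest option because the balancer never parses the payload, but it cannot make decisions based on request content.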

Layer 7 (Application Layer)

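Inspects HTTP headers, cookies, and URL paths. This allows more flexible routing decisions (for example, sending /api requests to a different pool than static assets) at the cost of extra processing per request.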
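
As a rough illustration of content-aware routing, the sketch below picks a backend pool from the request path and one header. The pool names, the `X-Canary` header, and the `choose_pool` function are assumptions made up for this example, not part of any particular load balancer.

```python
# Hypothetical prefix-to-pool table; the first matching prefix wins.
ROUTES = [
    ("/api/", "api-pool"),
    ("/static/", "static-pool"),
]
DEFAULT_POOL = "web-pool"


def choose_pool(path, headers=None):
    """Return the name of the pool that should serve this request."""
    headers = headers or {}
    # Header-based rule (assumed convention): opt-in canary traffic.
    if headers.get("X-Canary") == "1":
        return "canary-pool"
    # Path-based rules.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("/api/users"))
print(choose_pool("/index.html"))
```

A Layer 4 balancer could not apply rules like these, because it never sees the path or headers.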

Algorithms

Round Robin

Distributes requests sequentially across servers:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)
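
The cycle above can be sketched in a few lines. This is a minimal in-memory version that assumes a fixed server list; the `round_robin` helper is a name chosen for this example.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that yields servers in a fixed, repeating order."""
    return cycle(servers)


picker = round_robin(["Server A", "Server B", "Server C"])
assignments = [next(picker) for _ in range(4)]
print(assignments)
# → ['Server A', 'Server B', 'Server C', 'Server A']
```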

Weighted Round Robin

Gives each server a weight and sends it a proportional share of requests, so more powerful servers receive more traffic.
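
One way to implement this is the "smooth" weighted round robin variant, which interleaves picks instead of sending a burst to the heaviest server. The sketch below is an illustrative implementation; the class name, server names, and weights are assumptions for the example.

```python
from collections import Counter


class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaving the picks."""

    def __init__(self, weights):
        self.weights = dict(weights)                 # server -> static weight
        self.current = {s: 0 for s in self.weights}  # server -> running score
        self.total = sum(self.weights.values())

    def pick(self):
        # Every server gains its weight, the highest score wins,
        # and the winner is pushed back by the total weight.
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


lb = SmoothWeightedRoundRobin({"big": 3, "small": 1})
picks = [lb.pick() for _ in range(8)]
print(Counter(picks))
```

Over any full cycle (here, every 4 picks) each server is chosen exactly as many times as its weight.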

Least Connections

Routes each new request to the server with the fewest active connections, which helps when request durations vary widely.
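
A minimal sketch of the bookkeeping involved, assuming the balancer is told when a connection opens and closes. The `LeastConnections` class and its `acquire`/`release` methods are names invented for this example.

```python
class LeastConnections:
    """Track active connections per server and pick the least loaded one."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by insertion order of the server list.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection to `server` closes.
        self.active[server] -= 1


lb = LeastConnections(["App-1", "App-2"])
first = lb.acquire()    # both idle, so the first server is chosen
second = lb.acquire()   # App-1 is busy, so App-2 is chosen
lb.release(second)      # App-2 finishes its request
third = lb.acquire()    # App-2 now has fewer connections than App-1
print(first, second, third)
```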

IP Hash

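Hashes the client IP to choose a server, so the same client keeps reaching the same server (session persistence) for as long as the pool does not change.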
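
A minimal sketch of the idea, assuming a plain modulo over the server list. It uses `hashlib` rather than Python's built-in `hash()`, because the built-in is randomized per process for strings and would not give stable routing across restarts. The `pick_by_ip` helper and the addresses are illustrative.

```python
import hashlib


def pick_by_ip(client_ip, servers):
    """Map a client IP to one server, deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["App-1", "App-2", "App-3"]
print(pick_by_ip("203.0.113.7", servers))
print(pick_by_ip("203.0.113.7", servers))  # same client, same server
```

Because the index depends on `len(servers)`, adding or removing a server remaps most clients.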

Consistent Hashing

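Used in distributed caches and data stores (e.g., Memcached clients using ketama, Cassandra). Keys and nodes are hashed onto the same ring, so adding or removing a node only remaps the keys adjacent to it instead of reshuffling everything. Note that Redis Cluster uses a fixed set of 16384 hash slots rather than consistent hashing.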
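
The sketch below shows one common way to build such a ring, with virtual nodes to even out the distribution. The `HashRing` class, the `vnodes` parameter, and the node names are assumptions for illustration; production libraries differ in hash function and details.

```python
import bisect
import hashlib


def ring_hash(key):
    """Hash a string to a 64-bit point on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual points per physical node
        self._ring = []        # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get(self, key):
        if not self._ring:
            raise LookupError("hash ring is empty")
        # First point at or after the key's hash, wrapping around the ring.
        index = bisect.bisect_left(self._ring, (ring_hash(key),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}

ring.add("cache-d")
moved = [k for k in keys if ring.get(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

Every key that moves lands on the new node, and keys never shuffle between the existing nodes.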

Health Checks

Load balancers continuously monitor server health:

GET /health → 200 OK (Server is healthy)
GET /health → 503 (Server is unhealthy, remove from pool)
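
A minimal sketch of an active health check, assuming each server exposes the `/health` endpoint shown above. The function names, the 2-second timeout, and the example URLs are assumptions. Real load balancers usually also require several consecutive failures or successes before changing a server's state.

```python
import urllib.error
import urllib.request


def http_probe(base_url, timeout=2.0):
    """Return True only if GET {base_url}/health answers 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers 4xx/5xx responses (HTTPError), refused connections, timeouts.
        return False


def healthy_pool(servers, probe=http_probe):
    """Keep only the servers whose probe currently succeeds."""
    return [server for server in servers if probe(server)]


# Usage with a stubbed probe, so the example runs without a network:
fake_status = {"http://app-1": True, "http://app-2": False, "http://app-3": True}
print(healthy_pool(list(fake_status), probe=fake_status.get))
```

Running `healthy_pool` on a timer and routing only to its result removes unhealthy servers from rotation and readmits them once they recover.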

Real-World Example

Client → CDN → Load Balancer → [Server Pool]
                    ↓
              ┌─────┼─────┐
              ▼     ▼     ▼
           App-1  App-2  App-3
              │     │     │
              └─────┼─────┘
                    ▼
              Database (Primary)
                    │
              Database (Replica)