infrastructure January 20, 2026 2 min read

Load Balancing

Understanding load balancing strategies, algorithms, and implementation patterns for distributed systems.

Load balancing distributes incoming network traffic across multiple servers to ensure no single server bears too much load.

Why Load Balancing?

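  • High Availability: Traffic is routed around failed servers, so no single backend becomes a point of failure
  • Scalability: Handle more traffic by adding servers to the pool
  • Performance: Work is spread across servers so none sits idle while another is overloaded
  • Flexibility: Servers can be drained and taken out for maintenance without downtime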

Types of Load Balancers

Layer 4 (Transport Layer)

Routes traffic using only IP addresses and TCP/UDP ports. This is typically the faster option because the payload is never parsed, but it has no awareness of request content.

Layer 7 (Application Layer)

Inspects HTTP headers, cookies, and URL paths, which allows more flexible routing decisions at the cost of extra processing per request.
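As a rough illustration of content-aware routing, the sketch below chooses a backend pool from the URL path, something a Layer 4 balancer cannot see. The pool names, path prefixes, and the `choose_pool` function are hypothetical and not taken from any particular product.

```python
# Hypothetical routing table: URL path prefix -> backend pool name.
ROUTES = [
    ("/api/", "api-pool"),
    ("/static/", "static-pool"),
]
DEFAULT_POOL = "web-pool"


def choose_pool(path: str) -> str:
    """Return the pool for the first matching path prefix, else the default."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

A request for `/api/users` would go to `api-pool`, while `/index.html` falls through to `web-pool`.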

Algorithms

Round Robin

Distributes requests sequentially across servers:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)
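The same cycle can be sketched in a few lines of Python. The `RoundRobinBalancer` class is a hypothetical name for illustration; real balancers also have to handle servers joining and leaving the pool.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        return next(self._servers)


balancer = RoundRobinBalancer(["Server A", "Server B", "Server C"])
picks = [balancer.next_server() for _ in range(4)]
# picks == ["Server A", "Server B", "Server C", "Server A"]
```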

Weighted Round Robin

Gives each server a weight and sends proportionally more requests to higher-weighted (typically more powerful) servers.
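One way to implement this is a "smooth" weighted round robin, which interleaves picks instead of sending a burst to the heaviest server. The sketch below is a minimal version under that assumption; the class name and the example weights are hypothetical.

```python
class SmoothWeightedBalancer:
    """Weighted round robin that interleaves picks across servers."""

    def __init__(self, weights):
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty mapping of positive numbers")
        self._weights = dict(weights)
        self._current = {server: 0 for server in weights}
        self._total = sum(weights.values())

    def next_server(self):
        # Every server earns credit equal to its weight on each pick.
        for server, weight in self._weights.items():
            self._current[server] += weight
        # The server with the most credit wins and pays back the total weight.
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


wrr = SmoothWeightedBalancer({"App-1": 3, "App-2": 1})
sequence = [wrr.next_server() for _ in range(8)]
# Over 8 picks, App-1 is chosen 6 times and App-2 twice (a 3:1 ratio).
```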

Least Connections

Routes to the server with the fewest active connections.
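A minimal sketch of the bookkeeping involved, assuming the balancer is told when each connection opens and closes. The `LeastConnectionsBalancer` class and its `acquire`/`release` methods are hypothetical names.

```python
class LeastConnectionsBalancer:
    """Tracks active connections and picks the least-loaded server."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}
        if not self._active:
            raise ValueError("at least one server is required")

    def acquire(self):
        # Ties are broken by the order in which servers were listed.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        if self._active[server] > 0:
            self._active[server] -= 1


lc = LeastConnectionsBalancer(["App-1", "App-2"])
first = lc.acquire()   # App-1 (both idle, first listed wins)
second = lc.acquire()  # App-2 (App-1 now has one connection)
lc.release(first)
third = lc.acquire()   # App-1 again (it is back to zero connections)
```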

IP Hash

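Hashes the client IP address to pick a server, so a given client keeps reaching the same server (session persistence) as long as the pool does not change.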
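A minimal sketch of the idea, assuming a fixed list of servers. The `pick_by_ip` function is a hypothetical name, and SHA-256 is used here only because it is stable across processes, unlike Python's built-in `hash()` for strings.

```python
import hashlib


def pick_by_ip(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to one server using a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


pool = ["App-1", "App-2", "App-3"]
# The same IP always lands on the same server while the pool is unchanged.
same = pick_by_ip("203.0.113.7", pool) == pick_by_ip("203.0.113.7", pool)
```

Because the result depends on `len(servers)`, adding or removing a server remaps most clients with this simple modulo scheme.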

Consistent Hashing

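Places both keys and nodes on a hash ring, so adding or removing a node remaps only the keys next to it instead of nearly all of them. Common in distributed caching (e.g., Memcached clients using ketama). Redis Cluster uses a related but different scheme based on 16384 fixed hash slots.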
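The sketch below shows one common way to build such a ring, using virtual nodes to even out the distribution. The `HashRing` class, the `vnodes` parameter, and the node names are hypothetical, and SHA-256 is an arbitrary choice of hash function.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node appears at many positions to smooth out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position at or after the key, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

With this structure, adding a node only takes over keys that now fall just before one of its positions; every other key keeps its previous owner.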

Health Checks

Load balancers continuously probe each server, typically by requesting a health endpoint at a fixed interval, and stop routing to servers that fail:

GET /health → 200 OK (Server is healthy)
GET /health → 503 (Server is unhealthy, remove from pool)
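A minimal sketch of such a probe using only the Python standard library. The `/health` path comes from the example above; the function names, the timeout value, and the idea of passing base URLs as strings are assumptions for illustration.

```python
import urllib.error
import urllib.request


def http_probe(base_url: str, timeout: float = 2.0) -> bool:
    """Return True only if GET <base_url>/health answers 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers 503 and other HTTP errors, refused connections, and timeouts.
        return False


def healthy_pool(servers, probe=http_probe):
    """Keep only the servers whose probe currently succeeds."""
    return [server for server in servers if probe(server)]
```

In practice a balancer would run this on a timer and usually require several consecutive failures before removing a server, so that one slow response does not eject a healthy node.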

Real-World Example

Client → CDN → Load Balancer → [Server Pool]
                    │
              ┌─────┼─────┐
              ▼     ▼     ▼
           App-1  App-2  App-3
              │     │     │
              └─────┼─────┘
                    ▼
              Database (Primary)
                    ▼
              Database (Replica)

Solution   Type             Use Case
Nginx      Software L7      Web applications
HAProxy    Software L4/L7   High performance
AWS ALB    Cloud L7         AWS deployments
AWS NLB    Cloud L4         Ultra-low latency

Design Tip: Always design for failure. Use multiple load balancers in active-passive or active-active configurations.
