Load Balancing
Learn about load balancing and how to implement it effectively.
Last updated: 12/9/2025
A robust load-balancing configuration is critical for high availability, scaling, and zero-downtime deployments of InnoSynth-Forjinn. This guide covers best practices, supported setups, recommended patterns, and troubleshooting for managing user and agent/API traffic.
Why Load Balancing?
- Distributes incoming requests evenly across multiple app/server instances
- Increases concurrency for heavy agent/AI workloads
- Enables rolling upgrades and blue/green deployment (no downtime)
- Provides fault tolerance: if one instance fails, traffic automatically reroutes
Options
L4/L7 Proxy
- Use at either TCP (L4) or HTTP (L7) layers
- Popular choices:
- Nginx, HAProxy, Traefik (Docker, cloud, Nix)
- AWS ALB/ELB, Google Cloud Load Balancer, Azure LB
- K8s Ingress (default, Nginx/GKE/Traefik/Kong plugins)
Round Robin, Sticky Sessions, or Weighted
- Round Robin: Default, best for stateless flows and APIs
- Sticky Sessions: use when session/cookie state must stay on one instance (e.g. long-running conversations)
- Weighted: send less traffic to slower nodes, configured manually or driven by health/performance metrics
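The three strategies above can be sketched as Nginx upstream blocks (the upstream and server names are illustrative, not part of any shipped config):

```nginx
# Round robin (default): requests rotate evenly across servers
upstream forjinn_rr {
    server app1:3000;
    server app2:3000;
}

# Sticky sessions via client-IP hashing: the same client
# is consistently routed to the same instance
upstream forjinn_sticky {
    ip_hash;
    server app1:3000;
    server app2:3000;
}

# Weighted: app1 receives roughly twice the traffic of app2
upstream forjinn_weighted {
    server app1:3000 weight=2;
    server app2:3000 weight=1;
}
```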
Example: Nginx Config
upstream forjinn_cluster {
    server app1:3000;
    server app2:3000;
    server app3:3000;
}

server {
    listen 80;
    server_name forjinn.example.com;

    location / {
        proxy_pass http://forjinn_cluster;
        proxy_set_header Host $host;
    }

    # SSL/TLS config, error handling, etc...
}
Kubernetes
- Use Service type: LoadBalancer for cloud-native zero-config scaling
- For advanced use, deploy a custom Ingress controller with traffic shaping/rewriting
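A minimal sketch of the `Service` approach is below; the metadata name, selector labels, and ports are assumptions and must match your own deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: forjinn-lb          # hypothetical name
spec:
  type: LoadBalancer        # cloud provider provisions an external LB
  selector:
    app: forjinn            # must match your pod labels
  ports:
    - port: 80              # external port exposed by the LB
      targetPort: 3000      # container port the app listens on
```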
Health Checking
- All load balancers support periodic health checks (e.g. /health or /status) and remove dead pods/nodes automatically
- Customize the health-check endpoint in app config or docker-compose/Kubernetes YAML
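On Kubernetes, a readiness probe is the usual way to wire the health-check endpoint into the load balancer. A sketch (container name, port, and timings are assumptions to tune for your workload):

```yaml
# Pod spec fragment: Kubernetes stops routing Service traffic
# to any pod whose readiness probe fails
containers:
  - name: forjinn           # hypothetical container name
    ports:
      - containerPort: 3000
    readinessProbe:
      httpGet:
        path: /health       # health-check endpoint exposed by the app
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 3
```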
Scaling Strategy
- Horizontal Scaling: Add more app/worker containers for bursty or batch load
- Separate Web/API from Workers: Optimize performance/work separation
Security
- Limit allowed origins and set up DDoS/WAF protection (cloud provider or 3rd party)
- Terminate SSL/TLS at the load balancer and use plain HTTP internally for lower latency (only on a trusted internal network)
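Terminating TLS at the load balancer might look like the following Nginx sketch; the certificate paths are placeholders, and the `forjinn_cluster` upstream is assumed to be defined as in the earlier example:

```nginx
server {
    listen 443 ssl;
    server_name forjinn.example.com;

    # certificate paths are placeholders
    ssl_certificate     /etc/ssl/certs/forjinn.crt;
    ssl_certificate_key /etc/ssl/private/forjinn.key;

    location / {
        # TLS ends here; plain HTTP to the trusted internal network
        proxy_pass http://forjinn_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```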
Troubleshooting
- Uneven load? Check weights and instance health.
- Sessions lost? Ensure LB/sticky cookie config, or prefer stateless JWT/token authentication.
- Timeouts? Increase allowed timeout, especially for large agent/eval jobs.
- Logs missing? Ship logs from every instance to a central collector rather than relying on any single node's local logs.
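For the timeout case, raising the proxy timeouts in Nginx might look like this (the values are examples to tune for your longest agent/eval jobs; `forjinn_cluster` is the upstream from the earlier example):

```nginx
location / {
    proxy_pass http://forjinn_cluster;
    # Raise read/send timeouts for long-running agent/eval jobs
    proxy_connect_timeout 10s;
    proxy_send_timeout    300s;
    proxy_read_timeout    300s;
}
```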
Smart load balancing ensures a smooth user experience, reliability, and resilience. Always integrate it with health checks, centralized logging, and scaling automation.