Load Balancing

Learn about load balancing and how to implement it effectively.

2 min read
Last updated: 12/9/2025

Load Balancing Guide

A robust load balancing configuration is critical for high availability, scaling, and zero-downtime deployments of InnoSynth-Forjinn. This guide covers best practices, supported setups, recommended patterns, and troubleshooting for managing user and agent/API traffic.


Why Load Balancing?

  • Distributes incoming requests evenly across multiple app/server instances
  • Increases concurrency for heavy agent/AI workloads
  • Enables rolling upgrades and blue/green deployment (no downtime)
  • Provides fault tolerance: if one instance fails, traffic automatically reroutes

Options

L4/L7 Proxy

  • Use at either TCP (L4) or HTTP (L7) layers
  • Popular choices:
    • Nginx, HAProxy, Traefik (Docker, cloud, Nix)
    • AWS ALB/ELB, Google Cloud Load Balancer, Azure LB
    • Kubernetes Ingress (Nginx controller by default, or GKE/Traefik/Kong)

Round Robin, Sticky Sessions, or Weighted

  • Round Robin: The default; best for stateless flows and APIs
  • Sticky Sessions: Use when sessions/cookies must persist across requests, e.g. for long or full conversations
  • Weighted: Route less traffic to slower nodes, set manually or driven by health/performance metrics
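
As a sketch, the three strategies map onto Nginx upstream directives as follows (the server names and ports are placeholders):

```nginx
# Round robin is Nginx's default when no balancing directive is given.
upstream forjinn_rr {
    server app1:3000;
    server app2:3000;
}

# Sticky sessions: ip_hash pins each client IP to the same backend.
upstream forjinn_sticky {
    ip_hash;
    server app1:3000;
    server app2:3000;
}

# Weighted: app1 receives roughly twice the traffic of app2.
upstream forjinn_weighted {
    server app1:3000 weight=2;
    server app2:3000 weight=1;
}
```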

Example: Nginx Config

upstream forjinn_cluster {
    server app1:3000;
    server app2:3000;
    server app3:3000;
}
server {
    listen 80;
    server_name forjinn.example.com;
    location / {
        proxy_pass http://forjinn_cluster;
        # Preserve the original host and client IP for the app
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    # SSL/TLS config, error handling, etc...
}

Kubernetes

  • Use a Service of type: LoadBalancer for cloud-native, zero-config scaling
  • For advanced use, deploy a custom Ingress controller with traffic shaping/rewriting
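
A minimal Service manifest of type LoadBalancer might look like the following; the name, labels, and ports are assumptions for illustration:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: forjinn-lb
spec:
  type: LoadBalancer        # cloud provider provisions an external LB
  selector:
    app: forjinn            # must match your Deployment's pod labels
  ports:
    - port: 80              # external port on the load balancer
      targetPort: 3000      # container port the app listens on
```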

Health Checking

  • All major load balancers support periodic health checks (e.g. /health or /status) and remove dead pods/nodes automatically
  • Customize the health-check endpoint in the app config or in docker-compose/Kubernetes YAML
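
In Kubernetes, a readiness probe in the container spec keeps unhealthy pods out of the load balancer's rotation. The /health path and timings below are placeholders; match them to your app config:

```yaml
# Fragment of a Deployment's container spec
readinessProbe:
  httpGet:
    path: /health           # endpoint the app exposes for checks
    port: 3000
  initialDelaySeconds: 5    # wait before the first probe
  periodSeconds: 10         # check every 10 seconds
  failureThreshold: 3       # remove from rotation after 3 failures
```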

Scaling Strategy

  • Horizontal Scaling: Add more app/worker containers for bursty or batch load
  • Separate Web/API from Workers: isolate latency-sensitive request handling from heavy background work
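
Horizontal scaling can be automated in Kubernetes with a HorizontalPodAutoscaler. This is a sketch, assuming a Deployment named forjinn and CPU-based scaling; the replica counts and threshold are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: forjinn-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: forjinn
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods above 70% average CPU
```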

Security

  • Limit allowed origins and set up DDoS/WAF protection (via your cloud provider or a third party)
  • Terminate SSL/TLS at the load balancer, then use plain HTTP internally to reduce per-request overhead
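
TLS termination at the Nginx layer might look like the sketch below; certificate paths and server names are placeholders, and traffic to the upstream stays plain HTTP:

```nginx
server {
    listen 443 ssl;
    server_name forjinn.example.com;
    ssl_certificate     /etc/ssl/forjinn.crt;
    ssl_certificate_key /etc/ssl/forjinn.key;
    location / {
        proxy_pass http://forjinn_cluster;       # internal HTTP only
        proxy_set_header X-Forwarded-Proto https; # tell the app the original scheme
    }
}
```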

Troubleshooting

  • Uneven load? Check weights and instance health.
  • Sessions lost? Verify the LB's sticky-cookie configuration, or prefer stateless JWT/token authentication.
  • Timeouts? Increase the allowed timeout, especially for large agent/eval jobs.
  • Logs missing? Ship logs from every instance to a central collector.
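
For the timeout case, the relevant Nginx directives can be raised per location; the 300s value below is illustrative, not a recommendation:

```nginx
location /api/ {
    proxy_pass http://forjinn_cluster;
    proxy_connect_timeout 10s;   # time to establish a backend connection
    proxy_read_timeout 300s;     # allow long-running agent/eval responses
    proxy_send_timeout 300s;     # allow slow request uploads
}
```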

Smart load balancing ensures a smooth user experience, reliability, and resilience; always integrate it with health checks, logs, and scaling automation.