Cache Systems
Learn about cache systems and how to implement them effectively.
Last updated: 12/9/2025
Cache Systems: High-Performance for Workflows & AI
Caching in InnoSynth-Forjinn boosts performance, reliability, and cost efficiency for workflows, retrieval, agent memory, and repeated tool/model calls. This page covers available cache systems, usage scenarios, configuration, and best practices.
Why Use Cache?
- Speed: Reduce latency for repeated queries, document retrieval, or LLM completions
- Cost: Reduce API calls to providers such as OpenAI and vector databases by reusing computed results
- Scale: More requests per second with less load on storage and upstream services
- Resilience: Serve cached results during transient LLM, database, or network outages
Supported Cache Types
In-Memory Cache
- Fastest option; lives entirely in the application process or worker
- Lost on restart; good for short-lived or dev/test flows
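The in-memory option above can be sketched as a simple TTL dictionary. This is an illustrative stand-in, not the platform's actual `InMemoryCache` implementation; the class and parameter names are assumptions:

```python
import time

class InMemoryCache:
    """Minimal in-process TTL cache sketch. Entries live only in this
    process and vanish on restart, which is why it suits dev/test flows."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazy eviction of expired entries
            return default
        return value

cache = InMemoryCache(ttl_seconds=1)
cache.set("greeting", "hello")
print(cache.get("greeting"))  # fresh entry -> "hello"; after the TTL -> None
```

Expired entries here are evicted lazily on read; a production cache would also sweep them in the background to bound memory.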
Redis Cache
- Clusterable and durable; supports persistence and multi-node setups
- Works for both token/session caches and retriever caches
Momento/Upstash Redis
- Managed, cloud-based Redis-compatible cache
- Ideal for cloud deployments and multi-region scale
GoogleGenerativeAIContextCache
- Specialized cache for GenAI sessions and context; persists embeddings, search results, and session memory for context-window reuse
Using Cache Nodes
- Drag a cache node (e.g. RedisCache or InMemoryCache) into your retrieval/agent flow
- Enable the "Cache enabled" option on the Document Loader, Retriever, or agent node
- Configure the cache TTL, key format, and eviction/refresh logic as needed
- Optionally link to an external Redis/Momento instance via Credentials
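The TTL and key-format settings above can be approximated in plain Python with a caching decorator around an expensive call (an LLM completion or tool invocation). The decorator and function names are hypothetical, and the in-process dict stands in for whatever backend (in-memory, Redis) the flow is configured with:

```python
import functools
import hashlib
import json
import time

def ttl_cached(ttl_seconds=300):
    """Hypothetical decorator: caches a call's result under a hashed key
    built from the function name and arguments, for ttl_seconds."""
    store = {}  # key -> (value, expiry timestamp)

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Stable hashed key: consistent formatting avoids spurious
            # misses, and hashing keeps raw prompt text out of key names.
            raw = json.dumps([fn.__name__, args, kwargs],
                             sort_keys=True, default=str)
            key = hashlib.sha256(raw.encode()).hexdigest()
            hit = store.get(key)
            if hit is not None and time.monotonic() < hit[1]:
                return hit[0]  # cache hit: skip the expensive call
            value = fn(*args, **kwargs)
            store[key] = (value, time.monotonic() + ttl_seconds)
            return value
        return wrapper
    return decorator

calls = {"n": 0}

@ttl_cached(ttl_seconds=300)
def fake_completion(prompt):
    calls["n"] += 1  # stands in for a paid LLM/tool call
    return f"answer to: {prompt}"

fake_completion("What is caching?")
fake_completion("What is caching?")
print(calls["n"])  # 1 -- the second call was served from cache
```

The same key-hashing idea applies when the backend is Redis: only the key derivation changes, not the flow logic.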
Best Practices
- Use Redis/Momento for production, InMemory only for dev/test
- Set suitable TTLs (time-to-live) for cache keys to balance speed and freshness
- For retrieval: cache query results; for memory/tool calls: cache LLM/tool completions as appropriate
- Monitor cache hit/miss ratio with platform dashboard or Redis CLI
- Always keep secrets and API keys out of the cache, or hash cache keys so sensitive values never appear in key names
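For the monitoring point above: with Redis, the hit/miss counters come back from `INFO stats` (`keyspace_hits` / `keyspace_misses`, the field names Redis itself reports). A small helper can turn them into a ratio; the sample dict below is illustrative, standing in for a live `redis_client.info("stats")` call:

```python
def hit_ratio(info: dict) -> float:
    """Compute the cache hit ratio from Redis INFO 'stats' counters.
    Returns 0.0 when no lookups have been recorded yet."""
    hits = info.get("keyspace_hits", 0)
    misses = info.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else 0.0

# In production: info = redis_client.info("stats")
sample = {"keyspace_hits": 920, "keyspace_misses": 80}
print(hit_ratio(sample))  # 0.92
```

A persistently low ratio usually points at inconsistent key formatting or a TTL shorter than the query repeat interval.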
Troubleshooting
- Stale data: Lower TTL or implement cache invalidation policy.
- Cache misses: Confirm key formatting is consistent; try warming with periodic “ping” queries.
- Storage full/OOM: For Redis, monitor maxmemory and set eviction policy; scale Momento/Upstash tier as needed.
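The eviction behavior mentioned above can be illustrated in miniature. Redis's `allkeys-lru` policy drops the least-recently-used key when `maxmemory` is reached; the sketch below mimics that with an entry count instead of bytes, using Python's `OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of allkeys-lru style eviction; capacity counts entries,
    whereas Redis enforces an actual byte budget (maxmemory)."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # touch "a", so "b" is now least recently used
cache.set("c", 3)      # over capacity -> evicts "b"
print(cache.get("b"))  # None
```

Without an eviction policy configured, Redis instead rejects writes once `maxmemory` is hit, which typically surfaces as OOM errors in the flow.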
Used well, caching is the secret weapon for scalable, low-latency AI/agent flows and cost-effective platform operations.