Cache Systems
Learn about cache systems and how to implement them effectively.
Last updated: 12/9/2025
Cache Systems: High-Performance for Workflows & AI
Caching in InnoSynth-Forjinn boosts performance, reliability, and cost efficiency for workflows, retrieval, agent memory, and repeated tool/model calls. This page covers available cache systems, usage scenarios, configuration, and best practices.
Why Use Cache?
- Speed: Reduce latency for repeated queries, document retrieval, or LLM completions
- Cost: Reduce API calls to providers such as OpenAI and vector databases by reusing computed results
- Scale: More requests per second with less load on storage and upstream services
- Resilience: Serve cached results during transient LLM, database, or network outages
Supported Cache Types
In-Memory Cache
- Fastest option; lives entirely in the application process or worker
- Lost on restart; good for short-lived or dev/test flows
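The in-memory option above can be sketched as a simple TTL dictionary. This is an illustrative stand-in, not the platform's actual `InMemoryCache` implementation; the class and parameter names are assumptions:

```python
import time

class InMemoryCache:
    """Minimal in-process TTL cache sketch. Entries live only in this
    process and vanish on restart, which is why it suits dev/test flows."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazy eviction of expired entries
            return default
        return value

cache = InMemoryCache(ttl_seconds=1)
cache.set("greeting", "hello")
print(cache.get("greeting"))  # fresh entry -> "hello"; after the TTL -> None
```

Expired entries here are evicted lazily on read; a production cache would also sweep them in the background to bound memory.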
Redis Cache
- Clusterable and durable; supports persistence and multi-node setups
- Works for both token/session caches and retriever caches
Momento/Upstash Redis
- Managed, cloud-based Redis-compatible cache
- Ideal for cloud deployments and multi-region scale
GoogleGenerativeAIContextCache
- Specialized cache for GenAI sessions and context; persists embeddings, search results, and session memory for context-window reuse
Using Cache Nodes
- Drag a cache node (e.g. RedisCache or InMemoryCache) into your retrieval/agent flow
- Enable the "Cache enabled" option on the Document Loader, Retriever, or agent node
- Configure the cache TTL, key format, and eviction/refresh logic as needed
- Optionally link to an external Redis/Momento instance via Credentials
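The TTL and key-format settings above can be approximated in plain Python with a caching decorator around an expensive call (an LLM completion or tool invocation). The decorator and function names are hypothetical, and the in-process dict stands in for whatever backend (in-memory, Redis) the flow is configured with:

```python
import functools
import hashlib
import json
import time

def ttl_cached(ttl_seconds=300):
    """Hypothetical decorator: caches a call's result under a hashed key
    built from the function name and arguments, for ttl_seconds."""
    store = {}  # key -> (value, expiry timestamp)

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Stable hashed key: consistent formatting avoids spurious
            # misses, and hashing keeps raw prompt text out of key names.
            raw = json.dumps([fn.__name__, args, kwargs],
                             sort_keys=True, default=str)
            key = hashlib.sha256(raw.encode()).hexdigest()
            hit = store.get(key)
            if hit is not None and time.monotonic() < hit[1]:
                return hit[0]  # cache hit: skip the expensive call
            value = fn(*args, **kwargs)
            store[key] = (value, time.monotonic() + ttl_seconds)
            return value
        return wrapper
    return decorator

calls = {"n": 0}

@ttl_cached(ttl_seconds=300)
def fake_completion(prompt):
    calls["n"] += 1  # stands in for a paid LLM/tool call
    return f"answer to: {prompt}"

fake_completion("What is caching?")
fake_completion("What is caching?")
print(calls["n"])  # 1 -- the second call was served from cache
```

The same key-hashing idea applies when the backend is Redis: only the key derivation changes, not the flow logic.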
Best Practices
- Use Redis/Momento for production, InMemory only for dev/test
- Set suitable TTLs (time-to-live) for cache keys to balance speed and freshness
- For retrieval: cache query results; for memory/tool calls: cache LLM/tool completions as appropriate
- Monitor cache hit/miss ratio with platform dashboard or Redis CLI
- Always keep secrets and API keys out of the cache, or hash cache keys so sensitive values never appear in key names
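For the monitoring point above: with Redis, the hit/miss counters come back from `INFO stats` (`keyspace_hits` / `keyspace_misses`, the field names Redis itself reports). A small helper can turn them into a ratio; the sample dict below is illustrative, standing in for a live `redis_client.info("stats")` call:

```python
def hit_ratio(info: dict) -> float:
    """Compute the cache hit ratio from Redis INFO 'stats' counters.
    Returns 0.0 when no lookups have been recorded yet."""
    hits = info.get("keyspace_hits", 0)
    misses = info.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else 0.0

# In production: info = redis_client.info("stats")
sample = {"keyspace_hits": 920, "keyspace_misses": 80}
print(hit_ratio(sample))  # 0.92
```

A persistently low ratio usually points at inconsistent key formatting or a TTL shorter than the query repeat interval.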
Troubleshooting
- Stale data: Lower TTL or implement cache invalidation policy.
- Cache misses: Confirm key formatting is consistent; try warming with periodic “ping” queries.
- Storage full/OOM: For Redis, monitor maxmemory and set eviction policy; scale Momento/Upstash tier as needed.
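The eviction behavior mentioned above can be illustrated in miniature. Redis's `allkeys-lru` policy drops the least-recently-used key when `maxmemory` is reached; the sketch below mimics that with an entry count instead of bytes, using Python's `OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of allkeys-lru style eviction; capacity counts entries,
    whereas Redis enforces an actual byte budget (maxmemory)."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # touch "a", so "b" is now least recently used
cache.set("c", 3)      # over capacity -> evicts "b"
print(cache.get("b"))  # None
```

Without an eviction policy configured, Redis instead rejects writes once `maxmemory` is hit, which typically surfaces as OOM errors in the flow.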
Used well, caching is the secret weapon for scalable, low-latency AI/agent flows and cost-effective platform operations.