Caching is a first-class architectural concern in agentic systems. This talk breaks down how Java applications can layer internal, distributed, and semantic caches. We'll explore in-process caching with Caffeine for ultra-low-latency access, distributed caching with Redisson and Valkey for a cache shared across instances, and semantic caching with vector similarity search to reduce latency and cost while scaling LLM access.
Type: Learning Session (50 min)
Track: Application Performance, Manageability, and Tooling
Audience Level: Beginner
Speaker: Dmitry Polyakovsky
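As a rough illustration of the layered lookup the abstract describes, the sketch below shows a two-tier read-through cache in plain Java. All names here are hypothetical, and `ConcurrentHashMap` stands in for both Caffeine (L1, in-process) and Redisson/Valkey (L2, distributed) so the sketch stays dependency-free; the semantic/vector-similarity tier from the talk is not modeled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical two-tier read-through cache: check local memory first,
// then the shared tier, and only then pay for the expensive origin call
// (e.g. an LLM request). Maps are stand-ins for Caffeine and Valkey.
public class TieredCache {
    private final Map<String, String> l1 = new ConcurrentHashMap<>(); // in-process tier
    private final Map<String, String> l2 = new ConcurrentHashMap<>(); // shared-tier stand-in
    private final Function<String, String> origin;                    // slow, costly source

    public TieredCache(Function<String, String> origin) {
        this.origin = origin;
    }

    public String get(String key) {
        String value = l1.get(key);        // fastest path: local memory
        if (value != null) {
            return value;
        }
        value = l2.get(key);               // shared across application instances
        if (value == null) {
            value = origin.apply(key);     // cache miss on both tiers: call origin
            l2.put(key, value);            // populate the shared tier
        }
        l1.put(key, value);                // promote to the local tier
        return value;
    }
}
```

A real deployment would replace the L1 map with a bounded Caffeine cache (size limits, expiry) and the L2 map with a Redisson client backed by Valkey, keeping the same lookup order.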