Enterprises adopt open models to control cost, latency, and IP. This session shows how Java teams select, integrate, and operate LLMs using platforms and tools like LangChain4j and vector search, running both locally and in the cloud. It covers benchmarking, model size vs. throughput trade-offs, memory footprints on the JVM, response-time tuning, and safety layers. It highlights where the latest GenAI Java projects complement inference pipelines and how to evaluate RAG quality with reproducible metrics. Attendees see end-to-end flows, from data grounding to deployment, with attention to observability, configuration, and rollback strategies.
Type: Learning Session (50 min)
Track: Machine Learning and Artificial Intelligence
Audience Level: Intermediate
Speaker: Brian Benz