Date: 2025-01-01

Deploying large language models (LLMs) that work reliably in real-world applications requires robust evaluation. This talk dives into hands-on techniques for crafting effective evals to measure and improve your LLM's performance, and spotlights common developer mistakes and how to avoid them.

Beyond evals, we share battle-tested insights from integrating Gemini models into production applications used by hundreds of millions. Expect practical takeaways on tackling challenges and implementing best practices, plus actionable strategies for building LLM-powered applications you can rely on.

If your team uses LLMs to solve real problems and wants to move beyond academic benchmarks to real-world impact, this talk is for you.