Jul 18, 2024

10:30 pm - 11:30 pm IST

vLLM: Easy, Fast, and Cheap LLM Serving For Everyone

Woosuk Kwon

Creator of vLLM

Kaichao You

vLLM contributor

In this webinar, we introduce vLLM, a high-performance, open-source inference engine designed to make LLM serving easy, fast, and affordable for everyone.

Key Highlights:

  • An overview of vLLM and its key benefits

  • A deep dive into features such as pipeline parallelism and speculative decoding

  • A live demonstration of setting up and optimizing LLM inference (see the sketch below)
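
For attendees who want to try vLLM ahead of the session, here is a minimal sketch of offline inference with vLLM's Python API. It assumes vLLM is installed (`pip install vllm`) and uses `facebook/opt-125m` purely as a small placeholder model:

```python
from vllm import LLM, SamplingParams

# Load a model; vLLM fetches it from the Hugging Face Hub on first use.
# facebook/opt-125m is a lightweight placeholder; swap in any supported model.
llm = LLM(model="facebook/opt-125m")

# Sampling settings for generation.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = [
    "The capital of France is",
    "LLM serving is fast when",
]

# vLLM batches the prompts internally and returns one output per prompt.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Generated: {output.outputs[0].text!r}")
```

Features covered in the deep dive, such as pipeline parallelism and speculative decoding, are configured through engine arguments (for example, `--pipeline-parallel-size` when launching the OpenAI-compatible server); the exact flags depend on your vLLM version, so check the documentation for your release.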