Jul 18, 2024
10:30 pm - 11:30 pm IST
vLLM: Easy, Fast, and Cheap LLM Serving for Everyone
Woosuk Kwon
Creator of vLLM
Kaichao You
vLLM Contributor
In this webinar, we introduce vLLM, a high-throughput, open-source inference engine designed to make LLM serving easy, fast, and affordable for everyone.
Key Highlights:
Overview of vLLM and its key benefits
A deep dive into features like pipeline parallelism and speculative decoding
Live demonstration of setting up and optimizing LLM inference with vLLM (see the sketch after this list)
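To give a feel for what the live demo covers, here is a minimal offline-inference sketch using vLLM's Python API. The model name is only a placeholder, and the pipeline-parallelism and speculative-decoding options mentioned in the comments vary across vLLM releases, so treat those argument names as assumptions to verify against the documentation for your version.

    from vllm import LLM, SamplingParams

    # Load a model for offline batch inference. The model name is just an
    # example; any Hugging Face model supported by vLLM works here.
    llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")

    # The webinar's deep-dive features are enabled via constructor arguments,
    # e.g. pipeline_parallel_size=2 for pipeline parallelism, or
    # speculative_model="..." with num_speculative_tokens=5 for speculative
    # decoding (argument names are version-dependent; check the vLLM docs).

    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

    outputs = llm.generate(["What makes vLLM fast?"], params)
    for output in outputs:
        print(output.outputs[0].text)

The same engine can also be exposed as an OpenAI-compatible HTTP server with python -m vllm.entrypoints.openai.api_server --model <model>, which is the usual starting point for serving rather than offline batch runs.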