An AI stack: from scaling AI workloads to evaluating LLMs

Hilary Term 2026 Strachey Lecture with Professor Ion Stoica, An AI stack: from scaling AI workloads to evaluating LLMs

Large language models (LLMs) have taken the world by storm, enabling new applications, intensifying GPU shortages, and raising concerns about the accuracy of their outputs. In this talk, I will present several projects I have worked on to address these challenges. Specifically, I will focus on Ray, a distributed framework for scaling AI workloads; vLLM and SGLang, two high-throughput inference engines for LLMs; and LMArena, a platform for accurate LLM benchmarking. I will conclude with key lessons learned and outline directions for future research.

Date Added: 26/02/2026

Duration: 00:55:58
