Summit

Liam Torres

Staff ML Engineer, OpenAI

ML Systems · GPU Programming · Inference Optimization

About

Liam works on inference optimization at OpenAI. He's one of the core contributors to the Triton GPU programming language and has published research on speculative decoding and mixture-of-experts architectures.

Session

Day 2 — Sept 19 · 14:00 — 14:45

Inside speculative decoding

Track B