We're sunsetting PodQuest on 2025-07-28. Thank you for your support!
Export Podcast Subscriptions
cover of episode Deep Dive into Inference Optimization for LLMs with Philip Kiely

Deep Dive into Inference Optimization for LLMs with Philip Kiely

2024/11/5
logo of podcast Software Huddle

Software Huddle

Shownotes Transcript

Today we have Philip Kiely from Baseten on the show. Baseten is a Series B startup focused on providing infrastructure for AI workloads.

We go deep on Inference Optimization. We cover choosing a model, discuss the hype around Compound AI, choosing an Inference Engine, Optimization Techniques like Quantization and Speculative Decoding all the way down to your GPU choice.