LLM Distillation and Compression // Guanhua Wang // #278
49:47
2024/12/17
MLOps.community
Chapters
What's Guanhua's Preferred Coffee?
Key Takeaways from the Episode
Please Like, Share, and Subscribe to Our MLOps Channels!
What is the Phi Model?
Challenges in Optimizing Small Language Models
Overview and Benefits of DeepSpeed
What Are Some Crazy Unimplemented AI Ideas?
Post-Training Quantization vs. Quantization-Aware Training
Why Choose Quantization Over Distillation?
Using LoRAs for LLM Optimization
Finding the LLM Scaling Sweet Spot
Exploring Quantization Techniques
Introduction to Domino: The Communication-Free LLM Training Engine
Training Performance Benchmarks of Domino
Strategies for Breaking Data Dependency in LLM Training
Wrap Up and Final Thoughts