
172: Transformers and Large Language Models

2024/3/11

Programming Throwdown


Intro topic: Is WFH actually WFC?

News/Links:

Book of the Show

Patreon Plug: https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show

Topic: Transformers and Large Language Models

  • How neural networks store information

  • Latent variables

  • Transformers

  • Encoders & Decoders

  • Attention Layers

  • History

  • RNN

  • Vanishing Gradient Problem

  • LSTM

  • Short term (gradient explodes), Long term (gradient vanishes)

  • Differentiable algebra

  • Key-Query-Value

  • Self Attention

  • Self-Supervised Learning & Forward Models

  • Human Feedback

  • Reinforcement Learning from Human Feedback

  • Direct Preference Optimization (Pairwise Ranking)
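The RNN, vanishing-gradient, and LSTM bullets above can be illustrated with a toy calculation (a minimal sketch, not from the episode): backpropagating through a vanilla RNN multiplies one gradient factor per time step, and with sigmoid activations (derivative at most 0.25) that product shrinks toward zero over long sequences, while a large recurrent weight makes it blow up instead.

```python
import math

def sigmoid_grad(x):
    """Derivative of the sigmoid; peaks at 0.25 when x == 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def backprop_factor(steps, weight, pre_activation=0.0):
    """Product of per-step gradient factors over `steps` time steps."""
    g = 1.0
    for _ in range(steps):
        g *= weight * sigmoid_grad(pre_activation)
    return g

# Vanishing: even the best-case sigmoid slope (0.25) with weight 1.0
print(backprop_factor(50, weight=1.0))   # ~8e-31: the gradient vanishes
# Exploding: a recurrent weight of 5.0 overwhelms the squashing
print(backprop_factor(50, weight=5.0))   # ~7e+04: the gradient explodes
```

This is exactly the short-term/long-term tension the notes mention, and the gating in an LSTM is designed to keep that product close to 1 across many steps.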
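The Key-Query-Value and Self Attention bullets can likewise be sketched in a few lines of NumPy (an illustrative single-head example, not code from the show): each token is projected into a query, key, and value; query-key dot products are softmaxed into attention weights; and each output is a weighted mixture of the values.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over token matrix X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                        # value-weighted mixture per token

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                   # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # (4, 8): one contextualized vector per token
```

Because every operation here is differentiable algebra (matrix products, exponentials, sums), the whole layer can be trained end to end by backpropagation, which is the point of the "Differentiable algebra" bullet above.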

**★ Support this podcast on Patreon ★**