FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI - Transcript and Chapters - from Podcast Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0