“Sparsely-connected cross-layer transcoders: preliminary findings” by jacob_drori
33:16
2025/6/19
LessWrong (30+ Karma)
Chapters
Introduction to Sparsely-connected Cross-layer Transcoders
Understanding the Architecture: Vanilla vs. Sparsely-connected Modes
Virtual Weights: A Simplified Explanation
Masking: How It Works and Its Importance
Recap of the Architecture and Methods
Training the Sparsely-connected Transcoders
Quantitative and Qualitative Results: What Did We Find?
Observations and Dashboards: Insights from the Data
How I Updated on These Results
Limitations: Dead Latents and High Excess FVU
Memory and Feature Splitting: Additional Challenges
Conclusion and Acknowledgements
Appendix A: Prior Work
Appendix B: Virtual Weights for Downstream Attention