
“Logits, log-odds, and loss for parallel circuits” by Dmitry Vaintrob

2025/1/20

LessWrong (30+ Karma)


Audio note: this article contains 51 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.

Today I’m going to discuss how to think about logits like a statistician, and what this implies about circuits. This post doesn’t have any prerequisites other than perhaps a very basic statistical background that can be adequately recovered from the AI-generated “glossary” to the right. I think the material here is a good thing to know in general (thinking through this helped clarify my thinking about a lot of things), and it will be useful background for a future post I’m planning on “SLT in a nutshell”. If you want a “TL;DR” takeaway of the discussion that follows, the gist is that neural networks use logit addition to integrate (roughly) independent “parallel” information from various sources; and that thinking [...]
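To make the logit-addition claim concrete, here is a minimal Python sketch (not from the post; the feature probabilities are hypothetical numbers chosen for illustration). It shows that when two evidence sources are conditionally independent given the label and the prior is uniform, simply summing their individual log-odds reproduces the exact Bayesian posterior — the sense in which "parallel" prediction circuits can just add their logits.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical setup (not from the post): two binary features x1, x2 that are
# conditionally independent given a binary label y, with a uniform prior on y.
# P(x_i = 1 | y) for y = 0 and y = 1:
p_x1 = {0: 0.2, 1: 0.7}
p_x2 = {0: 0.4, 1: 0.9}

x1, x2 = 1, 1  # observed evidence

# Per-source posterior log-odds (uniform prior, so prior log-odds = 0).
logit1 = np.log(p_x1[1] / p_x1[0]) if x1 == 1 else np.log((1 - p_x1[1]) / (1 - p_x1[0]))
logit2 = np.log(p_x2[1] / p_x2[0]) if x2 == 1 else np.log((1 - p_x2[1]) / (1 - p_x2[0]))

# "Parallel circuits" picture: the combined logit is just the sum.
combined_logit = logit1 + logit2

# Direct Bayes computation over the joint likelihood, for comparison.
lik = lambda y: (p_x1[y] if x1 == 1 else 1 - p_x1[y]) * (p_x2[y] if x2 == 1 else 1 - p_x2[y])
posterior = lik(1) / (lik(1) + lik(0))

print(sigmoid(combined_logit), posterior)  # both ≈ 0.887
```

If the evidence sources are only roughly independent, adding logits is an approximation rather than exact Bayes, which is the regime the post discusses for neural-network circuits.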


Outline:

(01:14) Basics of logits and logistic tasks

(08:17) Parallel prediction circuits

(08:21) Log odds and independent predictions

(11:02) Independence and circuits

(14:37) Appreciating the wisdom of the elders

(15:31) Interpretability insights

The original text contained 5 footnotes which were omitted from this narration.


First published: January 20th, 2025

Source: https://www.lesswrong.com/posts/xFA2kstHifF9F2Fnm/logits-log-odds-and-loss-for-parallel-circuits

---
Narrated by TYPE III AUDIO.