I'm behind on a couple of posts I've been planning, but am trying to post something every day if possible. So today I'll post a cached fun piece on overinterpreting a random data phenomenon that's tricked me before. Recall that a random walk or a "drunkard's walk" (as in the title) is a sequence of vectors <span>x_1, x_2, \ldots</span> in some <span>\mathbb{R}^n</span> such that each <span>x_k</span> is obtained from <span>x_{k-1}</span> by adding noise. Here is a picture of a 1D random walk as a function of time: [Image: a 1D random walk plotted over time. Caption: "Weirdly satisfying"] A random walk is the "null hypothesis" for any data with memory. If you are looking at some learning process that updates from state to state with some degree of stochasticity, seeing a random walk means that your update steps are random and you're not in fact learning. If you graph some collection of activations from layer to layer of a transformer [...]
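(A minimal sketch, not from the original post: the update rule described above, <span>x_k = x_{k-1} + \text{noise}</span>, simulated in 1D with Gaussian steps. The function name and parameters are illustrative assumptions.)

```python
import random

def random_walk_1d(steps, sigma=1.0, seed=0):
    """Simulate a 1D random walk: each position is the previous
    position plus Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    x = 0.0
    walk = [x]
    for _ in range(steps):
        x += rng.gauss(0.0, sigma)  # x_k = x_{k-1} + noise
        walk.append(x)
    return walk

walk = random_walk_1d(1000)
print(len(walk))  # 1001 positions, including the starting point
```

Plotting `walk` against its index reproduces the kind of picture shown above; the same construction in <span>\mathbb{R}^n</span> just adds an independent noise term per coordinate.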
The original text contained 4 footnotes which were omitted from this narration.
The original text contained 1 image which was described by AI.
First published: January 12th, 2025
Source: https://www.lesswrong.com/posts/s39XbvtzzmusHxgky/the-purposeful-drunkard
---
Narrated by TYPE III AUDIO.