**Summary** Four months after my post 'LLM Generality is a Timeline Crux', results on o1-preview update me significantly toward LLMs being capable of general reasoning, and hence of scaling straight to AGI.
**Previous post** In June of 2024, I made a post, 'LLM Generality is a Timeline Crux', in which I argued [...]
**Reasons to update** In the original post, I gave the three main pieces of evidence against LLMs doing general reasoning that I found most compelling: blocksworld, planning/scheduling, and ARC-AGI (see the original for details). All three of those seem significantly weakened in light of recent research. Most dramatically, a new paper on blocksworld has recently been published by some of the same highly LLM-skeptical researchers (Valmeekam et al., led by Subbarao Kambhampati): 'LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench'. The planning/scheduling evidence seemed weaker almost immediately after the post [...]
Outline:
(00:04) Summary
(00:21) Previous post
(00:32) Reasons to update
(03:30) Updated probability estimates
First published: November 7th, 2024
Source: https://www.lesswrong.com/posts/wN4oWB4xhiiHJF9bS/llms-look-increasingly-like-general-reasoners
---
Narrated by TYPE III AUDIO.