News
Falling for the statistical parrot - Less Wrong
1+ day, 10+ hours ago (1164+ words) If it reads confused and stupid, for once it really is part of the intended message I guess. Epistemic status: 0. Sun 2.30am, with Claude having helped me prepare last minute a 4h lecture I had no adequate time for. And after a...
A relatively brief explanation of Boltzmann Brains - Less Wrong
1+ day, 14+ hours ago (575+ words) (Initially written for the LW Wiki, but then I realized it was looking more like a post instead.) In 1895, the physicist Ignaz Robert Schütz, who worked as an assistant to the more eminent physicist Ludwig Boltzmann, wondered if our observed...
Benchmarking Real Work - Less Wrong
1+ day, 15+ hours ago (419+ words) Thanks to Megan Kinniment for helpful comments and discussion. This is a two-part series on capability evaluation. Part 1 is about acquiring fuzzy tasks, and part 2 is about analyzing them. There are several well-described limitations of time horizons. But the strongest reason that...
Trying to use NLAs to find out how Qwen 2.5 7B does multiplication - Less Wrong
1+ day, 16+ hours ago (347+ words) Neural language autoencoders were just introduced by Anthropic. In a fascinating paper, they showed that you can take the residual stream activations...
On getting unstuck - Less Wrong
1+ day, 10+ hours ago (108+ words) After more than a year of trials and new models, Anthropic's Claude AI has finally managed to beat Pokémon Red. The writeup that clued me in to this...
NLA Verbalizations on Audit Bench: Llama 70B - Less Wrong
2+ days, 6+ hours ago (121+ words) Quick Summary: * Ran Llama 70B through Audit Bench with NLA * Strong Evidence evals were less sensitive to sampling method and more robust to KTO a...
An Introduction to Exemplar Partitioning for Mechanistic Interpretability - Less Wrong
2+ days, 7+ hours ago (833+ words) Voronoi partitions on activations reveal interpretable structure with orders of magnitude less compute than SAEs...
An Argument for Analogies - Polymaths 1/3 - Less Wrong
2+ days, 9+ hours ago (234+ words) The following is a link-post to a series about polymathy (...ism?) and makes a case for arguing by analogy as opposed to first principles (most of the...
Critical Thinking as a Gym Schedule - Less Wrong
2+ days, 15+ hours ago (729+ words) I have an apparently novel n=1 experiment. I would like to start by making it n=2. Hello, how are you? I tried to make myself smarter and it worked. The basic result is that I used to make accurate predictions...
Why I am not too worried about AIpocalypse: Scott Alexander vs Nicolaus Copernicus - Less Wrong
2+ days, 15+ hours ago (30+ words) I have no good gears-level model of AI, and the expert views are all over the place (see AI Doc), so the only remaining argument is my physical intui...