Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.

[Linkpost] “Jaan Tallinn’s 2024 Philanthropy Overview” by jaan
1:17
This is a link post. to follow up my philanthropic pledge from 2020, i've updated my philanthropy page with the 2024 results. in 2024 my donations funded $51M worth of endpoint grants (plus $2.0M in admin overhead and philanthropic software development). this comfortably exceeded my 2024 commitment of $42M (20k times $2100.00 — the minimum price of …
I’ve been thinking recently about what sets apart the people who’ve done the best work at Anthropic. You might think that the main thing that makes people really effective at research or engineering is technical ability, and among the general population that's true. Among people hired at Anthropic, though, we’ve restricted the range by screening fo…

[Linkpost] “To Understand History, Keep Former Population Distributions In Mind” by Arjun Panickssery
5:42
This is a link post. Guillaume Blanc has a piece in Works in Progress (I assume based on his paper) about how France's fertility declined earlier than in other European countries, and how its power waned as its relative population declined starting in the 18th century. In 1700, France had 20% of Europe's population (4% of the whole world population…

“AI-enabled coups: a small group could use AI to seize power” by Tom Davidson, Lukas Finnveden, rosehadshar
15:22
We've written a new report on the threat of AI-enabled coups. I think this is a very serious risk – comparable in importance to AI takeover but much more neglected. In fact, AI-enabled coups and AI takeover have pretty similar threat models. To see this, here's a very basic threat model for AI takeover: Humanity develops superhuman AI; superhuman AI…
Back in the 1990s, ground squirrels were briefly fashionable pets, but their popularity came to an abrupt end after an incident at Schiphol Airport on the outskirts of Amsterdam. In April 1999, a cargo of 440 of the rodents arrived on a KLM flight from Beijing, without the necessary import papers. Because of this, they could not be forwarded on to …

“Training AGI in Secret would be Unsafe and Unethical” by Daniel Kokotajlo
10:46
Subtitle: Bad for loss of control risks, bad for concentration of power risks. I've had this sitting in my drafts for the last year. I wish I'd been able to release it sooner, but on the bright side, it'll make a lot more sense to people who have already read AI 2027. There's a good chance that AGI will be trained before this decade is out. By AGI I…

“Why Should I Assume CCP AGI is Worse Than USG AGI?” by Tomás B.
1:15
Though, given my doomerism, I think the natsec framing of the AGI race is likely wrongheaded, let me accept the Dario/Leopold/Altman frame that AGI will be aligned to the national interest of a great power. These people seem to take as an axiom that a USG AGI will be better in some way than CCP AGI. Has anyone written justification for this assumpt…

“Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI” by Kaj_Sotala
35:51
Introduction: Writing this post puts me in a weird epistemic position. I simultaneously believe that: the reasoning failures that I'll discuss are strong evidence that current LLM- or, more generally, transformer-based approaches won't get us AGI; and that as soon as major AI labs read about the specific reasoning failures described here, they might fix them …

“Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study” by Adam Karvonen
21:00
Dario Amodei, CEO of Anthropic, recently worried about a world where only 30% of jobs become automated, leading to class tensions between the automated and non-automated. Instead, he predicts that nearly all jobs will be automated simultaneously, putting everyone "in the same boat." However, based on my experience spanning AI research (including fi…

“Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)” by Neel Nanda, lewis smith, Senthooran Rajamanoharan, Arthur Conmy, Callum McDougall ...
57:32
Audio note: this article contains 31 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description. Lewis Smith*, Sen Rajamanoharan*, Arthur Conmy, Callum McDougall, Janos Kramar, Tom Lieberum, Rohin Shah, Neel Nanda (* = equal contribution). The following piece is a list of snippet…

[Linkpost] “Playing in the Creek” by Hastings
4:12
This is a link post. When I was a really small kid, one of my favorite activities was to try and dam up the creek in my backyard. I would carefully move rocks into high walls, pile up leaves, or try patching the holes with sand. The goal was just to see how high I could get the lake, knowing that if I plugged every hole, eventually the water would …
This is part of the MIRI Single Author Series. Pieces in this series represent the beliefs and opinions of their named authors, and do not claim to speak for all of MIRI. Okay, I'm annoyed at people covering AI 2027 burying the lede, so I'm going to try not to do that. The authors predict a strong chance that all humans will be (effectively) dead i…

“Short Timelines don’t Devalue Long Horizon Research” by Vladimir_Nesov
2:10
Short AI takeoff timelines seem to leave no time for some lines of alignment research to become impactful. But any research rebalances the mix of currently legible research directions that could be handed off to AI-assisted alignment researchers or early autonomous AI researchers whenever they show up. So even hopelessly incomplete research agendas…

“Alignment Faking Revisited: Improved Classifiers and Open Source Extensions” by John Hughes, abhayesian, Akbir Khan, Fabien Roger
41:04
In this post, we present a replication and extension of an alignment faking model organism. Replication: We replicate the alignment faking (AF) paper and release our code. Classifier Improvements: We significantly improve the precision and recall of the AF classifier. We release a dataset of ~100 human-labelled examples of AF for which our classifi…
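For readers unfamiliar with the metrics mentioned in this teaser, precision and recall against human-labelled examples can be computed as in the sketch below. This is a generic illustration with invented toy labels, not the authors' actual classifier or dataset:

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for binary labels.

    predicted/actual: lists of bools (True = flagged as alignment faking).
    """
    tp = sum(p and a for p, a in zip(predicted, actual))       # true positives
    fp = sum(p and not a for p, a in zip(predicted, actual))   # false positives
    fn = sum(not p and a for p, a in zip(predicted, actual))   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example with made-up labels (not the paper's data):
pred = [True, True, False, True, False]
gold = [True, False, False, True, True]
precision, recall = precision_recall(pred, gold)  # both 2/3 here
```

Improving the classifier, as the post describes, means pushing both numbers up: fewer false positives raises precision, fewer missed cases raises recall.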

“METR: Measuring AI Ability to Complete Long Tasks” by Zach Stein-Perlman
11:09
Summary: We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric has been consistently exponentially increasing over the past 6 years, with a doubling time of around 7 months. Extrapolating this trend predicts that, in under five years, we will see AI agents that can independently complet…
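The extrapolation described in this teaser is simple exponential arithmetic. The 7-month doubling time is the figure quoted above; the starting task length and the five-year horizon below are placeholder assumptions for illustration, not METR's data:

```python
def extrapolate_task_length(start_minutes, doubling_time_months, horizon_months):
    """Project task length forward assuming a constant exponential doubling time."""
    doublings = horizon_months / doubling_time_months
    return start_minutes * 2 ** doublings

# At a 7-month doubling time, five years (60 months) is 60/7 ≈ 8.6 doublings,
# i.e. roughly a 380x increase in the length of tasks agents can complete.
growth_factor = extrapolate_task_length(1.0, 7.0, 60.0)
```

Whether the trend actually holds over that horizon is exactly what the post's extrapolation hinges on.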

“Why Have Sentence Lengths Decreased?” by Arjun Panickssery
9:08
“In the loveliest town of all, where the houses were white and high and the elm trees were green and higher than the houses, where the front yards were wide and pleasant and the back yards were bushy and worth finding out about, where the streets sloped down to the stream and the stream flowed quietly under the bridge, where the lawns ended in orc…

“AI 2027: What Superintelligence Looks Like” by Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo
54:30
In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, but chickened out and just published up till 2026. Well, it's finally time. I'm back, and this time I have a team with me: the AI Futures Project. We've written a concrete scenario of what we think the f…

“OpenAI #12: Battle of the Board Redux” by Zvi
18:01
Back when the OpenAI board attempted and failed to fire Sam Altman, we faced a highly hostile information environment. The battle was fought largely through control of the public narrative, and the above was my attempt to put together what happened. My conclusion, which I still believe, was that Sam Altman had engaged in a variety of unacceptable co…

“The Pando Problem: Rethinking AI Individuality” by Jan_Kulveit
27:39
Epistemic status: This post aims at an ambitious target: improving intuitive understanding directly. The model for why this is worth trying is that I believe we are more bottlenecked by people having good intuitions guiding their research than, for example, by the ability of people to code and run evals. Quite a few ideas in AI safety implicitly us…

“You will crash your car in front of my house within the next week” by Richard Korzekwa
1:52
I'm not writing this to alarm anyone, but it would be irresponsible not to report on something this important. On current trends, every car will be crashed in front of my house within the next week. Here's the data: Until today, only two cars had crashed in front of my house, several months apart, during the 15 months I have lived here. But a few h…

“My ‘infohazards small working group’ Signal Chat may have encountered minor leaks” by Linch
10:33
Remember: There is no such thing as a pink elephant. Recently, I was made aware that my “infohazards small working group” Signal chat, an informal coordination venue where we have frank discussions about infohazards and why it will be bad if specific hazards were leaked to the press or public, was accidentally shared with a deceitful and discredite…

“Leverage, Exit Costs, and Anger: Re-examining Why We Explode at Home, Not at Work” by at_the_zoo
6:16
Let's cut through the comforting narratives and examine a common behavioral pattern with a sharper lens: the stark difference between how anger is managed in professional settings versus domestic ones. Many individuals can navigate challenging workplace interactions with remarkable restraint, only to unleash significant anger or frustration at home…

“PauseAI and E/Acc Should Switch Sides” by WillPetillo
3:31
In the debate over AI development, two movements stand as opposites: PauseAI calls for slowing down AI progress, and e/acc (effective accelerationism) calls for rapid advancement. But what if both sides are working against their own stated interests? What if the most rational strategy for each would be to adopt the other's tactics—if not their ulti…

“VDT: a solution to decision theory” by L Rudolf L
8:58
Introduction: Decision theory is about how to behave rationally under conditions of uncertainty, especially if this uncertainty involves being acausally blackmailed and/or gaslit by alien superintelligent basilisks. Decision theory has found numerous practical applications, including proving the existence of God and generating endless LessWrong comm…