LessWrong (Curated & Popular)
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “LessWrong (30+ karma)” feed.
“How Colds Spread” by RobertM
It seems like a catastrophic civilizational failure that we don't have confident common knowledge of how colds spread. There have been a number of studies conducted over the years, but most of those were testing secondary endpoints, like how long viruses would survive on surfaces, or how likely they were to be transmitted to people's fingers after touching contaminated surfaces, etc.
However, a few of them involved rounding up some brave volunteers, deliberately infecting some of them, and then arranging matters so as to test various routes of transmission to uninfected volunteers.
My conclusions from reviewing...
“New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence” by Aaron_Scher, David Abecassis, Brian Abeyta, peterbarnett
TLDR: We at the MIRI Technical Governance Team have released a report describing an example international agreement to halt the advancement towards artificial superintelligence. The agreement is centered around limiting the scale of AI training, and restricting certain AI research.
Experts argue that the premature development of artificial superintelligence (ASI) poses catastrophic risks, from misuse by malicious actors, to geopolitical instability and war, to human extinction due to misaligned AI. Regarding misalignment, Yudkowsky and Soares's NYT bestseller If Anyone Builds It, Everyone Dies argues that the world needs a strong international agreement prohibiting the development of superintelligence. This...
“Where is the Capital? An Overview” by johnswentworth
When a new dollar goes into the capital markets, after being bundled and securitized and lent several times over, where does it end up? When society's total savings increase, what capital assets do those savings end up invested in?
When economists talk about “capital assets”, they mean things like roads, buildings and machines. When I read through a company's annual reports, lots of their assets are instead things like stocks and bonds, short-term debt, and other “financial” assets - i.e. claims on other people's stuff. In theory, for every financial asset, there's a financial liability somewhere. For every bo...
“Problems I’ve Tried to Legibilize” by Wei Dai
Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk more legible to myself and others. I've organized the relevant posts and comments into the following list, which can also serve as a partial guide to problems that may need to be further legibilized, especially beyond LW/rationalists, to AI researchers, funders, company leaders, government policymakers, their advisors (including future AI advisors), and the general public.
Philosophical problems Probability theory Decision theory Beyond astronomical waste (possibility of influencing vastly larger universes beyond our...
“Do not hand off what you cannot pick up” by habryka
Delegation is good! Delegation is the foundation of civilization! But in the depths of delegation madness breeds and evil rises.
In my experience, there are three ways in which delegation goes off the rails:
1. You delegate without knowing what good performance on a task looks like
If you do not know how to evaluate performance on a task, you are going to have a really hard time delegating it to someone. Most likely, you will choose someone incompetent for the task at hand.
But even if you manage to avoid that specific...
“7 Vicious Vices of Rationalists” by Ben Pace
Vices aren't behaviors that one should never do. Rather, vices are behaviors that are fine and pleasurable to do in moderation, but tempting to do in excess. The classical vices are actually good in part. A moderate amount of gluttony is just eating food, which is important. A moderate amount of envy is just "wanting things", which is a motivator of much of our economy.
What are some things that rationalists are wont to do, and often to good effect, but that can grow pathological?
1. Contrarianism
There are a whole host of unaligned forces producing the...
“Tell people as early as possible it’s not going to work out” by habryka
Context: Post #4 in my sequence of private Lightcone Infrastructure memos edited for public consumption
This week's principle is more about how I want people at Lightcone to relate to community governance than it is about our internal team culture.
As part of our jobs at Lightcone we often are in charge of determining access to some resource, or membership in some group (ranging from LessWrong to the AI Alignment Forum to the Lightcone Offices). Through that, I have learned that one of the most important things to do when building things like this is to try...
“Everyone has a plan until they get lied to the face” by Screwtape
"Everyone has a plan until they get punched in the face."
- Mike Tyson
(The exact phrasing of that quote changes; this is my favourite.)
I think there is an open, important weakness in many people. We assume those we communicate with are basically trustworthy. Further, I think there is an important flaw in the current rationality community. We spend a lot of time focusing on subtle epistemic mistakes, teasing apart flaws in methodology and practicing the principle of charity. This creates a vulnerability to someone willing to just say outright false things. We’re...
“Please, Don’t Roll Your Own Metaethics” by Wei Dai
One day, when I was interning at the cryptography research department of a large software company, my boss handed me an assignment to break a pseudorandom number generator passed to us for review. Someone in another department invented it and planned to use it in their product, and wanted us to take a look first. This person must have had a lot of political clout or was especially confident in himself, because he refused the standard advice that anything an amateur comes up with is very likely to be insecure and he should instead use one of the established...
“Paranoia rules everything around me” by habryka
People sometimes make mistakes [citation needed].
The obvious explanation for most of those mistakes is that decision makers do not have access to the information necessary to avoid the mistake, or are not smart/competent enough to think through the consequences of their actions.
This predicts that as decision-makers get access to more information, or are replaced with smarter people, their decisions will get better.
And this is substantially true! Markets seem more efficient today than they were before the onset of the internet, and in general decision-making across the board has improved on...
“Human Values ≠ Goodness” by johnswentworth
There is a temptation to simply define Goodness as Human Values, or vice versa.
Alas, we do not get to choose the definitions of commonly used words; our attempted definitions will simply be wrong. Unless we stick to mathematics, we will end up sneaking in intuitions which do not follow from our so-called definitions, and thereby mislead ourselves. People who claim that they use some standard word or phrase according to their own definition are, in nearly all cases outside of mathematics, wrong about their own usage patterns.[1]
If we want to know what words mean...
“Condensation” by abramdemski
Condensation: a theory of concepts is a model of concept-formation by Sam Eisenstat. Its goals and methods resemble John Wentworth's natural abstractions/natural latents research.[1] Both theories seek to provide a clear picture of how to posit latent variables, such that once someone has understood the theory, they'll say "yep, I see now, that's how latent variables work!".
The goal of this post is to popularize Sam's theory and to give my own perspective on it; however, it will not be a full explanation of the math. For technical details, I suggest reading Sam's paper.
Brief...
“Mourning a life without AI” by Nikola Jurkovic
Recently, I looked at the one pair of winter boots I own, and I thought “I will probably never buy winter boots again.” The world as we know it probably won’t last more than a decade, and I live in a pretty warm area.
I. AGI is likely in the next decade
It has basically become consensus within the AI research community that AI will surpass human capabilities sometime in the next few decades. Some, including myself, think this will likely happen this decade.
II. The post-AGI world will be unrecognizable
Assumi...
“Unexpected Things that are People” by Ben Goldhaber
Cross-posted from https://bengoldhaber.substack.com/
It's widely known that Corporations are People. This is universally agreed to be a good thing; I list Target as my emergency contact and I hope it will one day be the best man at my wedding.
But there are other, less well known non-human entities that have also been accorded the rank of person.
Ships: Ships have long posed a tricky problem for states and courts. Similar to nomads, vagabonds, and college students on extended study abroad, they roam far and occasionally get into trouble.
...
“Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals” by Alexa Pan, ryan_greenblatt
According to the Sonnet 4.5 system card, Sonnet 4.5 is much more likely than Sonnet 4 to mention in its chain-of-thought that it thinks it is being evaluated; this seems to meaningfully cause it to appear to behave better in alignment evaluations. So, Sonnet 4.5's behavioral improvements in these evaluations may partly be driven by a growing tendency to notice and game evaluations rather than by genuine alignment. This is an early example of a phenomenon that is going to get increasingly problematic: as evaluation gaming increases, alignment evaluations become harder to trust.[1]
To elaborate on the above: Sonnet 4.5 seems far more aware...
“Publishing academic papers on transformative AI is a nightmare” by Jakub Growiec
I am a professor of economics. Throughout my career, I have mostly worked on economic growth theory, and this eventually brought me to the topic of transformative AI / AGI / superintelligence. Nowadays my work focuses mostly on the promises and threats of this emerging disruptive technology.
Recently, Klaus Prettner and I wrote a paper on “The Economics of p(doom): Scenarios of Existential Risk and Economic Growth in the Age of Transformative AI”. We have presented it at multiple conferences and seminars, and it was always well received. We didn't get any real pushback; instead our research...
“The Unreasonable Effectiveness of Fiction” by Raelifin
[Meta: This is Max Harms. I wrote a novel about China and AGI, which comes out today. This essay from my fiction newsletter has been slightly modified for LessWrong.]
In the summer of 1983, Ronald Reagan sat down to watch the film War Games, starring Matthew Broderick as a teen hacker. In the movie, Broderick's character accidentally gains access to a military supercomputer with an AI that almost starts World War III.
“The only winning move is not to play.” After watching the movie, Reagan, newly concerned with the possibility of hackers causing real harm, ordered a full...
“Legible vs. Illegible AI Safety Problems” by Wei Dai
Some AI safety problems are legible (obvious or understandable) to company leaders and government policymakers, implying they are unlikely to deploy or allow deployment of an AI while those problems remain open (i.e., appear unsolved according to the information they have access to). But some problems are illegible (obscure or hard to understand, or in a common cognitive blind spot), meaning there is a high risk that leaders and policymakers will decide to deploy or allow deployment even if they are not solved. (Of course, this is a spectrum, but I am simplifying it to a binary for ease...
“Lack of Social Grace is a Lack of Skill” by Screwtape
1.
I have claimed that one of the fundamental questions of rationality is “what am I about to do and what will happen next?” One of the domains I ask this question the most is in social situations.
There are a great many skills in the world. If I had the time and resources to do so, I’d want to master all of them. Wilderness survival, automotive repair, the Japanese language, calculus, heart surgery, French cooking, sailing, underwater basket weaving, architecture, Mexican cooking, functional programming, whatever it is people mean when they say “hey man, just let him c...
[Linkpost] “I ate bear fat with honey and salt flakes, to prove a point” by aggliu
This is a link post. Eliezer Yudkowsky did not exactly suggest that you should eat bear fat covered with honey and sprinkled with salt flakes.
What he actually said was that an alien, looking from the outside at evolution, would predict that you would want to eat bear fat covered with honey and sprinkled with salt flakes.
Still, I decided to buy a jar of bear fat online, and make a treat for the people at Inkhaven. It was surprisingly good. My post discusses how that happened, and a bit about the implications for Eliezer's thesis.
“What’s up with Anthropic predicting AGI by early 2027?” by ryan_greenblatt
As far as I'm aware, Anthropic is the only AI company with official AGI timelines[1]: they expect AGI by early 2027. In their recommendations (from March 2025) to the OSTP for the AI action plan they say:
As our CEO Dario Amodei writes in 'Machines of Loving Grace', we expect powerful AI systems will emerge in late 2026 or early 2027. Powerful AI systems will have the following properties:
Intellectual capabilities matching or exceeding that of Nobel Prize winners across most disciplines—including biology, computer science, mathematics, and engineering. [...]
They often describe this capability level as a "country of...
[Linkpost] “Emergent Introspective Awareness in Large Language Models” by Drake Thomas
This is a link post. New Anthropic research (tweet, blog post, paper):
We investigate whether large language models can introspect on their internal states. It is difficult to answer this question through conversation alone, as genuine introspection cannot be distinguished from confabulations. Here, we address this challenge by injecting representations of known concepts into a model's activations, and measuring the influence of these manipulations on the model's self-reported states. We find that models can, in certain scenarios, notice the presence of injected concepts and accurately identify them. Models demonstrate some ability to recall prior internal representations and distinguish...
[Linkpost] “You’re always stressed, your mind is always busy, you never have enough time” by mingyuan
This is a link post. You have things you want to do, but there's just never time. Maybe you want to find someone to have kids with, or maybe you want to spend more or higher-quality time with the family you already have. Maybe it's a work project. Maybe you have a musical instrument or some sports equipment gathering dust in a closet, or there's something you loved doing when you were younger that you want to get back into. Whatever it is, you can’t find the time for it. And yet you somehow find thousands of hours a ye...
“LLM-generated text is not testimony” by TsviBT
Crosspost from my blog.
Synopsis
When we share words with each other, we don't only care about the words themselves. We care also—even primarily—about the mental elements of the human mind/agency that produced the words. What we want to engage with is those mental elements. As of 2025, LLM text does not have those elements behind it. Therefore LLM text categorically does not serve the role for communication that is served by real text. Therefore the norm should be that you don't share LLM text as if someone wrote it. And, it is inadvisable to r...
“Why I Transitioned: A Case Study” by Fiora Sunshine
An Overture
Famously, trans people tend not to have great introspective clarity into their own motivations for transition. Intuitively, they tend to be quite aware of what they do and don't like about inhabiting their chosen bodies and gender roles. But when it comes to explaining the origins and intensity of those preferences, they almost universally come up short. I've even seen several smart, thoughtful trans people, such as Natalie Wynn, make statements to the effect that it's impossible to develop a satisfying theory of aberrant gender identities. (She may have been exaggerating for effect, but it...
“The Memetics of AI Successionism” by Jan_Kulveit
TL;DR: AI progress and the recognition of associated risks are painful to think about. This cognitive dissonance acts as fertile ground in the memetic landscape, a high-energy state that will be exploited by novel ideologies. We can anticipate cultural evolution will find viable successionist ideologies: memeplexes that resolve this tension by framing the replacement of humanity by AI not as a catastrophe, but as some combination of a desirable, heroic, or inevitable outcome. This post mostly examines the mechanics of the process.
Most analyses of ideologies fixate on their specific claims - what acts are good, whether AIs...
“How Well Does RL Scale?” by Toby_Ord
This is the latest in a series of essays on AI Scaling.
You can find the others on my site.
Summary: RL-training for LLMs scales surprisingly poorly. Most of its gains come from allowing LLMs to productively use longer chains of thought, letting them think longer about a problem. There is some improvement for a fixed length of answer, but not enough to drive AI progress. Given that the scaling up of pre-training compute has also stalled, we'll see less AI progress via compute scaling than you might have thought, and more of it will come from inference...
“An Opinionated Guide to Privacy Despite Authoritarianism” by TurnTrout
I've created a highly specific and actionable privacy guide, sorted by importance and venturing several layers deep into the privacy iceberg. I start with the basics (password manager) but also cover the obscure (dodging the millions of Bluetooth tracking beacons which extend from stores to traffic lights; anti-stingray settings; flashing GrapheneOS on a Pixel). I feel strongly motivated by current events, but the guide also contains a large amount of timeless technical content. Here's a preview.
Digital Threat Modeling Under Authoritarianism by Bruce Schneier
Being innocent won't protect you.
This is vital to understand...
“Cancer has a surprising amount of detail” by Abhishaike Mahajan
There is a very famous essay titled ‘Reality has a surprising amount of detail’. The thesis of the article is that reality is filled, just filled, with an incomprehensible amount of materially important information, far more than most people would naively expect. Some of this detail is inherent in the physical structure of the universe, and the rest of it has been generated by centuries of passionate humans imbuing the subject with idiosyncratic convention. In either case, the detail is very, very important. A wooden table is “just” a flat slab of wood on legs until you try building one at indus...
“AIs should also refuse to work on capabilities research” by Davidmanheim
There's a strong argument that humans should stop trying to build more capable AI systems, or at least slow down progress. The risks are plausibly large but unclear, and we'd prefer not to die. But the roadmaps of the companies pursuing these systems envision increasingly agentic AI systems taking over the key tasks of researching and building superhuman AI systems, and humans will therefore have a decreasing ability to make many key decisions. In the near term, humanity could stop, but seems likely to fail. That said, even though humans have relatively little ability to coordinate around such unilateralist di...
“On Fleshling Safety: A Debate by Klurl and Trapaucius.” by Eliezer Yudkowsky
(23K words; best considered as nonfiction with a fictional-dialogue frame, not a proper short story.)
Prologue:
Klurl and Trapaucius were members of the machine race. And no ordinary citizens they, but Constructors: licensed, bonded, and insured; proven, experienced, and reputed. Together Klurl and Trapaucius had collaborated on such famed artifices as the Eternal Clock, Silicon Sphere, Wandering Flame, and Diamond Book; and as individuals, both had constructed wonders too numerous to number.
At one point in time Trapaucius was meeting with Klurl to drink a cup together. Klurl had set before himself a simple...
“EU explained in 10 minutes” by Martin Sustrik
If you want to understand a country, you should pick a similar country that you are already familiar with, research the differences between the two and there you go, you are now an expert.
But this approach doesn’t quite work for the European Union. You might start, for instance, by comparing it to the United States, assuming that EU member countries are roughly equivalent to U.S. states. But that analogy quickly breaks down. The deeper you dig, the more confused you become.
You try with other federal states. Germany. Switzerland. But it doesn’t work...
“Cheap Labour Everywhere” by Morpheus
I recently visited my girlfriend's parents in India. Here is what that experience taught me:
Yudkowsky has this facebook post where he makes some inferences about the economy after noticing two taxis stayed in the same place while he got his groceries. I had a few similar experiences while I was in India, though sadly I don't remember them well enough to illustrate them in as much detail as that post does. Most of the thoughts relating to economics revolved around how labour in India is extremely cheap.
I knew in the abstract that India is...
[Linkpost] “Consider donating to AI safety champion Scott Wiener” by Eric Neyman
This is a link post. Written in my personal capacity. Thanks to many people for conversations and comments. Written in less than 24 hours; sorry for any sloppiness.
It's an uncanny, weird coincidence that the two biggest legislative champions for AI safety in the entire country announced their bids for Congress just two days apart. But here we are.
On Monday, I put out a long blog post making the case for donating to Alex Bores, author of the New York RAISE Act. And today I’m doing the exact same thing for Scott Wiener, wh...
“Which side of the AI safety community are you in?” by Max Tegmark
In recent years, I’ve found that people who self-identify as members of the AI safety community have increasingly split into two camps:
Camp A) “Race to superintelligence safely”: People in this group typically argue that “superintelligence is inevitable because of X”, and it's therefore better that their in-group (their company or country) build it first. X is typically some combination of “Capitalism”, “Moloch”, “lack of regulation”, and “China”.
Camp B) “Don’t race to superintelligence”: People in this group typically argue that “racing to superintelligence is bad because of Y”. Here Y is typically some combination of “uncontrollable”...
“Doomers were right” by Algon
There's an argument I sometimes hear against worrying about existential risks, or any other putative change that some are worried about, that goes something like this:
'We've seen time after time that some people will be afraid of any change. They'll say things like "TV will destroy people's ability to read", "coffee shops will destroy the social order", "machines will put textile workers out of work". Heck, Socrates argued that books would harm people's ability to memorize things. So many prophets of doom, and yet the world has not only survived, it has thrived. Innovation is a boon. So we...
“Do One New Thing A Day To Solve Your Problems” by Algon
People don't explore enough. They rely on cached thoughts and actions to get through their day. Unfortunately, this doesn't lead to them making progress on their problems. The solution is simple. Just do one new thing a day to solve one of your problems.
Intellectually, I've always known that annoying, persistent problems often require just 5 seconds of actual thought. But seeing a number of annoying problems that made my life worse, some even major ones, just yield to the repeated application of a brief burst of thought each day still surprised me.
For example, I had...
“Humanity Learned Almost Nothing From COVID-19” by niplav
Summary: Looking over humanity's response to the COVID-19 pandemic, almost six years later, reveals that we've forgotten to fulfill our intent at preparing for the next pandemic. I rant.
content warning: A single carefully placed slur.
If we want to create a world free of pandemics and other biological catastrophes, the time to act is now.
—US White House, “FACT SHEET: The Biden Administration's Historic Investment in Pandemic Preparedness and Biodefense in the FY 2023 President's Budget”, 2022
Around five years ago, a global pandemic caused by a coronavirus started.
In the course of the pandemic, there have been a...
“Consider donating to Alex Bores, author of the RAISE Act” by Eric Neyman
Written by Eric Neyman, in my personal capacity. The views expressed here are my own. Thanks to Zach Stein-Perlman, Jesse Richardson, and many others for comments.
Over the last several years, I’ve written a bunch of posts about politics and political donations. In this post, I’ll tell you about one of the best donation opportunities that I’ve ever encountered: donating to Alex Bores, who announced his campaign for Congress today.
If you’re potentially interested in donating to Bores, my suggestion would be to:
Read this post to understand the case for dona...
“Meditation is dangerous” by Algon
Here's a story I've heard a couple of times. A youngish person is looking for some solutions to their depression, chronic pain, ennui or some other cognitive flaw. They're open to new experiences and see a meditator gushing about how amazing meditation is for joy, removing suffering, clearing one's mind, improving focus etc. The meditator invites the young person to a meditation retreat. The young person starts making decent progress. Then they have a psychotic break and their life is ruined for years, at least. The meditator is sad, but not shocked. Then they start gushing about meditation again.
...