Next-Token Predictor Is An AI's Job, Not Its Species
Release Date: 04/02/2026
Astral Codex Ten Podcast
I.
In The Argument, Kelsey Piper gives a good description of the ways that AIs are more than just “next-token predictors” or “stochastic parrots” - for example, they also use fine-tuning and RLHF. But commenters, while appreciating the subtleties she introduces, object that they’re still just extra layers on top of a machine that basically runs on next-token prediction.
I want to approach this from a different direction. I think overemphasizing next-token prediction is a confusion of levels. On the levels where AI is a next-token predictor, you are also a next-token (technically: next-sense-datum) predictor. On the levels where you’re not a next-token predictor, AI isn’t one either.