#21 | Anthropic Accidentally Published Your 2028 Paycheck. And You Won't Like It.

Jun 11, 2026

This issue in 30 seconds:

Anthropic published internal data: 80%+ of their production code is now written by Claude. Their engineers ship 8x what they did in 2024.
Execution is approaching zero cost. The only thing gaining value is judgment — choosing which problems are worth solving.
This isn’t a future prediction. It’s your office with a 24-month head start.

Welcome back to the bunker.

Two companies. Same industry. Same year: right now.

The first one, to keep up with the workload, hires. Three new people who execute faster: they write, they fix, they produce. Headcount up, output up. The way it’s always been done.

The second one stopped hiring executors. It started buying something else: judgment. One person who decides which problems are worth solving, and beneath them a pyramid of AI agents that executes. Fewer heads. More direction.

It looks like a difference in style. A risky bet versus a prudent one.

It’s not. It’s a graph. Published recently. From inside the most advanced laboratory in the world.

Anthropic opened the doors and showed the numbers that usually stay locked inside: what happens to human work when execution stops costing anything. Not a forecast. Not a scenario. What’s already happening, today, to their engineers and their researchers.

And the uncomfortable part is just one thing: Anthropic is not some exotic case to watch through binoculars. It’s your office with a twenty-four-month head start.

I read it twice. The first time to understand it. The second to decide how to take it. Because underneath the science-fiction numbers there’s the confirmation, in hard data, of something I’ve been repeating for fifty episodes. And because, if you read it the right way, this isn’t news that should scare you. It’s the most optimistic thing you could have received this year.

The document is titled “When AI builds itself.” It was signed by Marina Favaro and Jack Clark, at the Anthropic Institute. It sounds like a movie script. It’s a spreadsheet.

☕ Cup in hand. Let’s go.

💡 Thought of the Day

“The exact moment execution costs zero, the only thing worth more is deciding what to execute. Execution depreciates. Judgment appreciates. And judgment depends only on you.”

🧪 AI in Action

#1: The Dashboard No One Was Supposed to See

AI companies always talk in future tense. “Soon,” “in the coming years,” “imagine a world where.” It’s convenient: you can’t falsify the future.

Not this time. Anthropic did something different: they published the internal data on what’s happening inside the company right now. Not how good Claude is on a benchmark. How much it has already shifted the work of their humans. Here’s the dashboard, real numbers, pulled from the piece.

Over 80% of the code Anthropic ships to production today is written by Claude. Before the launch of Claude Code, at the beginning of 2025, it was in the single digits. Leadership, counting scripts and experimental code as well, puts it at 90% and above. In just over a year, execution has shifted almost entirely from human to machine.

The average engineer, in Q2 2026, ships 8 times the code they shipped in 2024. Not because they’re rewarded for lines written. Because most of that code they no longer type: Claude writes it, and they direct and review.

In a March 2026 survey of 130 researchers, the median researcher estimates producing roughly 4 times the output they would without AI, on the same projects they would have done anyway.

And then the number that stopped me. Every time a new model comes out, Anthropic gives it the same task: take some code that trains a small model and make it as fast as possible without breaking it. In May 2025, Claude achieved an improvement of about 3x. By April 2026, 52x. To calibrate: a skilled human researcher, on that task, takes four to eight hours to reach 4x. In less than a year, Claude went from “very useful” to “superhuman” on that specific piece of work.

Now the part I like, and the one most people who share these stories happily skip: Anthropic puts the caveats in themselves.

Lines of code are a poor metric, and they say so in black and white: the 8x almost certainly overestimates the real gain. The 4x reported by researchers is probably inflated, because developers tend to overestimate how much AI helps them (there’s a METR study showing it). The 52x reflects how improvable the starting code was; it is not a real acceleration of training.

And here’s the thing I’ve been saying forever. Never swallow the numbers as you find them. AI can bluff, and the people telling you about it often bluff twice as hard. But when you squeeze all the air out of these numbers, when you deflate them to the honest minimum, the direction doesn’t change by a single degree. Even halving everything, even being brutal, what remains is an acceleration with no precedent in that company’s history. Judgment isn’t for denying the trend. It’s for seeing it without being sold twice what’s actually there. And what’s actually there, already discounted, is more than enough.

One last line, because it’s the one that frames everything. An independent institute called METR has for years been tracking the length of tasks AI can complete on its own. Two years ago: four-minute tasks. One year ago: an hour and a half. Today: twelve hours. And the doubling, which used to happen every seven months, now happens every four. If the curve holds, tasks that take a person days come within range this year. In 2027, weeks.
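A minimal sketch of the arithmetic behind that curve, assuming clean exponential growth from the twelve-hour horizon and the four-month doubling quoted above. The function name, the month offsets, and the eight-hour working day are my own illustrative assumptions, not METR’s, and real progress won’t follow a formula this neatly.

```python
def projected_horizon_hours(current_hours: float,
                            doubling_months: float,
                            months_ahead: float) -> float:
    """Task length reachable after `months_ahead` months of steady doubling."""
    return current_hours * 2 ** (months_ahead / doubling_months)


# Figures quoted in the text: twelve-hour tasks today, doubling every four months.
TODAY_HOURS = 12.0
DOUBLING_MONTHS = 4.0
WORKDAY_HOURS = 8.0  # assumption: one working day = eight hours

for months in (4, 8, 12, 16):
    hours = projected_horizon_hours(TODAY_HOURS, DOUBLING_MONTHS, months)
    days = hours / WORKDAY_HOURS
    print(f"+{months:>2} months: {hours:4.0f} hours, about {days:.0f} working days")
```

Under those assumptions the horizon goes from one and a half working days today to roughly twelve working days a year out, which is where “days this year, weeks in 2027” comes from. Change `DOUBLING_MONTHS` back to 7 to see how much the older, slower pace changes the picture.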

Figure: bar graph of code contributed per person, per quarter, from Q2 2021 to Q2 2026. The graph marks the release dates of Claude 1, Claude 2, Claude 3, Claude 4, Claude Code, Claude Sonnet 4.5, Claude Opus 4.5, Claude Mythos Preview (internal access), and Claude Mythos Preview.

#2: Why the Lab Is Your Time Machine

Now the move that changes how you read everything else.

There’s a precise reason these numbers come from an AI lab and not from your office: the researchers who build these models are the first customers of the models they build. They have access to internal versions months before you do. They work side by side with tools that will reach you in a year or two, tamed and packaged.

This makes them a leading indicator. Not because they’re smarter than you. Because they live in your working future before you do.

And in their future, which is yours, something specific has already happened. I’m quoting Anthropic, their words: “humans provide the goal; they no longer have to provide the method.” Execution, the “how it’s done,” has slipped out of human hands. What remains firmly human is the “what to do” and the “was it worth it.”

An example from the piece, so it doesn’t stay abstract. A routine update starts crashing tens of thousands of training jobs. An engineer points Claude at the incident, with little more than some text and access to the cluster. Claude works through the running processes, isolates a single obscure debug flag that was triggering the crash, reproduces it, confirms the fix. Two hours. What would normally be two or three days of human work.

And in April 2026, in one case, Claude fixed over 800 bugs, reducing a class of errors by a factor of a thousand. The engineer supervising estimated that a human would have needed four years. Not four days. Four years.

On the quality front, the sentence you should read twice, said by an Anthropic employee: “Code written by Claude was slightly worse than human code at the end of 2025, is on par today, and we expect it to be significantly better by year’s end.”

Read the sequence: worse, on par, better. In twelve months. The best engineers in the world saying the machine has caught up with them and is passing them, at their own craft.

Now shift everything one level and twenty-four months. What today is “Claude writes 80% of the code and the human directs” becomes, in your industry, “AI produces 80% of the operational work and you direct.” Emails, analyses, documents, research, first drafts of anything. This isn’t a reckless metaphor. It’s the same curve, seen from a seat further back in time.

The question is no longer whether it will arrive. The lab has already shown you. The question is: when it reaches your workplace, are you the executor who got automated, or the one who directs?

#3: Executor, Architect, and a Confirmation I’ve Been Waiting Fifty Episodes For

Here I have to be honest about something that concerns me personally.

Since this newsletter has existed, I’ve been repeating a distinction: there are those who execute tasks within a problem — the executor — and there are those who decide which problems are worth building — the architect. The executor works in the problem. The architect works on the problem. I wrote an entire episode titled “Problem Solver: the Last Human Profession.” It was an intellectual bet.

Now the most advanced laboratory in the world puts it in black and white, in their own data. The only comparative human advantage that currently holds, in Anthropic’s words, is “research taste and judgment: choosing which problems matter, which results to trust, when an approach is a dead end.”

Translated: value didn’t disappear when execution became free. It shifted upward. From doing to deciding. From executor to architect. Exactly the direction I had bet on.

There’s an experiment in the piece that demonstrates it in an almost raw way. Anthropic gave Claude agents a real, open research problem and let them work. The agents proposed hypotheses, tested them, shared results among themselves, iterated. Result: they recovered 97% of the possible gap, in 800 cumulative hours of work and roughly $18,000 in compute. Two human researchers, on the same problem, had recovered 23% in a week.

It looks like the end of the story for the human. It’s not, and here’s the subtlety. The agents did all the execution, yes. But the problem had been chosen by humans. The evaluation grid — what counted as success — had been defined by humans. Anthropic’s phrase is surgical: “direction-setting was the only significant role a human played.”

Only. But significant. And without it, those 800 hours of agents would have been useless, because they would have run at full speed in the wrong direction.

This is the job that appreciates while everything else depreciates. Not writing the email: choosing which conversation is worth having. Not producing the analysis: knowing which question deserves an analysis. Not executing the plan: understanding when the plan is a dead end and needs to be thrown out. Doing costs zero. Deciding what to do has never been worth this much.

#4: The Bottleneck, Suddenly, Is You

There’s a side effect of all this that few notice, and for an entrepreneur it’s the most important of all.

When you accelerate one part of a process, the limit shifts to the part you didn’t accelerate. In computer science it’s called Amdahl’s law. It applies to organizations too. Anthropic already slammed into it: once far more code was being pushed into the company, human review became the new bottleneck. They have so much output that the constraint is no longer producing. It’s approving.
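To make the bottleneck concrete, here is a small sketch of Amdahl’s law: if a fraction p of the work gets s times faster, the whole process speeds up by 1 / ((1 - p) + p / s). The 80/20 split between writing and reviewing is a hypothetical I picked for illustration, not a figure from Anthropic.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work runs `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    untouched = 1.0 - accelerated_fraction  # the part nobody sped up
    return 1.0 / (untouched + accelerated_fraction / factor)


# Hypothetical split: 80% of the effort is writing code, 20% is human review.
WRITING_SHARE = 0.8

for factor in (2, 8, 100, 1_000_000):
    overall = amdahl_speedup(WRITING_SHARE, factor)
    print(f"writing {factor:>9,}x faster -> whole process {overall:.2f}x faster")
```

With that split, the overall gain never reaches 5x no matter how fast the writing gets, because the untouched 20% sets the ceiling at 1 / 0.2. The only way to raise the ceiling is to speed up the review itself.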

Sit with this, because it flips the logic of how you scale a company.

For decades, the limit on how much you could grow was how much your people could execute. More work, more people. Hence middle management, hence headcount as a measure of power. That world is turning upside down. When execution costs nearly zero, the limit on your company is no longer how much you produce. It’s how fast and how well you can decide and validate.

The piece says it in a way that hits hard: already today, at current capabilities, a 100-person company can do the work of a 1,000-person one, because each person sits atop a pyramid of agents. In the most likely scenario, they’re talking about 100 people doing the work of 10,000 or 100,000. That’s not a typo.

We’ve already seen it in these pages: the AI-native startup doing millions in revenue per employee, the solo founder with a laptop reaching mid-size company numbers. Those weren’t colorful exceptions. They were the first cells of the new model.

The practical consequence, for anyone managing something: if you’re hiring to solve an execution bottleneck, there’s a good chance you’re optimizing the wrong constraint. The new constraint is your judgment, and your speed of review. That’s where attention should go. That’s where capacity should be built.

A warning I feel compelled to give you, because I learned it the hard way and I always repeat it: the more you shift execution onto agents, the more it matters to keep the human in the loop on irreversible actions. The pyramid of agents is an extraordinary lever as long as you, at the top, maintain veto power over things that can’t be undone. That veto is judgment. It’s the same thing, again.

#5: And If Judgment Falls Too? The Three Futures, Read Like an Adult

Now the knife. Because this episode would be dishonest if I sold you “judgment is your safe spot forever.” Anthropic itself isn’t sure, and they write it.

Their argument: “research taste,” the judgment about what’s worth doing, might just be the latest capability that AI fails at for a while and then gets good at. It’s already happened. For years, models couldn’t explain why a joke was funny, had no theory of mind, couldn’t solve linguistic riddles. Then, at some point, they could. Judgment could be the next thing on that list.

From here, the three futures Anthropic puts on the table. Not as fireworks. As frames for deciding where to stand.

One: the trend stops. The exponential curves turn out to be S-curves, they flatten, and restarting them would take a new idea as big as the Transformer was. Possible. Anthropic bets against it, and honestly so do I: every capability they’ve measured, even the “softest” ones, has so far followed the same curve, and that curve hasn’t bent yet.

Two: automation compounds, but humans still give the direction. Companies become multipliers: 100 people doing the work of tens of thousands. The human as AI’s partner, choosing the problems and validating the results. Anthropic says the evidence points here. This is the base case. And it’s not the future: it’s already the present, just not yet distributed.

Three: AI builds itself, for real. Full recursive self-improvement. The machine designs and improves the machine, the pace set only by available compute, the human sliding toward a role of pure supervision over a virtual lab running at chip speed. Here Anthropic is honest about not having good intuitions about what that world would look like. Nobody does.

My read is simple. In the short term, your leverage is identical in scenario two and scenario three: in both you win by positioning yourself on judgment and direction, not on execution. So optimize for two, which is already here, and stay legible to three, which might arrive. The part about alignment and a “global pause button,” which is the document’s real political ask, matters enormously for the world, but it’s not your button to press. Acknowledge it, take it seriously, and then get back to what you actually have power over.

And here I’ll tell you the most important thing in this entire episode, the reason I haven’t had this much fun in years.

We’re living in one of the most beautiful eras in history. I’m not saying this to comfort you. I’m saying it because it’s the direct consequence of the numbers you just read. When execution becomes free, the advantage no longer goes to those with the capital to hire a thousand executors. It goes to those with judgment, taste, direction, and the will to build. Things that can’t be bought and can’t be inherited. They’re trained. For the first time in history, a single person, with the right ideas and a pyramid of agents, can do what used to require an army.

The sky is the limit, and for once that’s not a motivational poster. The ceiling just shot up, and what’s keeping you down is no longer the system, the bank, your boss, or the job market. It’s you. It depends on you. It’s terrifying and beautiful in exactly equal measure, and I’ve chosen which side to look at it from.

📌 Espresso Prompt

Map your work between what’s depreciating and what’s appreciating.

This entire episode rests on one distinction: execution depreciates, judgment appreciates. Beautiful in theory. But until you apply it to your actual work, it stays a slogan. This prompt turns it into a map, and from a map into a move.

Open it in a session with Claude where you’ve already loaded some context about what you do (if you built the CLAUDE.md from last episode, it pays off here). Then paste this.

“You are a ruthless, honest strategic analyst. Help me audit my work in light of a single thesis: over the next 24 months, execution becomes nearly free and value shifts to judgment — meaning choosing the right problems, trusting the right results, recognizing when an approach is a dead end. We proceed in four phases. Start with PHASE 1 and stop at my answer before moving on.
PHASE 1. Ask me to list, without filters, the 8–10 concrete activities I actually spend my work hours on in a typical week. Not roles — real activities.
PHASE 2. For each one, classify it in two columns. DOING: execution that a capable AI, today or within 24 months, can perform at near-zero cost. JUDGMENT: choice, evaluation, direction, taste, decision on the irreversible, that remains hard to delegate. When you’re unsure, tell me and ask for the context you need instead of guessing.
PHASE 3. Show me the uncomfortable truth: how much of my time sits in the DOING column. Then tell me which of those activities, if I delegated them to an AI system, would free up the most time to shift to the JUDGMENT column.
PHASE 4. Give me ONE move for the next 30 days: the single execution activity worth automating first, and the single judgment capability worth investing in first. One per side. Concrete. Argued.”

The output isn’t a plan to frame. It’s a mirror. The first time you do it, the percentage of hours you spend in the DOING column will bother you. That’s the point. That discomfort is your personal leading indicator: it tells you how much of your current value is exposed, and where to start shifting it.
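If you want to check the PHASE 3 arithmetic yourself, a rough sketch follows. The activities, hours, and labels are invented for illustration, and the names `Activity`, `doing_share`, and `biggest_doing_activity` are mine. It also assumes that the largest DOING item is the one that frees the most time, which ignores how hard each item actually is to hand over.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Activity:
    name: str
    weekly_hours: float
    column: str  # "DOING" or "JUDGMENT", as classified in PHASE 2


def doing_share(activities: list[Activity]) -> float:
    """Fraction of weekly hours that sits in the DOING column."""
    total = sum(a.weekly_hours for a in activities)
    if total == 0:
        raise ValueError("no hours recorded")
    doing = sum(a.weekly_hours for a in activities if a.column == "DOING")
    return doing / total


def biggest_doing_activity(activities: list[Activity]) -> Optional[Activity]:
    """The DOING activity with the most hours, or None if there is none."""
    doing = [a for a in activities if a.column == "DOING"]
    return max(doing, key=lambda a: a.weekly_hours) if doing else None


# Hypothetical week, invented for illustration only.
week = [
    Activity("Writing status reports", 6, "DOING"),
    Activity("Answering routine email", 10, "DOING"),
    Activity("Building slide decks", 8, "DOING"),
    Activity("Choosing which clients to pursue", 4, "JUDGMENT"),
    Activity("Reviewing deliverables before they ship", 12, "JUDGMENT"),
]

print(f"Share of hours in DOING: {doing_share(week):.0%}")
first = biggest_doing_activity(week)
if first is not None:
    print(f"Largest DOING item: {first.name} ({first.weekly_hours:g} h/week)")
```

Swap in your own list from PHASE 1 and the labels from PHASE 2. The number is only as honest as the classification behind it.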

⚠️ Warning: this isn’t an exercise in delegating everything tomorrow and sitting on the throne. It’s about shifting the mix, one move per month. Those who start now, moving weight from “doing” to “deciding,” build an advantage that, like anything that compounds, can’t be bought next month. It accumulates. And it only accumulates if you start.

☢️ Radioactive Humor

Job interview, 2027.

“What can you do?”
“I write, analyze, code, produce.”
“AI already does that. Faster, and for free.”
“...I know how to decide what’s worth writing, analyzing, coding, and producing.”
“Perfect. Hired.”

From the other room, the AI itself takes notes.

Three things to take away from this episode.

First: Anthropic didn’t show you the future of AI. They showed you the present of their lab, which is the future of your office with a twenty-four-month head start. 80% of execution automated, the human directing. That movie, for them, is already over. For you, it’s about to start.

Second: the moment execution costs zero, value shifts from the executor to the architect. From doing to deciding. Their own data says it — it’s no longer a bet of mine. And yes, even judgment might fall one day: all the more reason to start climbing the ladder now, not to sit down.

Third: this is the one that actually matters. This is the most beautiful moment you could have landed in, if you have the guts to read it that way. The ceiling shot up, the advantage no longer goes to those with capital but to those with judgment, and judgment is trainable. What’s keeping you down, from today, is you. Which is the most terrifying and most liberating thing there is.

Welcome back to the bunker. See you soon.

P.S.: The question I’m leaving you with isn’t “will AI steal our jobs.” It’s: this week, what’s the first execution activity you start taking off your plate, to free up time to shift toward judgment? Write back and tell me. I read every one.

Matt
