Here's this week's free edition of Platformer: a fresh report on AI and the economy that looks at the complicated (and sometimes hilarious) question of whether AI is making workers and their bosses more productive. Want to kick in a few bucks to support our work? If so, consider upgrading your subscription today. We'll email you all our scoops first, like our recent one about a viral Reddit hoax. Plus you'll be able to discuss today's edition with us in our chatty Discord server, and we'll send you a link to read subscriber-only columns in the RSS reader of your choice. You'll also get access to Platformer+: a custom podcast feed in which you can get every column read to you in my voice. Sound good?
This is a column about AI. My boyfriend works at Anthropic. See my full ethics disclosure here.

One of the most important open questions in tech today has some surprisingly funny answers. On average, does using AI at work make you more productive — or less?

The answer may depend on whether you're a manager or a worker. It may depend on which tools you're using, and how well you're using them. Most important of all, though, it may depend on whether you're deluding yourself.

One of the more famous papers about artificial intelligence last year came from METR, a nonprofit that evaluates frontier AI models. In July, it published the results of a randomized controlled trial studying experienced open-source developers. It found that when they use AI tools, completing tasks takes them 19 percent longer than when they go without.

That was surprising enough. But the real twist is that when these same developers were asked what AI had done for them, they reported that it had sped them up by 20 percent. In the end, these developers learned the same lesson I insist on re-learning every time I download a new to-do app: feeling productive and being productive are two different things.

I thought of these sweet, deluded developers today while reading about a new study from the AI consulting firm Section. The group surveyed 5,000 white-collar workers about whether AI is making them more efficient. Two-thirds of rank-and-file employees said AI saved them zero to two hours a week in their jobs. And 40 percent said they would be fine "never using AI again," according to a story on the study from the Wall Street Journal.

But more than 40 percent of executives said AI saves them more than eight hours a week.

What explains the divide? One possibility is that the executives here have fallen into the same trap that the open-source developers did. They are answering all their emails with Gemini; they are creating slides with ChatGPT; they have five Claude Code agents running on six different monitors tackling different projects they came up with over the weekend. And while these tools are outputting a massive amount of … something, the business itself is not making any more money.

Another survey from this week lends credence to that explanation. PricewaterhouseCoopers' survey of 4,454 CEOs across 95 countries found that 12 percent of companies say AI grew their revenues and reduced their costs, but 56 percent say they are getting "nothing out of it."

Why is that? One theory is that AI usage simply shifts the burden of completing tasks around the organization. The Journal cites a study from Workday, which found that much of the time employees reported saving by using AI tools was offset by extended reviews of AI-generated content. In many cases, of course, it's executives who are passing down AI-generated work to subordinates, who must then review and correct it before it can be implemented.

Last year, we got a memorable term for this kind of material: workslop. The term refers to "AI-generated work content that masquerades as good work, but lacks the substance to meaningfully advance a given task," as CNBC defined it in September. CNBC found that the plurality of workslop (about 40 percent) comes from peers. But at least 16 percent comes from above.

It paints an irresistibly comic picture: one of wide-eyed executives using every AI tool at their disposal to create error-ridden documents and plans that their subordinates must then spend half their time fixing.
Some of this comes down to the unreliability of the non-deterministic systems that underpin AI. Some of it, too, is a skill issue. A November PwC survey of almost 50,000 workers found that 92 percent of daily AI users report being more productive than their peers, 58 percent said it bolstered their job security, and 52 percent linked it to higher pay.

Some of that productivity boost is likely an illusion, of course. But I suspect not all of it is, particularly because daily users are more likely to pay for state-of-the-art models and tools instead of whatever basic Microsoft or Google tool comes with their email account.

There's also the fact that these studies lag far behind the pace of development in AI itself. Developers in the METR study, for example, primarily used Cursor Pro with Claude 3.5 and 3.7 Sonnet. How much would those numbers change if you swapped in Claude Opus 4.5? I suspect that further research will answer the question.

In the meantime, though, there's one more gap worth considering here. One reason managers feel so productive using AI is that they benefit from doing so. Workers, on the other hand, are more likely to suspect that to use AI effectively is to aid in their own eventual replacement by AI tools.

"While the upside is murky, the downside risks are clear and often existential for some folks who worry about their job security," Molly Kinder, a senior fellow at the Brookings Institution who studies AI and labor, told me over email. "Given the difference in enthusiasm and motivation, it wouldn't surprise me if productivity gains flowed directly from there."

And hype continues to outpace the quality of many of the tools workers are currently being asked to use, she said. "I think that is especially annoying to workers when they keep hearing from their employers how great the tools are, and when couched with fears about future displacement," Kinder said.

So what to do about it? Managers should learn not to mistake their own enthusiasm for business results. Just because they find using AI tools enjoyable doesn't mean that the organization is benefiting as a result. They would be better off focusing on standard analytics than on more vibes-based analyses of the AI future.

The story for workers is more complicated, and varies significantly based on the company, the role, and their particular feelings about AI. At a minimum, I think workers would benefit from understanding how good (and bad) state-of-the-art models are, and updating that understanding as new models and tools emerge. In the short run, it may give them more leverage in the office. And if their roles are eventually displaced, becoming better informed now could help them see it coming.

Because as funny as that METR study can seem, six months later the founder of Node.js is declaring that "the era of humans writing code is over." For software engineers, at least, the jagged frontier has advanced far enough to swallow up a large part of their daily work. There was a time when AI merely made them feel like they were being productive. In time, though, it actually did.

On the podcast this week: Kevin and I explore how ads will change ChatGPT, and OpenAI. Then, Anthropic's Amanda Askell joins us to discuss Claude's new constitution.

Bonus podcast: Over on YouTube, Kevin and I recorded a 24-minute vibe coding tutorial for folks looking to get started.

Apple | Spotify | Stitcher | Amazon | Google | YouTube

Sponsored

Don't Follow Your Passion

Don't 'follow your passion' if you want to have a fulfilling career.
80,000 Hours is a nonprofit that aims to help people find a career that feels meaningful, has real impact, and helps solve one of the world's most pressing problems. After reviewing over 60 studies on what makes for a dream job, they found that most of the common advice — like looking for work that pays well and isn't stressful — doesn't hold up to the evidence.
So what does?
In their research-driven career guide, they argue that to have a satisfying career, you should do work that feels meaningful because it contributes to helping others.
Also, most common ways of trying to make the world a better place don't do as much good as people think:
- Some careers have a much higher positive impact than others
- The most urgent problems facing the world are often those that are most neglected by others
- So you don't have to follow the conventional path of being a doctor, teacher, or charity worker if you want to do good.
Their guide is full of concrete, practical advice that aims to help you create a full career plan that you feel confident in, and it draws on over 10 years of research.

Following

Overthinking Thinking Machines

What happened: More details are leaking out of Thinking Machines Lab and OpenAI following the firing of CTO Barrett Zoph from the former and his return to the latter alongside two other senior TML employees. The news understandably rattled the Thinking Machines investors who are being asked to support a new $50 billion valuation.

People in Silicon Valley were fascinated — and confused. Some initially said Zoph was fired for "unethical conduct," while others claimed he was fired because CEO Mira Murati learned he was planning to leave the company. Since then, we've learned details of a workplace romance, conflict between execs, and more.

According to accounts in the Wall Street Journal and New York Times, Murati's problems with Zoph began this summer, when she learned he was having an undisclosed relationship with a female colleague. Zoph had apparently suggested recruiting that colleague to TML; their romantic involvement had begun back when they were both at OpenAI. Zoph initially denied the relationship, but eventually both parties disclosed it to Murati this summer. The woman subsequently left for OpenAI. Zoph told Murati he'd been "manipulated into the relationship," according to the Journal. (Whatever that means.)

Soon after, Zoph went on a month-long break from work. When he returned, Murati placed him in a technical contributor role with more limited responsibilities. (Zoph told the Journal that this was a common practice for technical managers.)

Meanwhile, Zoph, co-founder Luke Metz, and researcher Sam Schoenholz had increasingly disagreed with Murati's plans for the company, and felt its releases had been disappointing relative to its competitors. One flashpoint came when Meta initiated talks to buy TML; Zoph wanted Murati to pursue the deal, but she declined.

Then last week, according to the Times, the trio showed up to Zoph's scheduled one-on-one with Murati. They requested that Zoph be given final say on technical decisions — a power Murati held at the company. Murati asked if the trio had already committed to jobs somewhere else; Schoenholz and Metz said no, but Zoph refused to answer. Zoph was fired two days later. A few hours later, the three TML employees joined OpenAI. Since then, nine more employees of the roughly 100-person startup have left for OpenAI or gotten OpenAI offers. Zoph will now lead OpenAI's push to sell to more enterprise customers.

Why we're following: Gossip aside, it's still not entirely clear whether Zoph's departure had more to do with workplace romance, disagreements about the startup's direction, or something else. From the outside, it looks like Murati fired Zoph because she already knew he was leaving. (Or maybe she had been looking for an excuse to do so.) If we have the details right, though, it looks like both participants in the ill-fated office love story are back at OpenAI. That could lead to some awkward hallway conversations.

As for Thinking Machines, it's never a good look when multiple co-founders leave within a matter of months. But the most important question about the company isn't why those employees left. It's whether the group that remains can still deliver.

What people are saying: "Thinking Machines Lab terminated my employment only after it learned I would be leaving the company. Full stop," Zoph said in a statement to the WSJ.
"At no time did TML cite to me any performance reasons or any unethical conduct on my part as the reason for my termination and any suggestion otherwise is false and defamatory," he added.

Venky Ganesan, a partner at investment firm Menlo Ventures, shared some existential reflections with the Times in the wake of the news. "I am reminded of this line from Anna Karenina: 'All happy families are alike; each unhappy family is unhappy in its own way,'" he said. "Happy companies require many things to go right simultaneously. You only need one or two things to go wrong to have an unhappy company."

—Ella Markianos

What happened: In a new analysis, the New York Times and the Center for Countering Digital Hate estimated that Grok has created and publicly shared at least 1.8 million sexualized images of women. That means at least 41 percent of the 4.4 million images posted publicly over just nine days likely contained such images. Separately, the CCDH estimated that more than 23,000 sexualized images created by Grok depicted children.

Grok is by far the worst actor in the space. But while Meta's AI doesn't let users create the sort of images Grok does, the company is still making money off of AI "nudifying" apps like Grok. An audit by Indicator found that despite promising to crack down on nudifier ads in June of last year, Meta has run at least 4,431 nudifier ads since Dec. 4 across its various platforms. (A Meta spokesperson said the company is reviewing and enforcing policies against violative ads.) The "good" news, Indicator's Alexios Mantzarlis points out, is that some of the ads appear to be scams for apps that won't actually nudify people. It's hard to have much sympathy for people seeking to make deepfake nudes, but it's a shame that Meta profits from the exchange anyway.

Why we're following: Regulators around the world are reckoning with an unprecedented scale of abuse enabled by social platforms. And the primary culprits of the abuse, X and its owner Elon Musk, have repeatedly refused to even acknowledge the problem. Just last week, Musk said he was "not aware of any naked underage images generated by Grok. Literally zero," and X only said it would prevent such images from being generated after weeks of backlash.

What people are saying: "This is industrial-scale abuse of women and girls," Imran Ahmed, the chief executive of the CCDH, told the Times, pointing to the ease of use and distribution unique to Grok. Mantzarlis told the BBC last year that "this abuse vector re