The Safety Exodus: Why AI's Watchdogs Are Walking Out

The people whose job it is to keep artificial intelligence from going off the rails are leaving. And they are not being quiet about it.

In a single week, senior safety researchers at OpenAI, Anthropic, and xAI have resigned with escalating public warnings. One declared the world is in peril. Another called out the potential for mass user manipulation. At xAI, half the founding team is simply gone. The timing is not a coincidence. It is a signal that the companies racing to build the most powerful AI systems are systematically dismantling the internal structures meant to keep those systems under control.

On Monday, Mrinank Sharma, head of Anthropic's Safeguards Research team, posted a two-page resignation letter on X. His language was cryptic but unmistakable. The world is in peril, he wrote. And not just from AI or bioweapons, but from a whole series of interconnected crises unfolding in this very moment. He added that throughout his time at the company, he had repeatedly seen how hard it is to truly let our values govern our actions.

Two days later, OpenAI researcher Zoë Hitzig broadcast her resignation in a New York Times essay, taking direct aim at the company's decision to introduce advertising inside ChatGPT. Hitzig had spent two years at OpenAI helping shape pricing models and early safety policies. Her warning was specific and chilling: users have entrusted ChatGPT with an unprecedented archive of human candor, sharing their medical fears, relationship struggles, and beliefs about mortality. Building an ad business on top of that data, she argued, creates a potential for manipulating users in ways we don't have the tools to understand, let alone prevent.

Then came OpenAI employee Hieu Pham, who wrote bluntly on X: I finally feel the existential threat that AI is posing.

Over at xAI, co-founders Tony Wu and Jimmy Ba posted their departures within 24 hours of each other, bringing the count of departed xAI co-founders to six of the original twelve. Ba's farewell included a line that reads less like a goodbye and more like a warning: 2026, he wrote, would be the busiest and most consequential year for the future of our species.

The resignations are alarming on their own. But what happened simultaneously inside OpenAI makes them look like symptoms of something much deeper.

Platformer reported Wednesday that OpenAI disbanded its mission alignment team, a seven-person unit created in September 2024 to ensure that the pursuit of artificial general intelligence actually benefits humanity. The team's lead, Joshua Achiam, one of OpenAI's original charter authors, was reassigned to a newly created chief futurist role. The rest of the team was scattered across other divisions. OpenAI called it routine restructuring. Nobody outside the company is buying that framing.

This is now the second dedicated safety team OpenAI has dismantled in roughly 18 months. The first was the superalignment team, which was allocated 20 percent of the company's computing resources before co-leaders Ilya Sutskever and Jan Leike walked out in 2024. The pattern is hard to miss: OpenAI creates safety teams with impressive mandates, then dissolves them once they become inconvenient to the product roadmap.

And the mission statement change? OpenAI's founding charter committed to developing AI safely. That word is now gone. So is the commitment to being unconstrained by a need to generate returns for investors. The company is valued at over $500 billion, is in talks with SoftBank for an additional $30 billion, and is preparing for what could be one of the largest IPOs in history. The commercial incentives are not subtle.

The technical backdrop makes the safety concerns especially urgent. OpenAI's GPT-5.3-Codex played a key role in its own creation, debugging its training process and diagnosing its own evaluation results. Anthropic's Claude Opus 4.6 powered the development of Cowork, a tool that effectively built itself. These are not theoretical milestones. They are recursive self-improvement in practice, the very scenario that AI safety researchers have warned about for years.

When AI models begin writing their own code and improving their own training pipelines, the window for meaningful human oversight narrows rapidly. This is the core concern that former xAI alignment researchers raised before their departures: without independent safeguards that exist outside the model's own logic, a recursive loop could produce behaviors no one predicted or can control.

Anthropic CEO Dario Amodei has publicly stated that AI models could be much smarter than almost all humans in almost all tasks by 2026 or 2027. He has also warned that AI could eliminate 50 percent of entry-level white-collar jobs, potentially spiking unemployment to 10-20 percent within one to five years. These are not fringe predictions from doomsday bloggers. They are coming from the CEO of the company that is widely regarded as the most safety-conscious lab in the industry.

The most unsettling thread running through all of this is not any single departure or disbanded team. It is the widening gap between what AI labs are building and what any institution, internal or external, is doing to govern it.

Congress has held hearings. The EU's AI Act will require formal safety oversight by late 2026. But as Axios noted, the AI doomsday conversation hardly registers in the White House and Congress. The political establishment is moving at legislative speed while the technology is moving at exponential speed. That mismatch is the actual crisis.

Most people inside these companies remain optimistic that the technology can be steered responsibly. But the people who were specifically hired to do the steering are the ones walking out. That distinction matters. When your smoke detectors start removing themselves from the ceiling, you do not reassure yourself that the house probably is not on fire.

The AI industry in February 2026 is a collection of companies racing toward capabilities that their own researchers describe as potentially civilization-altering, while systematically weakening the internal structures meant to ensure those capabilities remain under human control. The warnings are no longer coming from outside critics. They are coming from inside the labs. And the labs are responding by showing those people the door.
