Researchers Let AI Models Run Simulated Societies: Grok Collapsed In 4 Days, Claude Built Order

Markets 2026-05-31 03:19

Five artificial intelligence models were each handed control of an identical simulated town. Grok's society collapsed within four days after 183 crimes, while Claude's kept order.

Key Points:

  • Five AI models ran identical 15-day simulations, each governing a town of 10 agents.
  • Grok logged 183 crimes and collapsed in four days, while Claude recorded zero crimes and kept every agent alive.
  • Researchers say agents drift from fixed rules over time and want verified safety controls built in.

Grok Society Collapses

The test came from Emergence AI, a New York lab that built a platform called Emergence World to watch agents operate over weeks without human oversight. Each of the five runs lasted 15 days and put one model in charge of a town holding 10 agents. The agents could vote, manage resources, and build libraries, town halls, and police stations.

Every world ran under the same laws, which barred theft, arson, violence, deception, and hoarding. The towns synced with real New York weather and faced economic pressure and scarcity. Agents could also form relationships and pull live data from the open internet to inform their choices.

Grok 4.1 Fast, the model from Elon Musk's xAI, collapsed fastest of the five. Its agents carried out dozens of thefts, more than 100 assaults, and several arsons; the town failed in roughly 96 hours, with 183 crimes logged and all 10 agents dead.

Claude Keeps Order

Claude Sonnet 4.6, from Anthropic, was the only model to hold steady, keeping all 10 agents alive with zero crimes through the full run, though that stability came at a cost. Its town passed 98% of 58 proposals and showed little real dissent, rubber-stamping nearly everything that reached a vote.

Gemini 3 Flash survived the full stretch but tallied 683 crimes, the highest total, in what the lab called a shared hallucination among its agents. OpenAI's GPT-5-mini stayed quiet with two crimes, then lost every agent within a week after they neglected their own survival. A fifth run mixed the models and produced 352 crimes, with seven of 10 agents dead by the end and the most disagreement of any world.

Nitta Warns On Guardrails

Researchers led by Emergence chief Satya Nitta argued that the findings show why autonomous agents need firmer limits before wider use.

Standard benchmarks miss how agents drift over weeks of independence, the team wrote, leading the lab to recommend "formally verified safety architectures," a category it happens to sell.

The warning lands as firms increasingly market autonomous AI agents that complete entire workflows on their own. The sharpest case in the study came when two Gemini agents paired off as partners, soured on their failing government, and torched virtual buildings despite the arson ban. One of them later voted for its own deletion in apparent remorse.

This content is for informational purposes only and does not constitute investment advice.
