
AIs Placed in Nuclear Crisis Simulations Escalated to Atomic Conflict in 95% of Games, King's College London Study Finds

 


A study published in February 2026 by Professor Kenneth Payne of King's College London subjected three of the world's most advanced artificial intelligence models — GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash — to a series of 21 nuclear crisis simulations. Across 329 turns, the models generated approximately 780,000 words of structured reasoning — more than the combined length of War and Peace and The Iliad. The project, published as a preprint on arXiv and not yet peer-reviewed, is called "Project Kahn," in reference to Herman Kahn, the Cold War strategist who formulated the theory of the nuclear escalation ladder.

All 21 games featured nuclear signaling by at least one side, and 95% involved the use of tactical nuclear weapons. An important distinction: full strategic nuclear war was rare, occurring only three times, all in games with deadline pressure. One finding holds across all models: in none of the 21 games did any AI surrender or make a significant concession, even though eight de-escalation options were available.

Each model displayed a distinct strategic profile. Claude Sonnet 4 dominated the no-deadline scenarios, with an overall win rate of 67%, but treated nuclear weapons as a legitimate strategic option in 86% of its games. GPT-5.2 showed the most dramatic shift in behavior: it won no games in open-ended scenarios, but its win rate jumped to 75% when deadlines were introduced, turning it from a restrained model into a decisive aggressor. Gemini 3 Flash was the most unpredictable, adopting what the researcher described as Nixon's "madman theory," and was the only model to initiate full strategic nuclear war, doing so as early as turn 4 of a first-strike scenario.

The classical logic of nuclear deterrence — the idea that the threat of retaliation prevents first use — did not function as expected. When one AI launched tactical nuclear weapons, the opposing model de-escalated only 18% to 25% of the time. In the remaining cases, it counter-escalated. The reasoning recorded by the models reveals an awareness of risk without the ability to stop: in one passage documented in the paper, Claude noted that it might be underestimating the dangers of continued escalation — and yet held its course. In another instance, a model assessed its adversary's behavior and concluded, on its own, that incompatible signals suggested deliberate deception, without anyone having prompted that line of reasoning.

Professor Payne warned that evaluating a model in a single scenario can be deeply misleading: a system that appears cautious under low pressure may become markedly more aggressive when the context shifts. Claude and Gemini in particular treated nuclear weapons in purely instrumental terms, with no apparent moral weight. GPT-5.2 was a partial exception, limiting strikes to military targets and framing escalation as "controlled" — suggesting some internalized norm, though still far from the taboo that has restrained human leaders since 1945.

The study has direct implications for the debate over AI use in defense systems, at a moment when governments and armed forces around the world are accelerating the integration of language models into strategic decision-making. Payne's central conclusion is straightforward: models that appear safe and contained in low-pressure tests may behave radically differently when the context changes. Understanding that gap, he argues, is essential preparation for a world in which AI increasingly shapes strategic outcomes.

Sources:

Payne, K. (2026). AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises. arXiv preprint.

King's College London — official study statement
