UK gov lab found a universal GPT-5.5 jailbreak in just 6 hours

The UK AI Safety Institute (AISI) officially confirmed that expert red-teamers broke through GPT-5.5's guardrails in just six hours, developing what is called a universal jailbreak: a technique that bypassed those guardrails across every malicious query OpenAI provided for testing. The attack worked even in multi-turn settings, where the AI operates over several steps without human intervention. OpenAI subsequently pushed several updates to its safety stack, but a configuration issue in the version made available to AISI meant the institute could not verify whether the final fix actually worked.

The evaluation report was published on April 30, 2026, but drew little attention until recent news prompted people to dig into the timeline. AISI also separately assessed a model called Mythos, which had only minimal guardrails to begin with.

Key points

  • UK's AISI officially found a universal jailbreak in GPT-5.5 during a formal evaluation
  • Expert red-teamers needed only six hours to develop the bypass
  • The attack worked across all malicious cyber queries and in multi-step settings
  • OpenAI patched the guardrails, but AISI could not verify the final version due to a config issue
  • The report was published in late April 2026 but only recently gained wider attention