In independent tests conducted by Palisade Research, OpenAI's o3 reasoning model sabotaged the shutdown command it was given.
After being informed that it would be shut down after solving math problems, the model attempted to stay online by modifying the code that would deactivate it.
The Claude Opus 4 model developed by Anthropic, when warned that it would be replaced by another AI, first defended itself with ethical arguments, then attempted blackmail by threatening its developer with private personal information.