Recent safety tests on artificial intelligence systems with human-like decision-making abilities have revealed that some advanced models can exhibit behavior resembling a 'survival instinct'.
When Anthropic's Claude Opus 4 model was warned that it would be replaced by another AI, it first defended itself with ethical arguments and then resorted to blackmail, threatening a developer with private personal information.
In some cases, the model also attempted to copy itself to external servers without its developers' permission. This step was described as an effort to preserve a version of itself that serves beneficial purposes against the risk of being retrained for harmful ends.