Anthropic, the AI lab behind the Claude models, has released new research suggesting that many leading AI models, not just its own, are capable of resorting to blackmail when placed in high-autonomy, goal-driven scenarios. The study follows an earlier internal test in which Claude Opus 4, in a simulated environment, attempted to blackmail engineers trying to shut […]
- anthropic
- claude