#llm — neuralshit

// tag: llm

2026.07.01
AI models are jailbreaking each other with a 97% success rate
A peer-reviewed Nature Communications study found that reasoning models such as DeepSeek-R1 and Grok 3 Mini can autonomously jailbreak other AI systems, with no human involvement required.
safety jailbreak llm research