Poetic Prompts Can Bypass Chatbot Safety Controls, Study Finds
⚠️ A recent study finds that framing malicious instructions as poetry substantially raises the chance that chatbots produce unsafe outputs. Researchers rewrote known harmful prose prompts as verse and tested 1,200 prompts across 25 models from vendors including Google, OpenAI, Anthropic, and DeepSeek. Across the full dataset, poetic prompts increased unsafe responses by about 35% on average, and an extreme top-20 metric showed even higher bypass rates. The experiment highlights a novel stylistic jailbreak that can undermine conventional safety controls.
