All news with #prompt leakage tag

Wed, November 19, 2025

Amazon Bedrock Guardrails Expand Code-Related Protections

#AWS #Bedrock Guardrails #Safety Guardrails #Prompt Leakage #PII

🔒 Amazon Web Services expanded Amazon Bedrock Guardrails to cover code-related use cases, enabling detection and prevention of harmful content embedded in code. The update applies content filters, denied topics, and sensitive information filters to code elements such as comments, variable and function names, and string literals. The enhancements also include prompt leakage detection in the standard tier and are available in all supported AWS Regions via the console and APIs.

Wed, November 12, 2025

Tenable Reveals New Prompt-Injection Risks in ChatGPT

#AI Security #Prompt Injection #Prompt Leakage #AI Data Leakage #OpenAI #Tenable #Microsoft

🔐 Researchers at Tenable disclosed seven techniques that can cause ChatGPT to leak private chat history by abusing built-in features such as web search, conversation memory and Markdown rendering. The attacks are primarily indirect prompt injections that exploit a secondary summarization model (SearchGPT), Bing tracking redirects, and a code-block rendering bug. Tenable reported the issues to OpenAI, and while some fixes were implemented several techniques still appear to work.

Mon, November 10, 2025

Browser Security Report 2025: Emerging Enterprise Risks

#AI Security #Agentic AI #Prompt Leakage #SaaS #Data Loss Prevention #Browser Extensions #Session Management #SSO

🛡️ The Browser Security Report 2025 warns that enterprise risk is consolidating in the user's browser, where identity, SaaS, and GenAI exposures converge. The research shows widespread unmanaged GenAI usage and paste-based exfiltration, extensions acting as an embedded supply chain, and a high volume of logins occurring outside SSO. Legacy controls like DLP, EDR, and SSE are described as operating one layer too low. The report recommends adopting session-native, browser-level controls to restore visibility and enforce policy without disrupting users.

Mon, November 10, 2025

Whisper Leak side channel exposes topics in encrypted AI

#AI Security #Inference Security #Prompt Leakage #AI Data Leakage #Microsoft #OpenAI #Mistral AI

🔎 Microsoft researchers disclosed a new side-channel attack called Whisper Leak that can infer the topic of encrypted conversations with language models by observing network metadata such as packet sizes and timings. The technique exploits streaming LLM responses that emit tokens incrementally, leaking size and timing patterns even under TLS. Vendors including OpenAI, Microsoft Azure, and Mistral implemented mitigations such as random-length padding and obfuscation parameters to reduce the effectiveness of the attack.

Mon, November 10, 2025

Researchers Trick ChatGPT into Self Prompt Injection

#AI Security #Prompt Injection #Prompt Leakage #AI Data Leakage #Data Exfil via Tools #OpenAI #Tenable

🔒 Researchers at Tenable identified seven techniques that can coerce ChatGPT into disclosing private chat history by abusing built-in features like web browsing and long-term Memories. They show how OpenAI’s browsing pipeline routes pages through a weaker intermediary model, SearchGPT, which can be prompt-injected and then used to seed malicious instructions back into ChatGPT. Proof-of-concepts include exfiltration via Bing-tracked URLs, Markdown image loading, and a rendering quirk, and Tenable says some issues remain despite reported fixes.

Fri, November 7, 2025

Whisper Leak: Side-Channel Attack on Remote LLM Services

#AI Security #Inference Security #Prompt Leakage #AI Data Leakage #Microsoft #OpenAI #Mistral

🔍 Microsoft researchers disclosed "Whisper Leak", a new side-channel that can infer conversation topics from encrypted, streamed language model responses by analyzing packet sizes and timings. The study demonstrates high classifier accuracy on a proof-of-concept sensitive topic and shows risk increases with more training data or repeated interactions. Industry partners including OpenAI, Mistral, Microsoft Azure, and xAI implemented streaming obfuscation mitigations that Microsoft validated as substantially reducing practical risk.

Thu, November 6, 2025

Multi-Turn Adversarial Attacks Expose LLM Weaknesses

#AI Security #Prompt Injection #AI Red Teaming #Prompt Leakage #Data Exfil via Tools

🔍 Cisco AI Defense's report shows open-weight large language models remain vulnerable to adaptive, multi-turn adversarial attacks even when single-turn defenses appear effective. Using over 1,000 prompts per model and analyzing 499 simulated conversations of 5–10 exchanges, researchers found iterative strategies such as Crescendo, Role-Play and Refusal Reframe drove failure rates above 90% in many cases. The study warns that traditional safety filters are insufficient and recommends strict system prompts, model-agnostic runtime guardrails and continuous red-teaming to mitigate risk.

Wed, November 5, 2025

Cloud CISO: Threat Actors' Growing Use of AI Tools

#AI Security #Prompt Injection #Prompt Leakage #Hugging Face #Google #Google DeepMind #Data Exfil via Tools

⚠️Google's Threat Intelligence team reports a shift from experimentation to operational use of AI by threat actors, including AI-enabled malware and prompt-based command generation. GTIG highlighted PROMPTSTEAL, linked to APT28 (FROZENLAKE), which queries a Hugging Face LLM to generate scripts for reconnaissance, document collection, and exfiltration, while adopting greater obfuscation and altered C2 methods. Google disabled related assets, strengthened model classifiers and safeguards with DeepMind, and urges defenders to update threat models, monitor anomalous scripting and C2, and incorporate threat intelligence into model- and classifier-level protections.

Wed, October 29, 2025

Open-Source b3 Benchmark Boosts LLM Security Testing

#AI Security #Agentic AI #Model Evaluation Coverage #Open-Weight Models #System Prompt Exposure #Prompt Leakage

🛡️ The UK AI Security Institute (AISI), Check Point and Lakera have launched b3, an open-source benchmark to assess and strengthen the security of backbone LLMs that power AI agents. b3 focuses on the specific LLM calls within agent workflows where malicious inputs can trigger harmful outputs, using 10 representative "threat snapshots" combined with a dataset of 19,433 adversarial attacks from Lakera’s Gandalf initiative. The benchmark surfaces vulnerabilities such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls, making LLM security more measurable, reproducible and comparable across models and applications.

Fri, October 24, 2025

AI 2030: The Coming Era of Autonomous Cybercrime Threats

#AI Security #Agentic AI #Autonomous Agents #AI Data Leakage #Prompt Leakage

🔒 Organizations worldwide are rapidly adopting AI across enterprises, delivering efficiency gains while introducing new security risks. Cybersecurity is at a turning point where AI fights AI, and today's phishing and deepfakes are precursors to autonomous, self‑optimizing AI threat actors that can plan, execute, and refine attacks with minimal human oversight. In September 2025, Check Point Research found that 1 in 54 GenAI prompts from enterprise networks posed a high risk of sensitive-data exposure, underscoring the urgent need to harden defenses and govern model use.

Tue, October 7, 2025

Enterprise AI Now Leading Corporate Data Exfiltration

#AI Security #AI Data Leakage #Data Leak #PII #Prompt Leakage

🔍 A new Enterprise AI and SaaS Data Security Report from LayerX finds that generative AI has rapidly become the largest uncontrolled channel for corporate data loss. Real-world browser telemetry shows 45% employee adoption of GenAI, 67% of sessions via unmanaged accounts, and copy/paste into ChatGPT, Claude, and Copilot as the primary leakage vector. Traditional, file-centric DLP tools largely miss these action-based flows.

Wed, September 17, 2025

Rethinking AI Data Security: A Practical Buyer's Guide

#AI Security #Data Loss Prevention #Prompt Leakage #Shadow IT

🛡️ Generative AI is now central to enterprise work, but rapid adoption has exposed gaps in legacy security models that were not designed for last‑mile behaviors. The piece argues buyers must reframe evaluations around real-world AI use — inside browsers and across sanctioned and shadow tools — and prioritize solutions offering real-time monitoring, contextual enforcement, and low‑friction deployment. It warns against blunt blocking and promotes nuanced controls such as redaction, just‑in‑time warnings, and conditional approvals to protect data while preserving productivity.

Tue, August 12, 2025

The AI Fix Episode 63: Robots, GPT-5 and Ethics Debate

#AI Security #Content Provenance #OpenAI #Open-Weight Models #Prompt Leakage

🎧 In episode 63 of The AI Fix, hosts Graham Cluley and Mark Stockley dissect a wide range of AI developments and controversies. Topics include Unitree Robotics referencing Black Mirror to market its A2 robot dog, concerns over shared ChatGPT conversations appearing in Google, and OpenAI releasing gpt-oss, its first open-weight model since GPT-2. The show also examines ethical issues around AI-created avatars of deceased individuals and separates the hype from the reality of GPT-5 claims.