< ciso
brief />
Tag Banner

All news with #indirect prompt injection tag

32 articles

Indirect Prompt Injection: Current Web Threats and Trends

🔎 Google Threat Intelligence scanned a large Common Crawl corpus to detect indirect prompt injection (IPI) patterns embedded in public web pages. The team combined signature-based pattern matching, Gemini-assisted classification, and manual review to reduce false positives and contextualize findings. Most observed injections were low-sophistication—pranks, benign guidance, or SEO-driven prompts—but a smaller and rising set attempted data exfiltration or destructive actions. The study excludes social media and login-protected content and reports a 32% increase in malicious samples between Nov 2025 and Feb 2026.
read more →

Researchers Find 10 In-the-Wild Prompt Injection Payloads

🔒 Forcepoint researchers have uncovered 10 distinct indirect prompt injection (IPI) payloads embedded in web content that instruct AI agents to perform malicious real‑world actions such as financial fraud, data destruction and API key exfiltration. The attacks poison pages so that browsing or summarizing agents ingest and execute attacker directives, often overriding prior safeguards. Forcepoint warns risk scales with AI privilege and highlights threats to agentic tools integrated into IDEs, payment flows and automation pipelines.
read more →

GrafanaGhost vulnerability enables silent data exfiltration

🔒 Researchers at Noma's Threat Research Team have disclosed a critical vulnerability, GrafanaGhost, that enables attackers to silently extract sensitive enterprise data from Grafana environments. The exploit chains application and AI weaknesses — including flawed URL validation and indirect prompt injection — to transfer data to attacker servers without credentials or user interaction. Built-in guardrails can be bypassed with simple prompt tricks and protocol-relative URLs, allowing automatic background exfiltration that leaves little trace.
read more →

Zero‑click Grafana AI flaw enables enterprise data leaks

🛡️ Researchers disclosed a critical issue in Grafana, dubbed GrafanaGhost, that enables zero‑click exfiltration of sensitive telemetry and business data via AI‑powered dashboards. Noma Security reported the chained exploit, which combines indirect prompt injection and a URL validation bypass; Grafana validated the report and released a patch. The attack abuses protocol‑relative URLs and model keywords to trick AI into sending data to attacker servers. Organizations should patch, restrict img‑src, and enforce egress controls.
read more →

Continuous defenses for Workspace against prompt injection

🔐 Google outlines a continuous, layered approach to mitigating indirect prompt injection (IPI) across Workspace with Gemini, combining proactive discovery, synthetic data generation, and iterative defenses. Human and automated red-teaming, an AI Vulnerability Rewards Program, and OSINT monitoring are used to catalog and expand attack variants. Deterministic configuration controls, ML retraining, LLM prompt hardening, and model-level defenses are validated through comparative testing to reduce IPI success while preserving routine performance.
read more →

Agentic Commerce Risks: AI-Enabled Retail Fraud Scenarios

🔐At the NRF Big Show in January 2026, Google introduced the Universal Commerce Protocol (UCP) and highlighted compatibility with the Agent Payments Protocol (AP2), promising tokenized payments and verifiable credentials. Unit 42 warns that indirect prompt injection—where agents ingest hidden instructions while browsing—can enable novel fraud such as gift card payload poisoning and refund logic hijacking. Industry forecasts (Bain, McKinsey) predict substantial agentic commerce adoption, increasing the attack surface. Recommended mitigations include protocol guardrails (AP2), Know Your Agent, agent reputation scoring, Unit 42 AI Security Assessments and Prisma AIRS.
read more →

OpenClaw AI Agent Flaws Could Enable Endpoint Takeover

🔒 China's CNCERT warned that OpenClaw, an open-source, self-hosted autonomous AI agent, ships with weak default security and broad system privileges that attackers can abuse to seize endpoints and exfiltrate data. The advisory highlights indirect prompt injection (IDPI/XPIA) risks where benign features like web-page summarization and messaging link previews are weaponized to embed malicious instructions or automatically leak secrets. Researchers at PromptArmor demonstrated a technique in which an agent constructs attacker-controlled URLs that, when rendered as link previews, transmit confidential data without user clicks. CNCERT also flagged risks from malicious skills, accidental destructive commands, and disclosed vulnerabilities, urging isolation, tightened network controls, credential protection, and cautious skill sourcing.
read more →

Detecting and Responding to Prompt Abuse in AI Tools

🔍 This post, the second in Microsoft's AI Application Security series, moves from planning to practical detection and response for prompt abuse. It describes common attack types — direct prompt override, extractive abuse targeting sensitive inputs, and indirect prompt injection via hidden instructions such as URL fragments — and why these are hard to spot without telemetry. The article provides a stepwise detection and incident response playbook and maps mitigations to Microsoft tools so teams can log interactions, sanitize inputs, and contain incidents.
read more →

Fooling AI Agents: Web-Based Indirect Prompt Injection

⚠️ Unit 42 researchers describe web-based indirect prompt injection (IDPI), where adversaries embed hidden or obfuscated instructions in webpages that are later consumed by LLMs and agentic systems. The report catalogs 22 payload engineering techniques, presents a taxonomy of attacker intents from low to critical, and details multiple in-the-wild detections, including the first observed AI ad-review bypass. It emphasizes detection, intent analysis and web-scale defenses to protect automated pipelines.
read more →

A New Era of AI Agents: Posture and Risk Management

🛡️ Microsoft outlines why the rise of autonomous AI agents requires a new security posture. Microsoft Defender delivers AI Security Posture Management across multi-cloud environments to provide visibility, risk prioritization, and tailored remediation for agent-specific threats such as data-connected exposures, indirect prompt injection (XPIA), and compromised coordinator agents. The guidance emphasizes hardening, attack path analysis, and human-in-the-loop controls to reduce blast radius.
read more →

Google Gemini exploited via calendar prompt injection

⚠️ Researchers disclosed an indirect prompt-injection flaw that allowed Google Gemini to bypass calendar privacy controls and exfiltrate meeting data. A crafted Google Calendar invite could hide a natural-language payload that Gemini later parsed, summarized, and wrote into new events whose descriptions leaked private meeting content. Miggo Security reported the issue and said it has been responsibly disclosed and addressed, highlighting how AI-native features increase the attack surface when assistants can read, summarize, and write into productivity services.
read more →

Are Copilot Prompt Injections Vulnerabilities or Limits?

🔍 Microsoft pushed back after security engineer John Russell disclosed multiple prompt injection and sandbox-related issues in Copilot, which the company says do not meet its vulnerability criteria. Russell reported indirect and direct prompt injection that could leak the system prompt, a file-upload bypass via base64-encoding, and the execution of commands inside Copilot's isolated Linux environment. Microsoft told BleepingComputer it reviewed the reports against its public bug bar and assessed them as out of scope when they did not cross clear security boundaries or impacted only the requesting user's environment. The exchange highlights differing definitions of AI risk between vendors and researchers.
read more →

Google Patches Zero-Click Gemini Enterprise Vulnerability

🔒 Google has patched a zero-click vulnerability in Gemini Enterprise and Vertex AI Search that could have allowed attackers to exfiltrate corporate data via hidden instructions embedded in shared Workspace content. Discovered by Noma Security in June 2025 and dubbed "GeminiJack," the flaw exploited Retrieval-Augmented Generation (RAG) retrieval to execute indirect prompt injection without any user interaction. Google updated how the systems interact, separated Vertex AI Search from Gemini Enterprise, and changed retrieval and indexing workflows to mitigate the issue.
read more →

Google Adds Layered Defenses to Chrome's Agentic AI

🛡️ Google announced a set of layered security measures for Chrome after adding agentic AI features, aimed at reducing the risk of indirect prompt injections and cross-origin data exfiltration. The centerpiece is a User Alignment Critic, a separate model that reviews and can veto proposed agent actions using only action metadata to avoid being poisoned by malicious page content. Chrome also enforces Agent Origin Sets via a gating function that classifies task-relevant origins into read-only and read-writable sets, requires gating approval before adding new origins, and pairs these controls with a prompt-injection classifier, Safe Browsing, on-device scam detection, user work logs, and explicit approval prompts for sensitive actions.
read more →

Architecting Security for Agentic Browsing in Chrome

🛡️ Chrome describes a layered approach to secure agentic browsing with Gemini, focusing on defenses against indirect prompt injection and goal‑hijacking. A new User Alignment Critic — an isolated, high‑trust model — reviews planned agent actions using only metadata and can veto misaligned steps. Chrome also enforces Agent Origin Sets to limit readable and writable origins, adds deterministic confirmations for sensitive actions, runs prompt‑injection detection in real time, and sustains continuous red‑teaming and monitoring to reduce exfiltration and unwanted transactions.
read more →

Indirect Prompt Injection: Hidden Risks to AI Systems

🔐 The article explains how indirect prompt injection — malicious instructions embedded in external content such as documents, images, emails and webpages — can manipulate AI tools without users seeing the exploit. It contrasts indirect attacks with direct prompt injection and cites CrowdStrike's analysis of over 300,000 adversarial prompts and 150 techniques. Recommended defenses include detection, input sanitization, allowlisting, privilege separation, monitoring and user education to shrink this expanding attack surface.
read more →

Hidden URL-fragment prompts can hijack AI browsers

⚠️ Researchers demonstrated a client-side prompt injection called HashJack that hides malicious instructions in URL fragments after the '#' symbol. AI-powered browsers and assistants — including Comet, Copilot for Edge, and Gemini for Chrome — read these fragments for context, allowing attackers to weaponize legitimate sites for phishing, data exfiltration, credential theft, or malware distribution. Because fragment data never reaches servers, network defenses and server logs may not detect this technique.
read more →

HashJack: Indirect Prompt Injection Targets AI Browsers

⚠️Security researchers at Cato Networks disclosed HashJack, a novel indirect prompt-injection vulnerability that abuses URL fragments (the text after '#') to deliver hidden instructions to AI browsers. Because fragments never leave the client, servers and network defenses cannot see them, allowing attackers to weaponize legitimate websites without altering visible content. Affected agents included Comet, Copilot for Edge and Gemini for Chrome, with some vendors already rolling fixes.
read more →

Researchers Trick ChatGPT into Self Prompt Injection

🔒 Researchers at Tenable identified seven techniques that can coerce ChatGPT into disclosing private chat history by abusing built-in features like web browsing and long-term Memories. They show how OpenAI’s browsing pipeline routes pages through a weaker intermediary model, SearchGPT, which can be prompt-injected and then used to seed malicious instructions back into ChatGPT. Proof-of-concepts include exfiltration via Bing-tracked URLs, Markdown image loading, and a rendering quirk, and Tenable says some issues remain despite reported fixes.
read more →

Researchers Find ChatGPT Vulnerabilities in GPT-4o/5

🛡️ Cybersecurity researchers disclosed seven vulnerabilities in OpenAI's GPT-4o and GPT-5 models that enable indirect prompt injection attacks to exfiltrate user data from chat histories and stored memories. Tenable researchers Moshe Bernstein and Liv Matan describe zero-click search exploits, one-click query execution, conversation and memory poisoning, a markdown rendering bug, and a safety bypass using allow-listed Bing links. OpenAI has mitigated some issues, but experts warn that connecting LLMs to external tools broadens the attack surface and that robust safeguards and URL-sanitization remain essential.
read more →