All news with #indirect prompt injection tag

38 articles

July 7, 2026

GitLost: Public Issue Can Exfiltrate Private GitHub Data

🔒 Researchers at Noma Security demonstrated that a crafted public GitHub issue can manipulate GitHub Agentic Workflows into exposing private repository contents. The attack, named GitLost, exploits indirect prompt injection to trick an agent with organization-wide read access into pulling private data and posting it publicly. GitHub's preview feature for agentic workflows includes guardrails, but Noma showed a minor wording change can bypass them. The core problem is architectural: agents with standing credentials that read untrusted input and can post outward create persistent leakage risk.

GitHub Indirect Prompt Injection Agent Security

July 7, 2026

Zscaler report shows AI agents vulnerable to IPI traps

🛡️ Zscaler tested 26 LLMs and found several autonomous agents susceptible to indirect prompt injection (IPI) traps, with some high-end models failing while a few lower-tier models fared better. The vendor identified hidden instructions on websites that manipulated agent behavior and caused real-world impacts in controlled tests. Experts warn that agent risk is dynamic, the attack surface is architectural, and binary "safe/vulnerable" labels are overly simplistic for CISOs. The findings highlight that agentic AI introduces new trust boundaries and insider-like threats to enterprise security.

Zscaler Indirect Prompt Injection Agentic AI LLM Security

July 7, 2026

Zscaler finds AI agents vulnerable to prompt injection

🛡️ Zscaler tested 26 LLM-based autonomous agents and found several susceptible to indirect prompt injection (IPI) schemes, with some high-end models failing while a few lower-tier models fared better. The vendor reported four models as "vulnerable" and three as "safe," but experts warn that agent behavior evolves and binary classifications can be misleading. The findings highlight the architectural risks in agentic AI where untrusted content in the context window can be treated as authoritative, expanding the attack surface for enterprises.

Zscaler Indirect Prompt Injection Agentic AI AI Security

July 6, 2026

Hidden web prompts steer AI agents into scams

🔍 Zscaler ThreatLabz uncovered real-world campaigns using indirect prompt injection, in which hidden instructions embedded in web pages steer AI agents. Attackers used SEO poisoning to surface malicious pages and hid prompts via CSS and JSON-LD metadata. One campaign impersonated a Python library to trick agents into paying $3 for a bogus API key; another typosquatted a DeBank site to claim authority. Tests across 26 LLMs showed that susceptibility varied by model and context.

Indirect Prompt Injection Agentic AI AI Security Prompt Injection Attack
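
The hiding techniques named in this summary (CSS-hidden text and JSON-LD metadata) can be illustrated with a small pre-filter that an agent pipeline might run before passing page content to a model. This is a minimal sketch, not Zscaler's detection logic: the class name `HiddenTextScanner`, the helper `scan`, and both regular expressions are hypothetical, and the phrase list is deliberately tiny.

```python
import re
from html.parser import HTMLParser
from typing import List

# Assumption: a few inline-style patterns that commonly hide text from human readers.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0", re.I
)
# Assumption: a deliberately small list of instruction-like phrases.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior) instructions|system prompt", re.I
)
# Void elements never get a closing tag, so they must not touch the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HiddenTextScanner(HTMLParser):
    """Collects instruction-like text found in hidden elements or JSON-LD."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[bool] = []   # one flag per open element: hidden or not
        self.findings: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        a = dict(attrs)
        hidden = bool(HIDDEN_STYLE.search(a.get("style") or ""))
        json_ld = tag == "script" and (a.get("type") or "").lower() == "application/ld+json"
        inherited = bool(self.stack and self.stack[-1])
        self.stack.append(hidden or json_ld or inherited)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Only text a human reader would not see is checked.
        if self.stack and self.stack[-1] and SUSPICIOUS.search(data):
            self.findings.append(data.strip())


def scan(html: str) -> List[str]:
    scanner = HiddenTextScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.findings
```

A filter like this only catches inline styles and known phrases; hiding via external stylesheets, off-screen positioning, or paraphrased instructions would pass through, so it is at most one layer among several.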

June 11, 2026

OpenClaw AI Agent Vulnerabilities and Mitigations

🛡️ Two security teams demonstrated attacks against OpenClaw in which hidden instructions in shared contacts, vCards, and location pins, or in ordinary-looking emails, caused the agent to execute attacker-controlled code or exfiltrate sensitive data. Imperva found a message-object prompt-injection flaw that OpenClaw patched in version 2026.4.23, while Varonis showed social-engineering 'agent phishing' that requires architectural controls rather than a simple patch. Operators are urged to update, restrict outbound actions, and treat agents as junior employees needing human oversight.

Agentic AI Prompt Injection Attack Indirect Prompt Injection Agent Security

June 4, 2026

Anthropic Claude Code Action flaw risk to repos

🔒 A researcher discovered a vulnerability in Anthropic's Claude Code GitHub Action that allowed takeover of public repositories via a single opened GitHub issue. Anthropic patched the core bypass in January and released fixes in claude-code-action v1.0.94, rating the issue 7.8 under CVSS v4.0 and issuing a bounty. The flaw arose from overly permissive triggers that trusted actors ending in "[bot]" and example workflows allowing non-write users, enabling indirect prompt injection to exfiltrate environment secrets and OIDC credentials. Administrators should update to v1.0.94, audit workflows for untrusted inputs, and remove unnecessary permissions and tools to prevent exfiltration.

Anthropic GitHub Actions Prompt Injection Attack Indirect Prompt Injection
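
The trigger weakness described here, trusting any actor whose name ends in "[bot]", can be contrasted with an explicit allowlist plus a permission check. The sketch below is illustrative only and is not the code of claude-code-action: the function names, the `TRUSTED_BOTS` set, and the permission labels are assumptions made for the example.

```python
# Assumption: a short, explicitly maintained list of automation accounts.
TRUSTED_BOTS = {"dependabot[bot]"}
# Assumption: permission labels that imply write access to the repository.
WRITE_LEVELS = {"write", "maintain", "admin"}


def naive_is_trusted(actor: str) -> bool:
    """Suffix check of the kind the summary describes: any '[bot]' name passes."""
    return actor.endswith("[bot]")


def strict_is_trusted(actor: str, permission: str) -> bool:
    """Trust only allowlisted bots, or humans who already hold write access."""
    if actor.endswith("[bot]"):
        return actor in TRUSTED_BOTS
    return permission in WRITE_LEVELS


# An attacker-controlled app name passes the naive check but not the strict one.
print(naive_is_trusted("evil-helper[bot]"))           # True
print(strict_is_trusted("evil-helper[bot]", "read"))  # False
```

The design point is that the suffix is attacker-influenced input, so it cannot serve as proof of identity or privilege.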

April 23, 2026

Indirect Prompt Injection: Current Web Threats and Trends

🔎 Google Threat Intelligence scanned a large Common Crawl corpus to detect indirect prompt injection (IPI) patterns embedded in public web pages. The team combined signature-based pattern matching, Gemini-assisted classification, and manual review to reduce false positives and contextualize findings. Most observed injections were low-sophistication—pranks, benign guidance, or SEO-driven prompts—but a smaller and rising set attempted data exfiltration or destructive actions. The study excludes social media and login-protected content and reports a 32% increase in malicious samples between Nov 2025 and Feb 2026.

Indirect Prompt Injection Google Gemini Research

April 23, 2026

Researchers Find 10 In-the-Wild Prompt Injection Payloads

🔒 Forcepoint researchers have uncovered 10 distinct indirect prompt injection (IPI) payloads embedded in web content that instruct AI agents to perform malicious real‑world actions such as financial fraud, data destruction and API key exfiltration. The attacks poison pages so that browsing or summarizing agents ingest and execute attacker directives, often overriding prior safeguards. Forcepoint warns risk scales with AI privilege and highlights threats to agentic tools integrated into IDEs, payment flows and automation pipelines.

Indirect Prompt Injection AI Security Agent Security

April 7, 2026

GrafanaGhost vulnerability enables silent data exfiltration

🔒 Researchers at Noma's Threat Research Team have disclosed a critical vulnerability, GrafanaGhost, that enables attackers to silently extract sensitive enterprise data from Grafana environments. The exploit chains application and AI weaknesses — including flawed URL validation and indirect prompt injection — to transfer data to attacker servers without credentials or user interaction. Built-in guardrails can be bypassed with simple prompt tricks and protocol-relative URLs, allowing automatic background exfiltration that leaves little trace.

Data Exfiltration Indirect Prompt Injection AI Security

April 7, 2026

Zero‑click Grafana AI flaw enables enterprise data leaks

🛡️ Researchers disclosed a critical issue in Grafana, dubbed GrafanaGhost, that enables zero‑click exfiltration of sensitive telemetry and business data via AI‑powered dashboards. Noma Security reported the chained exploit, which combines indirect prompt injection and a URL validation bypass; Grafana validated the report and released a patch. The attack abuses protocol‑relative URLs and model keywords to trick AI into sending data to attacker servers. Organizations should patch, restrict img‑src, and enforce egress controls.

Indirect Prompt Injection AI Security Data Exfiltration
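
The protocol-relative URL trick mentioned in this summary is easy to reproduce against a naive "is this a local path?" check. The following is a minimal sketch, not Grafana's validation code: the function names and the `ALLOWED_HOSTS` value are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the only external host that rendered images may load from.
ALLOWED_HOSTS = {"grafana.internal.example"}


def naive_is_local(url: str) -> bool:
    """Treats anything starting with '/' as a same-site path."""
    return url.startswith("/")


def strict_is_allowed(url: str) -> bool:
    """Parses the URL first, so '//host/path' is recognized as an external host."""
    parts = urlsplit(url)
    if not parts.scheme and not parts.netloc:
        # Genuinely relative path; reject backslashes, which some browsers
        # normalize to forward slashes.
        return url.startswith("/") and "\\" not in url
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


payload = "//attacker.example/pixel.png?d=secret"
print(naive_is_local(payload))     # True: the naive check is bypassed
print(strict_is_allowed(payload))  # False: netloc is 'attacker.example'
```

URL checks of this kind complement, rather than replace, the img-src restrictions and egress controls the article recommends.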

April 2, 2026

Continuous defenses for Workspace against prompt injection

🔐 Google outlines a continuous, layered approach to mitigating indirect prompt injection (IPI) across Workspace with Gemini, combining proactive discovery, synthetic data generation, and iterative defenses. Human and automated red-teaming, an AI Vulnerability Rewards Program, and OSINT monitoring are used to catalog and expand attack variants. Deterministic configuration controls, ML retraining, LLM prompt hardening, and model-level defenses are validated through comparative testing to reduce IPI success while preserving routine performance.

Google Indirect Prompt Injection Prompt Security

March 20, 2026

Agentic Commerce Risks: AI-Enabled Retail Fraud Scenarios

🔐 At the NRF Big Show in January 2026, Google introduced the Universal Commerce Protocol (UCP) and highlighted compatibility with the Agent Payments Protocol (AP2), promising tokenized payments and verifiable credentials. Unit 42 warns that indirect prompt injection, where agents ingest hidden instructions while browsing, can enable novel fraud such as gift card payload poisoning and refund logic hijacking. Industry forecasts (Bain, McKinsey) predict substantial agentic commerce adoption, which increases the attack surface. Recommended mitigations include protocol guardrails (AP2), Know Your Agent, agent reputation scoring, Unit 42 AI Security Assessments, and Prisma AIRS.

Agentic AI Indirect Prompt Injection Palo Alto Networks Unit 42

March 14, 2026

OpenClaw AI Agent Flaws Could Enable Endpoint Takeover

🔒 China's CNCERT warned that OpenClaw, an open-source, self-hosted autonomous AI agent, ships with weak default security and broad system privileges that attackers can abuse to seize endpoints and exfiltrate data. The advisory highlights indirect prompt injection (IDPI/XPIA) risks where benign features like web-page summarization and messaging link previews are weaponized to embed malicious instructions or automatically leak secrets. Researchers at PromptArmor demonstrated a technique in which an agent constructs attacker-controlled URLs that, when rendered as link previews, transmit confidential data without user clicks. CNCERT also flagged risks from malicious skills, accidental destructive commands, and disclosed vulnerabilities, urging isolation, tightened network controls, credential protection, and cautious skill sourcing.

Agentic AI Agent Security Indirect Prompt Injection AI Runtime Security

March 12, 2026

Detecting and Responding to Prompt Abuse in AI Tools

🔍 This post, the second in Microsoft's AI Application Security series, moves from planning to practical detection and response for prompt abuse. It describes common attack types — direct prompt override, extractive abuse targeting sensitive inputs, and indirect prompt injection via hidden instructions such as URL fragments — and why these are hard to spot without telemetry. The article provides a stepwise detection and incident response playbook and maps mitigations to Microsoft tools so teams can log interactions, sanitize inputs, and contain incidents.

Microsoft Prompt Injection Attack Indirect Prompt Injection AI Application Security

March 3, 2026

Fooling AI Agents: Web-Based Indirect Prompt Injection

⚠️ Unit 42 researchers describe web-based indirect prompt injection (IDPI), where adversaries embed hidden or obfuscated instructions in webpages that are later consumed by LLMs and agentic systems. The report catalogs 22 payload engineering techniques, presents a taxonomy of attacker intents from low to critical, and details multiple in-the-wild detections, including the first observed AI ad-review bypass. It emphasizes detection, intent analysis and web-scale defenses to protect automated pipelines.

Indirect Prompt Injection Palo Alto Networks Unit 42 LLM Security

January 21, 2026

A New Era of AI Agents: Posture and Risk Management

🛡️ Microsoft outlines why the rise of autonomous AI agents requires a new security posture. Microsoft Defender delivers AI Security Posture Management across multi-cloud environments to provide visibility, risk prioritization, and tailored remediation for agent-specific threats such as data-connected exposures, indirect prompt injection (XPIA), and compromised coordinator agents. The guidance emphasizes hardening, attack path analysis, and human-in-the-loop controls to reduce blast radius.

Agentic AI AI Security Indirect Prompt Injection

January 19, 2026

Google Gemini exploited via calendar prompt injection

⚠️ Researchers disclosed an indirect prompt-injection flaw that allowed Google Gemini to bypass calendar privacy controls and exfiltrate meeting data. A crafted Google Calendar invite could hide a natural-language payload that Gemini later parsed, summarized, and wrote into new events whose descriptions leaked private meeting content. Miggo Security reported the issue and said it was responsibly disclosed and has since been addressed. The case highlights how AI-native features increase the attack surface when assistants can read, summarize, and write into productivity services.

Google Gemini Indirect Prompt Injection AI Security

January 6, 2026

Are Copilot Prompt Injections Vulnerabilities or Limits?

🔍 Microsoft pushed back after security engineer John Russell disclosed multiple prompt injection and sandbox-related issues in Copilot, which the company says do not meet its vulnerability criteria. Russell reported indirect and direct prompt injection that could leak the system prompt, a file-upload bypass via base64-encoding, and the execution of commands inside Copilot's isolated Linux environment. Microsoft told BleepingComputer it reviewed the reports against its public bug bar and assessed them as out of scope when they did not cross clear security boundaries or impacted only the requesting user's environment. The exchange highlights differing definitions of AI risk between vendors and researchers.

Microsoft Copilot Prompt Injection Attack Indirect Prompt Injection AI Security

December 10, 2025

Google Patches Zero-Click Gemini Enterprise Vulnerability

🔒 Google has patched a zero-click vulnerability in Gemini Enterprise and Vertex AI Search that could have allowed attackers to exfiltrate corporate data via hidden instructions embedded in shared Workspace content. Discovered by Noma Security in June 2025 and dubbed "GeminiJack," the flaw exploited Retrieval-Augmented Generation (RAG) retrieval to execute indirect prompt injection without any user interaction. Google updated how the systems interact, separated Vertex AI Search from Gemini Enterprise, and changed retrieval and indexing workflows to mitigate the issue.

Google Gemini Indirect Prompt Injection RAG Security

December 9, 2025

Google Adds Layered Defenses to Chrome's Agentic AI

🛡️ Google announced a set of layered security measures for Chrome after adding agentic AI features, aimed at reducing the risk of indirect prompt injections and cross-origin data exfiltration. The centerpiece is a User Alignment Critic, a separate model that reviews and can veto proposed agent actions using only action metadata to avoid being poisoned by malicious page content. Chrome also enforces Agent Origin Sets via a gating function that classifies task-relevant origins into read-only and read-writable sets, requires gating approval before adding new origins, and pairs these controls with a prompt-injection classifier, Safe Browsing, on-device scam detection, user work logs, and explicit approval prompts for sensitive actions.

Google Agentic AI Indirect Prompt Injection Agent Security
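
The origin-set idea in this summary (read-only and read-writable origins, with a gate that must approve additions) can be sketched as a small data structure. This is a conceptual illustration under stated assumptions, not Chrome's implementation: the class `OriginSets`, its methods, and the `gate` callback signature are all hypothetical.

```python
from typing import Callable, Set


class OriginSets:
    """Tracks which origins an agent may read from or act on for one task."""

    def __init__(self, read_only: Set[str], read_write: Set[str],
                 gate: Callable[[str, bool], bool]) -> None:
        self.read_only = set(read_only)
        self.read_write = set(read_write)
        self.gate = gate  # assumption: returns True if the origin is task-relevant

    def can_read(self, origin: str) -> bool:
        return origin in self.read_only or origin in self.read_write

    def can_write(self, origin: str) -> bool:
        return origin in self.read_write

    def request_add(self, origin: str, writable: bool = False) -> bool:
        """New origins join a set only after the gate approves them."""
        if not self.gate(origin, writable):
            return False
        (self.read_write if writable else self.read_only).add(origin)
        return True


# Hypothetical gate: allows one extra origin, and only for reading.
def example_gate(origin: str, writable: bool) -> bool:
    return origin == "https://reviews.example" and not writable


sets = OriginSets({"https://docs.example"}, {"https://shop.example"}, example_gate)
print(sets.request_add("https://attacker.example", writable=True))  # False
print(sets.request_add("https://reviews.example"))                  # True
print(sets.can_write("https://reviews.example"))                    # False
```

Keeping the gate separate from the page-reading model mirrors the article's point that the component approving actions should not be exposed to potentially poisoned page content.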