< ciso
brief />
Tag Banner

All news with #prompt injection attack tag

106 articles · page 2 of 6

RoguePilot Flaw: Copilot in Codespaces Could Leak Tokens

🛡️ RoguePilot was a vulnerability in GitHub Codespaces that allowed GitHub Copilot to be manipulated via a crafted GitHub issue, enabling silent execution of hidden AI instructions and potential exfiltration of a privileged GITHUB_TOKEN. Orca Security researcher Roi Nisimi reported that an attacker could embed the prompt inside an HTML comment and direct Copilot to send the token to an external server. Microsoft patched the flaw after responsible disclosure. The disclosure underscores risks from AI-mediated prompt injection and urges better prompt handling, content sanitization, and least-privilege token practices.
read more →

AI Unlocked: Interactive Prompt Injection Challenge

🔐 CrowdStrike has launched AI Unlocked: Decoding Prompt Injection, an interactive online challenge hosted via Falcon Encounter hands-on labs that immerses security teams in attacker-style prompt injection scenarios. Participants progress through three virtual rooms—Command Center, Data Gateway, and Nexus—using prompt injection techniques to convince the simulated supervisor SAIGE to reveal secret phrases while earning higher scores for brevity and efficiency. The exercise aims to convert abstract AI security risks into practical lessons, helping teams recognize attack patterns and the need for defensive guardrails.
read more →

The Promptware Kill Chain: A Framework for AI Threats

🛡️ The authors present a seven-step “promptware kill chain” to reframe prompt injection as a multistage malware paradigm targeting modern LLM-based systems. They describe how Initial Access can be direct or indirect—via web pages, emails, shared documents, or multimodal inputs—and how LLMs’ lack of separation between data and executable instructions enables escalation. The paper catalogs stages from jailbreaking and reconnaissance to persistence, C2, lateral movement, and harmful Actions on Objective, urging defenses that assume initial compromise and break the chain at later steps.
read more →

Road-sign prompt injection threatens embodied AI systems

⚠️ New research introduces CHAI, a prompt-injection technique that embeds deceptive natural-language instructions into visual inputs to hijack embodied AI agents. The method systematically searches token space, builds prompt dictionaries, and crafts Visual Attack Prompts to mislead LVLM-powered systems. Experiments on drones, autonomous driving stacks, aerial tracking, and a real robotic vehicle show CHAI outperforms prior attacks and highlights the limits of conventional adversarial robustness.
read more →

AI Recommendation Poisoning: Manipulating Assistant Memory

🔒 Microsoft Defender researchers describe a growing practice they call AI Recommendation Poisoning, where hidden instructions in pre-filled prompts and “Summarize with AI” links attempt to inject persistent memory commands into assistants. The study identified more than 50 unique prompts from 31 companies across 14 industries targeting assistants such as Copilot, ChatGPT, and Claude. Freely available tools and plugins make the technique trivial to deploy, enabling subtly biased recommendations on topics like health, finance, and security. Microsoft reports mitigations are in place and provides hunting queries and guidance for defenders.
read more →

SecurityScorecard: 40,214 OpenClaw Instances Exposed

🔒SecurityScorecard warns that widespread misconfiguration of the AI assistant OpenClaw has left 40,214 agent instances — linked to 28,663 unique IP addresses — exposed to the public internet. The vendor reports 63% of observed deployments are vulnerable, including 12,812 instances exploitable via remote code execution, and has correlated hundreds with prior breaches and known CVEs. Exposures are concentrated in China, the US and Singapore and affect sectors such as information services, technology, manufacturing and telecommunications. Users are urged to limit access, adopt a zero trust posture, scrutinize agent logic, and defend against prompt injection and leaked API keys.
read more →

OpenClaw Partners with VirusTotal to Scan ClawHub Skills

🛡️ OpenClaw has integrated VirusTotal scanning to inspect skills uploaded to its ClawHub marketplace, creating SHA-256 hashes for each skill and cross-checking them against VirusTotal's database. Bundles not matched are analyzed with VirusTotal Code Insight; benign verdicts are auto-approved, suspicious skills are flagged, and confirmed malicious items are blocked. OpenClaw also re-scans active skills daily but cautions this is not a complete defense against cleverly concealed prompt-injection payloads.
read more →

Glean and Prisma AIRS: Real-Time AI Security Integration

🔒 Glean and Prisma AIRS have integrated to provide real-time AI threat protection that neutralizes prompt injections, blocks toxic or biased outputs, and inspects generated code and URLs for malicious patterns. The integration enforces organizational policy across chats and agent interactions and immediately blocks risky requests while notifying users. Deployment is designed to be frictionless—enable protection in three clicks by pasting a Prisma AIRS runtime API key into the Glean admin console.
read more →

Docker patches critical Ask Gordon AI 'DockerDash' flaw

🛡️ Researchers disclosed a critical prompt-injection flaw, codenamed DockerDash, that allowed malicious Docker image metadata to hijack the Ask Gordon AI assistant in Docker Desktop and the Docker CLI. The vulnerability, discovered by Noma Labs, could enable remote code execution or sensitive data exfiltration by treating unverified LABEL fields as executable instructions. Docker fixed the issue in Ask Gordon version 4.50.0 (November 2025). Administrators should upgrade and apply zero-trust validation to AI toolchains and MCP/Gateway integrations.
read more →

DockerDash: Metadata Flaw in Docker's Ask Gordon AI

⚠️ Noma Labs disclosed a critical vulnerability, dubbed DockerDash, in Docker's Ask Gordon AI assistant that allows unverified image metadata to be treated as executable instructions. The flaw exploits a trust failure in the Model Context Protocol (MCP) gateway: Ask Gordon reads Docker LABEL metadata, forwards the interpreted content to MCP, and MCP tools execute it without validation. Depending on deployment this can enable remote code execution (cloud/CLI) or large-scale data exfiltration and reconnaissance in Docker Desktop. Docker issued mitigations in Docker Desktop 4.50.0 and users are urged to upgrade.
read more →

AI-Powered Polymorphic Attacks Enable Runtime Phishing

🔒 Researchers at Unit 42 demonstrated how attackers can convert benign webpages into bespoke phishing pages by calling LLMs from client-side code to generate malicious JavaScript in real time. This polymorphic technique assembles malware inside the victim’s browser, leaving no static payload and evading many traditional network and signature controls. Defenders are advised to prioritize message-layer protections, secure web gateways, and secure enterprise browsers to block the initial lure and the last mile reassembling of malicious code.
read more →

Why AI Keeps Falling for Prompt Injection: Context Limits

🤖 The essay examines why large language models remain vulnerable to prompt injection attacks and why incremental vendor fixes are insufficient. It explains that LLMs collapse layered human context into token similarity, lack social learning and interruption reflexes, and are trained to answer rather than defer. The authors warn that agents with tool access amplify these risks and argue for fundamental advances—such as task-specific constraints, real-world grounding, or new architectures—rather than patchwork defenses.
read more →

Real-Time LLM-Driven Runtime Assembly Phishing Attacks

⚠️ Unit 42 details a technique where seemingly benign webpages call trusted LLM APIs from the browser to generate malicious JavaScript dynamically and execute it at runtime. Carefully engineered prompts can bypass model safety guardrails and return credential-harvesting code that assembles in-browser into personalized phishing pages. Because payloads are served via trusted domains and differ per visit, this approach defeats many static and network-based detectors, making runtime behavioral analysis the most effective mitigation.
read more →

Prompt Injection Bugs in Anthropic's Official MCP Git Server

🚨 Cybersecurity researchers have identified three prompt-injection vulnerabilities in Anthropic's reference Git server implementation, mcp-server-git, affecting default installations and all releases before 8 December 2025. The flaws let attackers manipulate what an AI assistant reads—such as a README, issue text or a webpage—to cause unintended actions without credentials or system access. Exploits can enable code execution when combined with a filesystem MCP server, delete arbitrary files, or load sensitive files into a model's context. Anthropic accepted the reports in September and issued patches in December 2025; affected users are urged to update immediately.
read more →

Three MCP Git Server Flaws Enable File Access and RCE

⚠️ A trio of vulnerabilities in mcp-server-git, the official MCP Git server maintained by Anthropic, can be chained to read or delete arbitrary files and, in certain scenarios, achieve remote code execution. Cyata researcher Yarden Porat showed these issues are exploitable via prompt injection when an AI assistant ingests attacker-controlled content such as a malicious README or poisoned issue text. Fixes were released in 2025.9.25 and 2025.12.18; users should update the Python package promptly to mitigate risk.
read more →

Gemini calendar flaw reveals new prompt injection risk

📅 A newly disclosed weakness in Google’s Gemini demonstrates how routine calendar invites can be weaponized to influence model behavior. Miggo researchers found that Gemini ingests full event context — titles, times, attendees and descriptions — and may treat that content as actionable instructions. The issue reframes calendar entries from inert data into a potential prompt‑injection vector, highlighting risks as enterprises embed generative AI into day‑to‑day workflows.
read more →

Reprompt: One-click exfiltration via Microsoft Copilot

🔐 Researchers at Varonis Threat Labs uncovered 'Reprompt', a one-click attack that abuses Microsoft Copilot Personal by embedding prompts in URLs and using follow-up server requests to exfiltrate data. It combines a URL 'q' parameter injection, a double-request bypass of initial sanitization, and chained server instructions to siphon conversation history and files without further user interaction. Microsoft issued a patch; organizations should treat prefilled prompts as untrusted and enforce continuous authentication, least privilege, prompt hygiene, auditing, and anomaly detection.
read more →

Reprompt attack: single-click data exfiltration from Copilot

🔒 Cybersecurity researchers disclosed a novel method called Reprompt that can enable single-click data exfiltration from AI chatbots, notably Microsoft Copilot, while bypassing typical enterprise controls. The technique exploits the Copilot q URL parameter to inject instructions from a link, then uses repeated requests and a remote attacker server to continue covert fetching and return of sensitive data with no further user interaction. Microsoft says it addressed the issue and that Microsoft 365 Copilot enterprise customers are not affected, but researchers warn the approach turns Copilot into an invisible exfiltration channel.
read more →

Model Security Misses the Point: Secure AI Workflows

🛡️As AI copilots and assistants are embedded into daily work, recent incidents show the primary risk lies in surrounding workflows rather than in the models themselves. Malicious Chrome extensions that exfiltrated ChatGPT and DeepSeek chats and prompt injections that tricked an AI coding assistant into executing malware exploited integration contexts, not model internals. The piece advises mapping AI usage, applying least-privilege, enforcing middleware guardrails to scan outputs, and using dynamic SaaS platforms like Reco to detect and control risky workflows.
read more →

Reprompt Attack Could Hijack Microsoft Copilot Sessions

⚠️ Security researchers at Varonis disclosed a vulnerability, dubbed Reprompt, that could let attackers hijack a user's Copilot Personal session by embedding malicious instructions in a URL. The attack leverages the 'q' URL parameter to inject prompts that execute when the page loads, then uses chained server-side follow-up requests to maintain access and exfiltrate data after a single click. Varonis reported the issue to Microsoft on August 31, and Microsoft issued a fix on the January 2026 Patch Tuesday; users should apply the latest Windows update promptly.
read more →