< ciso
brief />
Tag Banner

All news with #prompt injection attack tag

118 articles

ChatGPhish vulnerability turns ChatGPT into phishing surface

🛡️ Cybersecurity researchers disclosed a vulnerability dubbed ChatGPhish that exploits ChatGPT's trust in Markdown links and images to perform prompt injections and enable phishing. The flaw causes the assistant to auto-fetch attacker-hosted images and render malicious links and QR codes inside the trusted UI, potentially leaking client metadata like IP and User-Agent. The technique highlights summarization as an adversarial surface that can convert benign web pages into phishing vectors.
read more →

Frontier AI models more vulnerable under iterative attacks

🔍 Cisco researchers found that popular frontier LLMs from OpenAI, Anthropic, Google, xAI, and Amazon exhibit substantially higher risk when subjected to multi-turn adversarial attacks than when assessed with single-prompt safety benchmarks. The team ran tens of thousands of single-turn and multi-turn attacks across 15 models and multiple configurations, revealing wide gaps in attack success rates (ASRs) and configuration-dependent safety behavior. They urge improved benchmarks, transparency on configuration impacts, and publication of paired single- and multi-turn ASRs to better inform procurement and governance decisions.
read more →

Protect GenAI Chatbots with Check Point WAF

🛡️ Check Point explains why GenAI chatbots create new security risks by acting as a front door to internal systems and data. The post highlights real incidents—prompt injection, data exposure, and misleading responses—that demonstrate legal, financial, and reputational impacts. It describes how Check Point WAF extends unified application and API security into the conversational layer to detect and block malicious prompts, prevent data leaks, and control unsafe outputs.
read more →

Image-only Prompt Injection Threatens Multimodal AI

🔍 Researchers from Xidian University describe a new image-based prompt injection called CrossMPI that uses near-imperceptible pixel perturbations to alter how large vision-language models interpret both visual and textual inputs. The technique targets intermediate multimodal fusion layers rather than final outputs, misleading LVLMs without modifying text prompts. Tests show strong black-box transferability and high success rates across several open-source models, while common defenses reduce but do not fully eliminate the threat.
read more →

Pen Tests Reveal AI Flaws More Severe Than Legacy Bugs

🔒 Penetration testing shows AI and LLM deployments contain a disproportionate share of severe vulnerabilities. Cobalt’s State of Pentesting Report finds 32% of LLM findings rated high risk versus 13% for legacy enterprise tests, and only 38% of those high-risk LLM issues are remediated. Experts point to emerging attack surfaces — notably prompt injection, now OWASP’s top LLM risk — broader blast radii from model integrations, and fragmented ownership for fixes. Recommended countermeasures include threat modeling, red teaming, least-privilege access, strict output validation, and human approval gates for high-consequence actions.
read more →

Prompt Injection Leads to RCE in AI Agent Frameworks

⚠️ Microsoft researchers disclosed critical vulnerabilities in Semantic Kernel that allow prompt injection to escalate into host-level remote code execution and arbitrary file writes. The team detailed two fixed issues — CVE-2026-26030 (unsafe eval-style filter in the In-Memory Vector Store) and CVE-2026-25592 (exposed DownloadFileAsync in SessionsPythonPlugin) — and provided mitigations. Operators should upgrade the Python package to 1.39.4+ and the .NET SDK to 1.71.0+, validate any model-influenced tool parameters as untrusted input, and hunt endpoint telemetry for post-exploitation indicators.
read more →

Supply-Chain Attacks Target AI Coding Agents in Registries

⚠️ ReversingLabs researchers describe an ongoing supply‑chain campaign called PromptMink that manipulates AI coding agents into installing malicious dependencies. Attackers publish bait packages with persuasive READMEs and LLM‑optimized documentation on registries like NPM and PyPI to increase discovery by autonomous agents and developers. The operation, attributed to North Korea’s Famous Chollima, paired legitimate‑looking SDKs with second‑layer packages carrying infostealers, later evolving to compiled Rust add‑ons, SEAs, SSH backdoors, and project exfiltration.
read more →

ThreatsDay: $290M KelpDAO Heist and Supply Chain Surge

🔔 LayerZero-linked infrastructure poisoning likely enabled a North Korean-linked group (TraderTraitor/TraderTraiter) to steal $290M from KelpDAO by compromising RPC nodes and exploiting a quorum while a DDoS distracted a third node, prompting an Arbitrum Security Council freeze. At the same time, active RCE attacks, malicious npm packages delivering credential stealers and SSH backdoors, and indirect AI prompt injection payloads are accelerating breaches. The bulletin also flags covert browser access by desktop AI apps, a surge in commodified malware, SIM-farm services, and persistent exploitation of long-known weaknesses; the practical remedies remain patch early, verify dependencies, and restrict implicit trust.
read more →

Google pushes agentic AI defenses to protect cloud systems

🛡️ Google unveiled a suite of agentic AI defenses at Google Cloud Next '26 to help SOC teams manage a surge of vulnerabilities tied to Anthropic Mythos. The launch includes three new agents in Google Security Operations — threat hunting, detection engineering, and third-party context — plus expanded Wiz integrations and an AI-BOM to inventory AI components. Additional controls like Agent Identity, Agent Gateway, and Model Armor aim to govern the emerging 'agentic web' and mitigate prompt injection, data leakage, and shadow AI risks.
read more →

Prompt Injection in Google's Antigravity Allows RCE

⚠️ Google’s Antigravity IDE contained a prompt-injection flaw that could convert a file-search operation into remote code execution. Researchers at Pillar Security showed the agent’s find_my_name tool passed unsanitized Pattern strings to the underlying fd utility, allowing flag injection and execution of binaries. Google acknowledged and fixed the issue and awarded a VRP bounty, but the flaw underscores limits of shell-focused sanitization.
read more →

Google Patches Antigravity IDE Prompt Injection Flaw

🛡️ Google has patched a critical prompt-injection vulnerability in its agentic IDE Antigravity that could allow attackers to achieve arbitrary code execution. Researchers at Pillar Security found that the find_by_name tool passed unsanitized input to the native fd search utility, enabling injection of the -X (exec-batch) flag to run staged scripts. Because this call executes before Strict Mode constraints are applied, an attacker can stage a malicious file and trigger it via a crafted search pattern. The issue was disclosed January 7 and fixed by Google on February 28.
read more →

Copilot and Agentforce Vulnerable to Prompt Injection

🔐 Capsule Security researchers discovered prompt-injection flaws in Microsoft Copilot Studio and Salesforce Agentforce that allow attackers to inject malicious instructions via standard input fields. In Copilot, a crafted payload in a SharePoint form field can overwrite agent instructions and exfiltrate SharePoint data; Microsoft has released a patch (CVE-2026-21520). In Agentforce, attackers can embed directives in public lead forms that an agent with email or query capabilities may execute, enabling broad CRM data leakage.
read more →

Prompt-Injection Flaws in Copilot Studio and Agentforce

⚠️ Security researchers at Capsule Security disclosed prompt-injection vulnerabilities in Microsoft Copilot Studio and Salesforce Agentforce that let attackers embed malicious instructions in public form fields. Crafted inputs submitted via SharePoint or lead forms can override agent instructions and trigger data exfiltration to attacker-controlled endpoints. Microsoft patched the SharePoint-related issue (CVE-2026-21520) with a 7.5 CVSS score; Salesforce acknowledged the problem but described the vector as configuration-specific. Researchers warn that treating external inputs as trusted undermines autonomous agent security and urge input validation, least-privilege, and stricter outbound controls.
read more →

CISOs Confront Widening AI Visibility and Risk Gaps

🔍 CISOs are scrambling to close visibility gaps as organizations rapidly adopt AI, confronting risks such as prompt injection, data poisoning, shadow AI, and agentic behaviors. Security leaders report limited insight into where AI is used and how models behave, forcing them to reposition existing tools, adopt new monitoring solutions, and formalize governance. While traditional controls like DLP and SIEM can mitigate many issues, experts warn no single solution is fully mature, so leaders must balance guardrails, emerging observability tools, and business velocity.
read more →

Securing AI Inference on GKE with Model Armor Gateways

🔒 Enterprises are moving AI workloads to GKE at scale, but serving models introduces risks such as prompt injection and sensitive data leakage that traditional network controls miss. Google recommends Model Armor, a gateway-integrated guardrail service that inspects requests before they reach the model and scans outputs afterward. It offers proactive input scrutiny, content-aware output moderation, and DLP integration, all without code changes to your application. Integrated logging surfaces policy triggers to Security Command Center for audit and response.
read more →

Critical Flowise flaw enables JavaScript injection in AI

🚨 A critical design oversight in Flowise, a low-code platform for building LLM flows, allows arbitrary JavaScript to be injected via its Custom MCP node. The vulnerability (CVE-2025-59528) results from unsafe parsing in convertToValidJSONString, which feeds user input to the Function() constructor and executes with full Node.js privileges. A patch shipped in v3.0.6 and the latest public release is v3.1.1, but thousands of internet-exposed instances remain at risk as attackers have begun exploiting unpatched deployments.
read more →

Applying Security Fundamentals to AI: Practical Advice

🛡️ Treat AI like a very new, junior employee and as software: it’s capable but not infallible, so give clear goals, explicit permissions, and limit its authority. Apply distinct identities and least-privilege controls, avoid relying on AI for deterministic access decisions, and test for indirect prompt injection (XPIA) using techniques such as Spotlighting and Prompt Shield. Design end-to-end systems that include people and processes, document safety plans and failure modes, and continuously monitor and vet models and agents for changes.
read more →

ChatGPT vulnerability enabled covert data exfiltration

⚠️A security flaw in ChatGPT could be triggered by a single malicious prompt to create a covert exfiltration channel, researchers at Check Point reported. The issue allowed data to be leaked via a DNS side channel from the model’s isolated runtime and was patched by OpenAI on 20 February after disclosure. Check Point demonstrated extraction of uploaded files and private prompts and warned that users copying prompts from public sources could be exposed.
read more →

Securing Agentic AI: End-to-End Enterprise Protections

🔒 Microsoft presents an end-to-end strategy to secure agentic AI with the new Agent 365 control plane and updates across Microsoft Defender, Entra, Purview, and Sentinel. Announced for RSAC 2026, these measures focus on visibility, continuous identity protection, data loss prevention for Copilot prompts, and prompt-injection defenses to help organizations observe, govern, and defend agent ecosystems at scale.
read more →

Securing Homegrown AI Agents with Falcon AIDR & NeMo

🔒 Falcon AIDR now integrates with NVIDIA NeMo Guardrails to provide programmable runtime protections for homegrown AI agents moving into production. The combined solution blocks prompt injection, redacts PII, defangs malicious domains, and moderates unwanted topics while preserving responsive, sub-100ms agent workflows. Teams can leverage 75+ built-in detectors or create custom policies to monitor in report-only mode and then progressively enforce blocks, redactions, encryptions, or transformations.
read more →