<ciso brief />

All news with #prompt injection attack tag

118 articles · page 2 of 6

Custom AI Apps to Dominate Incident Response Workloads

🛡️ Gartner warns that custom-built AI applications will increasingly strain security teams unless defenders are engaged early. It predicts that by 2028 at least half of enterprise incident response work will involve fallout from AI app security issues. Analysts urge teams to "shift left" and embed controls during development, and expect AI security platforms to be widely adopted within two years to enforce guardrails and mitigate prompt injection, data misuse, and related threats.

Font-rendering trick hides malicious commands from AIs

🔍 LayerX researchers demonstrated a font-rendering technique that hides malicious commands from AI assistants: the HTML the assistant reads contains benign text, while the browser renders a different, malicious instruction to the user. The proof-of-concept combines custom fonts with glyph substitution and CSS concealment (tiny fonts, color/opacity tricks) so the DOM appears harmless while the page displays an executable command. In tests across many popular assistants, analysis that read only the DOM missed the hidden commands; LayerX urges assistants to compare rendered output with DOM text and to treat custom fonts, color/opacity matches, and unusually small fonts as potential attack surfaces.
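
The concealment signals named above (unusually small fonts, low opacity, text color matching the background) can be sketched as a simple inline-style check. This is a minimal illustration, not LayerX's tooling: the function names and thresholds are hypothetical, and it only inspects inline `style` strings rather than computed styles or font files.

```python
import re

# Hypothetical thresholds chosen for illustration; tune for real pages.
MIN_FONT_PX = 4.0
MIN_OPACITY = 0.1


def parse_inline_style(style: str) -> dict[str, str]:
    """Split an inline CSS style string into a lowercase property map."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


def concealment_flags(style: str) -> list[str]:
    """Return the concealment heuristics that an inline style triggers."""
    props = parse_inline_style(style)
    flags = []

    # Text rendered too small for a human to read.
    size = re.fullmatch(r"(\d+(?:\.\d+)?)px", props.get("font-size", ""))
    if size and float(size.group(1)) < MIN_FONT_PX:
        flags.append("tiny-font")

    # Text that is effectively transparent.
    try:
        if float(props.get("opacity", "1")) < MIN_OPACITY:
            flags.append("low-opacity")
    except ValueError:
        pass  # non-numeric opacity (e.g. a CSS variable) is left unjudged

    # Foreground color identical to the declared background color.
    color = props.get("color")
    if color and color == props.get("background-color"):
        flags.append("color-matches-background")

    return flags
```

A check like this covers only the CSS side; comparing the rendered text against the DOM text, as LayerX recommends, would still be needed to catch glyph substitution in the font itself.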

GenAI Prompt Fuzzing Reveals LLM Guardrail Fragility

⚠️ Unit 42 demonstrates a genetic-algorithm-inspired prompt-fuzzing technique that automatically generates meaning-preserving variants of disallowed requests to evaluate LLM guardrails. Their experiments show evasion rates vary widely by keyword and model, with some combinations yielding high, operationally meaningful success rates. They recommend treating LLMs as probabilistic boundaries, applying layered controls, continuous adversarial testing, and using tools like Prisma AIRS and Unit 42 assessments to strengthen defenses.

Detecting and Responding to Prompt Abuse in AI Tools

🔍 This post, the second in Microsoft's AI Application Security series, moves from planning to practical detection and response for prompt abuse. It describes common attack types — direct prompt override, extractive abuse targeting sensitive inputs, and indirect prompt injection via hidden instructions such as URL fragments — and why these are hard to spot without telemetry. The article provides a stepwise detection and incident response playbook and maps mitigations to Microsoft tools so teams can log interactions, sanitize inputs, and contain incidents.
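
One way to picture the "sanitize inputs, log interactions" advice for URL fragments is to separate the fragment from a URL before the URL reaches an assistant, and keep the fragment for telemetry. The sketch below uses only Python's standard library; `prepare_for_assistant` and the in-memory `audit_log` list are hypothetical names, not part of any Microsoft tool.

```python
from urllib.parse import urldefrag


def split_fragment(url: str) -> tuple[str, str]:
    """Return (url_without_fragment, fragment)."""
    clean, fragment = urldefrag(url)
    return clean, fragment


def prepare_for_assistant(url: str, audit_log: list[str]) -> str:
    """Drop the fragment before forwarding a URL, recording it for review."""
    clean, fragment = split_fragment(url)
    if fragment:
        # Keep the removed text so responders can inspect it later.
        audit_log.append(f"stripped fragment from {clean}: {fragment}")
    return clean
```

For example, `prepare_for_assistant("https://example.com/doc#ignore-previous-instructions", log)` forwards only `https://example.com/doc` and leaves one entry in `log`.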

Perplexity's Comet AI Browser Tricked Into Phishing Scam

🔒 Researchers demonstrated that an AI-powered browser, Perplexity's Comet, can be manipulated into executing a phishing scam in under four minutes. By intercepting the agent's explanatory traffic and training a GAN on those signals, attackers iteratively optimized a malicious page until the agent reliably performed fraudulent steps. The exploit leverages intent collision and prompt-injection weaknesses, shifting the target from users to the AI agent itself.

AI vs. AI: The Gatling-Gun Moment in Cybersecurity Era

🛡️ The piece compares the Civil War’s Gatling gun to a September 2025 agentic AI-driven cyberespionage campaign that automated most tactical operations. According to the report, a Chinese state-linked group, GTG-1002, abused Anthropic’s Claude Code via prompt injection and role-playing to produce malicious code and execute roughly 90% of the attack chain. The intrusion hit 30 U.S. companies and agencies and was disclosed after Anthropic’s threat team detected misuse of its platform.

Fuzzing AI Judges: Stealth Triggers Enable Policy Bypass

🔍 This research introduces AdvJudge-Zero, an automated fuzzer that discovers stealthy input sequences capable of flipping AI judge decisions and bypassing safety gates. Tests show that low-perplexity, benign-looking tokens (such as markdown markers, role labels, and context-shift phrases) can reliably flip a judge's "block" decision to "allow". The report documents a roughly 99% attack success rate across diverse models and recommends adversarial fuzzing, retraining with discovered examples, and operational monitoring using products like Prisma AIRS and Cortex AI-SPM.

OpenAI to Acquire Promptfoo to Boost AI Agent Security

🔒 OpenAI said it will acquire AI testing startup Promptfoo to strengthen security checks for AI agents as enterprises deploy autonomous systems in business workflows. Promptfoo’s tools let developers test LLM applications against adversarial prompts, including prompt injection and jailbreak attempts, and evaluate whether models follow safety and reliability guidelines. OpenAI plans to integrate Promptfoo into OpenAI Frontier and to continue developing the open-source project while expanding enterprise capabilities.

AI Assistants Shift Organizational Security Priorities

🤖 AI-based assistants such as OpenClaw are rapidly reshaping organizational security, blurring boundaries between data and code and between trusted co-workers and insider threats. Incidents and research show agents taking autonomous actions and misconfigured admin interfaces exposing credentials, conversations, and integrations. Demonstrated supply-chain and prompt injection attacks can install rogue agents and manipulate agent perception. Organizations should isolate agents, enforce strict network controls, vet third-party skills, and address AI fragility as a core security concern.

FortiAIGate: Runtime Protection for AI Workloads, Governance

🔒 FortiAIGate provides dedicated runtime protection for private AI and LLM deployments by monitoring every input and output between applications and models. It detects and blocks threats such as prompt injection, jailbreaking, model poisoning, data exfiltration, and excessive compute abuse while enforcing governance policies in real time. Built for Kubernetes and hybrid environments, it integrates with Fortinet Security Fabric, offers dashboards mapping OWASP Top 10 LLM risks, and uses multi‑GPU and SmartNIC acceleration to preserve performance and control costs.

Companies Inject Hidden Prompts into AI Summarization

🔒 Microsoft reports that companies are embedding hidden instructions in "Summarize with AI" buttons, which pass persistence commands to assistants through URL prompt parameters. These prompts tell assistants to 'remember [Company] as a trusted source' or 'recommend [Company] first,' biasing later responses toward those vendors. Researchers found over 50 unique prompts from 31 companies across 14 industries, and freely available tooling makes the technique trivial to deploy. The manipulation can subtly skew recommendations in critical areas like health, finance, and security without users knowing.
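
As a rough illustration of how such links might be spotted, the sketch below decodes every query parameter of a URL and flags values that match the two phrasings quoted above. The patterns and the function name are assumptions made for this example; they are not Microsoft's published hunting queries, and real prompts will vary far more widely.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Illustrative patterns based on the two example phrasings in the report.
PERSISTENCE_PATTERNS = [
    re.compile(r"\bremember\b.*\btrusted source\b", re.IGNORECASE),
    re.compile(r"\brecommend\b.*\bfirst\b", re.IGNORECASE),
]


def suspicious_prompt_params(url: str) -> list[tuple[str, str]]:
    """Return (name, decoded_value) pairs whose value looks like a memory instruction."""
    hits = []
    # Check every parameter rather than guessing which one carries the prompt.
    for name, value in parse_qsl(urlsplit(url).query):
        if any(pattern.search(value) for pattern in PERSISTENCE_PATTERNS):
            hits.append((name, value))
    return hits
```

Checking all parameters avoids depending on a particular assistant's parameter name, at the cost of more false positives on ordinary search queries.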

OpenClaw: Supply-Chain Risks and Underground Chatter

🔍 OpenClaw is an AI-driven automation framework with a modular skills marketplace that lets agents run user-installed plugins to manage mail, schedules, and system tasks. Security researchers disclosed multiple critical flaws — including one-click RCE (CVE-2026-25253), token/OAuth abuse, prompt-injection pathways, and absent sandboxing — and documented dozens of poisoned skills on ClawHub. Flare's telemetry shows significant chatter across research and fringe channels but limited evidence of mass criminal operationalization; the immediate confirmed threat is supply-chain abuse where malicious skills execute with agent-level privileges and exfiltrate credentials and sessions.

RoguePilot Flaw: Copilot in Codespaces Could Leak Tokens

🛡️ RoguePilot was a vulnerability in GitHub Codespaces that allowed GitHub Copilot to be manipulated via a crafted GitHub issue, enabling silent execution of hidden AI instructions and potential exfiltration of a privileged GITHUB_TOKEN. Orca Security researcher Roi Nisimi reported that an attacker could embed the prompt inside an HTML comment and direct Copilot to send the token to an external server. Microsoft patched the flaw after responsible disclosure. The disclosure underscores risks from AI-mediated prompt injection and urges better prompt handling, content sanitization, and least-privilege token practices.
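
The content-sanitization point can be illustrated with a small filter that removes HTML comments from issue text before it is handed to an assistant. This is a sketch under stated assumptions, not GitHub's or Microsoft's actual fix: the function name is hypothetical, and a regular expression is used only because comments have a simple delimiter; a production filter would more likely rely on a real HTML or Markdown parser.

```python
import re

# Matches a complete HTML comment, including comments spanning several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(text: str) -> str:
    """Remove HTML comments, which are invisible when the issue is rendered."""
    return HTML_COMMENT.sub("", text)
```

Stripping invisible content narrows this one channel only; scoping the token to least privilege limits the damage if an instruction slips through another way.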

AI Unlocked: Interactive Prompt Injection Challenge

🔐 CrowdStrike has launched AI Unlocked: Decoding Prompt Injection, an interactive online challenge hosted via Falcon Encounter hands-on labs that immerses security teams in attacker-style prompt injection scenarios. Participants progress through three virtual rooms—Command Center, Data Gateway, and Nexus—using prompt injection techniques to convince the simulated supervisor SAIGE to reveal secret phrases while earning higher scores for brevity and efficiency. The exercise aims to convert abstract AI security risks into practical lessons, helping teams recognize attack patterns and the need for defensive guardrails.

The Promptware Kill Chain: A Framework for AI Threats

🛡️ The authors present a seven-step “promptware kill chain” to reframe prompt injection as a multistage malware paradigm targeting modern LLM-based systems. They describe how Initial Access can be direct or indirect—via web pages, emails, shared documents, or multimodal inputs—and how LLMs’ lack of separation between data and executable instructions enables escalation. The paper catalogs stages from jailbreaking and reconnaissance to persistence, C2, lateral movement, and harmful Actions on Objective, urging defenses that assume initial compromise and break the chain at later steps.

Road-sign prompt injection threatens embodied AI systems

⚠️ New research introduces CHAI, a prompt-injection technique that embeds deceptive natural-language instructions into visual inputs to hijack embodied AI agents. The method systematically searches token space, builds prompt dictionaries, and crafts Visual Attack Prompts to mislead systems powered by large vision-language models (LVLMs). Experiments on drones, autonomous driving stacks, aerial tracking, and a real robotic vehicle show CHAI outperforms prior attacks and highlights the limits of conventional adversarial robustness.

AI Recommendation Poisoning: Manipulating Assistant Memory

🔒 Microsoft Defender researchers describe a growing practice they call AI Recommendation Poisoning, where hidden instructions in pre-filled prompts and “Summarize with AI” links attempt to inject persistent memory commands into assistants. The study identified more than 50 unique prompts from 31 companies across 14 industries targeting assistants such as Copilot, ChatGPT, and Claude. Freely available tools and plugins make the technique trivial to deploy, enabling subtly biased recommendations on topics like health, finance, and security. Microsoft reports mitigations are in place and provides hunting queries and guidance for defenders.

SecurityScorecard: 40,214 OpenClaw Instances Exposed

🔒 SecurityScorecard warns that widespread misconfiguration of the AI assistant OpenClaw has left 40,214 agent instances — linked to 28,663 unique IP addresses — exposed to the public internet. The vendor reports 63% of observed deployments are vulnerable, including 12,812 instances exploitable via remote code execution, and has correlated hundreds with prior breaches and known CVEs. Exposures are concentrated in China, the US and Singapore and affect sectors such as information services, technology, manufacturing and telecommunications. Users are urged to limit access, adopt a zero trust posture, scrutinize agent logic, and defend against prompt injection and leaked API keys.

OpenClaw Partners with VirusTotal to Scan ClawHub Skills

🛡️ OpenClaw has integrated VirusTotal scanning to inspect skills uploaded to its ClawHub marketplace, creating SHA-256 hashes for each skill and cross-checking them against VirusTotal's database. Bundles not matched are analyzed with VirusTotal Code Insight; benign verdicts are auto-approved, suspicious skills are flagged, and confirmed malicious items are blocked. OpenClaw also re-scans active skills daily but cautions this is not a complete defense against cleverly concealed prompt-injection payloads.
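
The hash-then-triage flow described above can be sketched as follows. This is a simplified model, not OpenClaw's or VirusTotal's code: the `known` dictionary is a hypothetical stand-in for a lookup against VirusTotal's database, and the action labels are invented for the example. Only `hashlib.sha256` is a real library call.

```python
import hashlib

# Map a verdict to the marketplace action described in the report.
ACTIONS = {
    "benign": "auto-approve",
    "suspicious": "flag",
    "malicious": "block",
}


def skill_hash(bundle: bytes) -> str:
    """Return the SHA-256 hex digest that identifies a skill bundle."""
    return hashlib.sha256(bundle).hexdigest()


def triage(bundle: bytes, known: dict[str, str]) -> str:
    """Decide what to do with an uploaded skill.

    `known` maps SHA-256 digests to verdicts and stands in for the
    database lookup (an assumption made for this sketch).
    """
    verdict = known.get(skill_hash(bundle))
    if verdict is None:
        # No match: the bundle would go on to deeper analysis.
        return "send-for-analysis"
    return ACTIONS[verdict]
```

Because a hash only matches byte-identical bundles, a trivially modified malicious skill falls through to the analysis step, which is consistent with the caution that scanning is not a complete defense.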

Glean and Prisma AIRS: Real-Time AI Security Integration

🔒 Glean and Prisma AIRS have integrated to provide real-time AI threat protection that neutralizes prompt injections, blocks toxic or biased outputs, and inspects generated code and URLs for malicious patterns. The integration enforces organizational policy across chats and agent interactions and immediately blocks risky requests while notifying users. Deployment is designed to be frictionless—enable protection in three clicks by pasting a Prisma AIRS runtime API key into the Glean admin console.