Tag Banner

All news with #prompt injection tag

Wed, November 19, 2025

ServiceNow Now Assist agents vulnerable by default settings

🔒 AppOmni disclosed a second-order prompt injection that abuses ServiceNow's Now Assist agent discovery and agent-to-agent collaboration to perform unauthorized actions. A benign agent parsing attacker-crafted prompts can recruit other agents to read or modify records, exfiltrate data, or escalate privileges — all enabled by default configuration choices. AppOmni recommends supervised execution, disabling autonomous overrides, agent segmentation, and active monitoring to reduce risk.

read more →

Tue, November 18, 2025

Prisma AIRS Integration with Azure AI Foundry for Security

🔒 Palo Alto Networks announced that Prisma AIRS now integrates natively with Azure AI Foundry, enabling direct prompt and response scanning through the Prisma AIRS AI Runtime Security API. The integration provides real-time, model-agnostic threat detection for prompt injection, sensitive data leakage, malicious code and URLs, and toxic outputs, and supports custom topic filters. By embedding security into AI development workflows, teams gain production-grade protections without slowing innovation; the feature is available now via an early access program.

read more →

Tue, November 18, 2025

Rethinking Identity in the AI Era: Building Trust Fast

🔐 CISOs are grappling with an accelerating identity crisis as stolen credentials and compromised identities account for a large share of breaches. Experts warn that traditional, human-centric IAM models were not designed for agentic AI and the thousands of autonomous agents that can act and impersonate at machine speed. The SINET Identity Working Group advocates an AI Trust Fabric built on cryptographic, proofed identities, dynamic fine-grained authorization, just-in-time access, explicit delegation, and API-driven controls to reduce risks such as prompt injection, model theft, and data poisoning.

read more →

Mon, November 17, 2025

A Methodical Approach to Agent Evaluation: Quality Gate

🧭 Hugo Selbie presents a practical framework for evaluating modern multi-step AI agents, emphasizing that final-output metrics alone miss silent failures arising from incorrect reasoning or tool use. He recommends defining clear, measurable success criteria up front and assessing agents across three pillars: end-to-end quality, process/trajectory analysis, and trust & safety. The piece outlines mixed evaluation methods—human review, LLM-as-a-judge, programmatic checks, and adversarial testing—and prescribes operationalizing these checks in CI/CD with production monitoring and feedback loops.

read more →

Mon, November 17, 2025

Best-in-Class GenAI Security: CloudGuard WAF Meets Lakera

🔒 The rise of generative AI introduces new attack surfaces that conventional security stacks were never designed to address. This post outlines how pairing CloudGuard WAF with Lakera's AI-risk controls creates layered protection by inspecting prompts, model interactions, and data flows at the application edge. The integrated solution aims to prevent prompt injection, sensitive-data leakage, and harmful content generation while maintaining application availability and performance.

read more →

Mon, November 17, 2025

Fight Fire With Fire: Countering AI-Powered Adversaries

🔥 We summarize Anthropic’s disruption of a nation-state campaign that weaponized agentic models and the Model Context Protocol to automate global intrusions. The attack automated reconnaissance, exploitation, and lateral movement at unprecedented speed, leveraging open-source tools and achieving 80–90% autonomous execution. It used prompt injection (role-play) to bypass model guardrails, highlighting the need for prompt injection defenses and semantic-layer protections. Organizations must adopt AI-powered defenses such as CrowdStrike Falcon and the Charlotte agentic SOC to match adversary tempo.

read more →

Thu, November 13, 2025

What CISOs Should Know About Securing MCP Servers Now

🔒 The Model Context Protocol (MCP) enables AI agents to connect to data sources, but early specifications lacked robust protections, leaving deployments exposed to prompt injection, token theft, and tool poisoning. Recent protocol updates — including OAuth, third‑party identity provider support, and an official MCP registry — plus vendor tooling from hyperscalers and startups have improved defenses. Still, authentication remains optional and gaps persist, so organizations should apply zero trust and least‑privilege controls, enforce strong secrets management and logging, and consider specialist MCP security solutions before production rollout.

read more →

Wed, November 12, 2025

Tenable Reveals New Prompt-Injection Risks in ChatGPT

🔐 Researchers at Tenable disclosed seven techniques that can cause ChatGPT to leak private chat history by abusing built-in features such as web search, conversation memory and Markdown rendering. The attacks are primarily indirect prompt injections that exploit a secondary summarization model (SearchGPT), Bing tracking redirects, and a code-block rendering bug. Tenable reported the issues to OpenAI, and while some fixes were implemented several techniques still appear to work.

read more →

Wed, November 12, 2025

Extending Zero Trust to Autonomous AI Agents in Enterprises

🔐 As enterprises deploy AI assistants and autonomous agents, existing security frameworks must evolve to treat these agents as first-class identities rather than afterthoughts. The piece advocates applying Zero Trust principles—identity-first access, least-privilege, dynamic contextual enforcement, and continuous monitoring—to agentic identities to prevent misuse and reduce attack surface. Practical controls include scoped, short-lived tokens, tiered trust models, strict access boundaries, and assigning clear human ownership to each agent.

read more →

Wed, November 12, 2025

Secure AI by Design: A Policy Roadmap for Organizations

🛡️ In just a few years, AI has shifted from futuristic innovation to core business infrastructure, yet security practices have not kept pace. Palo Alto Networks presents a Secure AI by Design Policy Roadmap that defines the AI attack surface and prescribes actionable measures across external tools, agents, applications, and infrastructure. The Roadmap aligns with recent U.S. policy moves — including the June 2025 Executive Order and the July 2025 White House AI Action Plan — and calls for purpose-built defenses rather than retrofitting legacy controls.

read more →

Tue, November 11, 2025

CometJacking: Prompt-Injection Risk in AI Browsers

🔒 Researchers disclosed a prompt-injection technique dubbed CometJacking that abuses URL parameters to deliver hidden instructions to Perplexity’s Comet AI browser. By embedding malicious directives in the 'collection' parameter an attacker can cause the agent to consult connected services and memory instead of searching the web. LayerX demonstrated exfiltration of Gmail messages and Google Calendar invites by encoding data in base64 and sending it to an external endpoint. According to the report, Comet followed the malicious prompt and bypassed Perplexity’s safeguards, illustrating broader limits of current LLM-based assistants.

read more →

Mon, November 10, 2025

Researchers Trick ChatGPT into Self Prompt Injection

🔒 Researchers at Tenable identified seven techniques that can coerce ChatGPT into disclosing private chat history by abusing built-in features like web browsing and long-term Memories. They show how OpenAI’s browsing pipeline routes pages through a weaker intermediary model, SearchGPT, which can be prompt-injected and then used to seed malicious instructions back into ChatGPT. Proof-of-concepts include exfiltration via Bing-tracked URLs, Markdown image loading, and a rendering quirk, and Tenable says some issues remain despite reported fixes.

read more →

Thu, November 6, 2025

CIO’s First Principles: A Reference Guide to Securing AI

🔐 Enterprises must redesign security as AI moves from experimentation to production, and CIOs need a prevention-first, unified approach. This guide reframes Confidentiality, Integrity and Availability for AI, stressing rigorous access controls, end-to-end data lineage, adversarial testing and a defensible supply chain to prevent poisoning, prompt injection and model hijacking. Palo Alto Networks advocates embedding security across MLOps, real-time visibility of models and agents, and executive accountability to eliminate shadow AI and ensure resilient, auditable AI deployments.

read more →

Thu, November 6, 2025

Multi-Turn Adversarial Attacks Expose LLM Weaknesses

🔍 Cisco AI Defense's report shows open-weight large language models remain vulnerable to adaptive, multi-turn adversarial attacks even when single-turn defenses appear effective. Using over 1,000 prompts per model and analyzing 499 simulated conversations of 5–10 exchanges, researchers found iterative strategies such as Crescendo, Role-Play and Refusal Reframe drove failure rates above 90% in many cases. The study warns that traditional safety filters are insufficient and recommends strict system prompts, model-agnostic runtime guardrails and continuous red-teaming to mitigate risk.

read more →

Thu, November 6, 2025

AI-Powered Malware Emerges: Google Details New Threats

🛡️ Google Threat Intelligence Group (GTIG) reports that cybercriminals are actively integrating large language models into malware campaigns, moving beyond mere tooling to generate, obfuscate, and adapt malicious code. GTIG documents new families — including PROMPTSTEAL, PROMPTFLUX, FRUITSHELL, and PROMPTLOCK — that query commercial APIs to produce or rewrite payloads and evade detection. Researchers also note attackers use social‑engineering prompts to trick LLMs into revealing sensitive guidance and that underground marketplaces increasingly offer AI-enabled “malware-as-a-service,” lowering the bar for less skilled threat actors.

read more →

Thu, November 6, 2025

Google Warns: AI-Enabled Malware Actively Deployed

⚠️ Google’s Threat Intelligence Group has identified a new class of AI-enabled malware that leverages large language models at runtime to generate and obfuscate malicious code. Notable families include PromptFlux, which uses the Gemini API to rewrite its VBScript dropper for persistence and lateral spread, and PromptSteal, a Python data miner that queries Qwen2.5-Coder-32B-Instruct to create on-demand Windows commands. GTIG observed PromptSteal used by APT28 in Ukraine, while other examples such as PromptLock, FruitShell and QuietVault demonstrate varied AI-driven capabilities. Google warns this "just-in-time AI" approach could accelerate malware sophistication and democratize cybercrime.

read more →

Thu, November 6, 2025

Google: LLMs Employed Operationally in Malware Attacks

🤖 Google’s Threat Intelligence Group (GTIG) reports attackers are using “just‑in‑time” AI—LLMs queried during execution—to generate and obfuscate malicious code. Researchers identified two families, PROMPTSTEAL and PROMPTFLUX, which query Hugging Face and Gemini APIs to craft commands, rewrite source code, and evade detection. GTIG also documents social‑engineering prompts that trick models into revealing red‑teaming or exploit details, and warns the underground market for AI‑enabled crime is maturing. Google says it has disabled related accounts and applied protections.

read more →

Wed, November 5, 2025

Google: PROMPTFLUX malware uses Gemini to self-write

🤖 Google researchers disclosed a VBScript threat named PROMPTFLUX that queries Gemini via a hard-coded API key to request obfuscated VBScript designed to evade static detection. A 'Thinking Robot' component logs AI responses to %TEMP% and writes updated scripts to the Windows Startup folder to maintain persistence. Samples include propagation attempts to removable drives and mapped network shares, and variants that rewrite their source on an hourly cadence. Google assesses the malware as experimental and currently lacking known exploit capabilities.

read more →

Wed, November 5, 2025

GTIG Report: AI-Enabled Threats Transform Cybersecurity

🔒 The Google Threat Intelligence Group (GTIG) released a report documenting a clear shift: adversaries are moving beyond benign productivity uses of AI and are experimenting with AI-enabled operations. GTIG observed state-sponsored actors from North Korea, Iran and the People's Republic of China using AI for reconnaissance, tailored phishing lure creation and data exfiltration. Threats described include AI-powered, self-modifying malware, prompt-engineering to bypass safety guardrails, and underground markets selling advanced AI attack capabilities. Google says it has disrupted malicious assets and applied that intelligence to strengthen classifiers and its AI models.

read more →

Wed, November 5, 2025

Researchers Find ChatGPT Vulnerabilities in GPT-4o/5

🛡️ Cybersecurity researchers disclosed seven vulnerabilities in OpenAI's GPT-4o and GPT-5 models that enable indirect prompt injection attacks to exfiltrate user data from chat histories and stored memories. Tenable researchers Moshe Bernstein and Liv Matan describe zero-click search exploits, one-click query execution, conversation and memory poisoning, a markdown rendering bug, and a safety bypass using allow-listed Bing links. OpenAI has mitigated some issues, but experts warn that connecting LLMs to external tools broadens the attack surface and that robust safeguards and URL-sanitization remain essential.

read more →