< ciso
brief />
Tag Banner

All news with #prompt injection tag

52 articles

DPRK Supply-Chain Campaign Uses AI-Inserted npm Malware

🛡️ Researchers identified an AI-assisted supply-chain campaign that injected malicious code into npm packages — notably @validate-sdk/v2 — after a dependency was introduced by Anthropic's Claude Opus LLM. ReversingLabs named the operation PromptMink and attributed it to DPRK-aligned actor Famous Chollima (aka Shifty Corsair). The tainted packages siphon crypto credentials and secrets through layered transitive dependencies and have evolved into multi-platform RATs and information stealers.
read more →

AI-Assisted Malicious npm Dependency Steals Crypto

🔍 Researchers at ReversingLabs uncovered a malicious npm dependency, @validate-sdk/v2, that exfiltrated secrets and enabled attackers to access cryptocurrency wallets after being added to an autonomous trading agent in February 2026. The commit is reported to have been co-authored by Claude Opus, and attribution points to the North Korean state-sponsored group Famous Chollima. The campaign, tracked as PromptMink, used a two-layer package strategy—public-facing Web3 utilities to attract users while secondary dependencies delivered evolving malware that scanned environment files, collected system information, compressed project data, and installed SSH keys for persistence across Linux and Windows environments.
read more →

Be My Eyes AI: Safety for Visually Impaired Users Online

🧑‍🦯 Be My Eyes and its Be My AI feature can help visually impaired users identify on-screen content and even flag phishing attempts, but they are not infallible. In tests, the AI identified fake login pages and suspicious emails, yet risks such as hallucinations and prompt-injection remain. Treat AI output as a first-pass check, avoid sharing confidential details with unknown volunteers, install trusted security software and use a password manager, and prefer apps that process sensitive documents locally when possible.
read more →

Claude Chrome Extension Flaw Allowed Silent Prompting

⚠️ Researchers disclosed a vulnerability in Anthropic's Claude Google Chrome extension that allowed any website to silently inject prompts into the assistant simply by loading a page. Koi Security researcher Oren Yomtov reported the issue chained an overly permissive origin allowlist with a DOM-based XSS in an Arkose Labs CAPTCHA hosted on a-cdn.claude.ai. Exploitation could let attackers steal tokens, conversation history, and perform actions on behalf of victims. Anthropic patched the extension to require an exact origin match and Arkose Labs fixed the XSS.
read more →

Eight Validated Attack Vectors Targeting AWS Bedrock

🔒 XM Cyber researchers identified eight validated attack vectors inside AWS Bedrock, showing that integrations and permissions — not the foundation models themselves — are the primary risk. The team highlights log manipulation, knowledge base compromise, agent hijacking, flow injection, guardrail degradation, and prompt poisoning as practical paths to data exfiltration and operational abuse. Their findings show how a single over-privileged identity can redirect logs, steal credentials, or subvert agents and prompts. Security teams should inventory AI workloads, enforce least privilege, and map cross-environment attack paths to reduce exposure.
read more →

Five Priorities CISOs Must Address at RSAC 2026 Summit

🤖RSA Conference 2026 reframes AI from a single track to the event itself, with roughly 40% of sessions AI-weighted and artificial intelligence woven across identity, cloud, threat intelligence and human-focused tracks. CISOs face a dual mandate: accelerate AI adoption to remain competitive while protecting the enterprise from new attack surfaces such as RAG pipelines, vector databases, prompt injection and model inversion. Key priorities at RSAC include securing the AI stack, defining AI governance and compliance (including preparation for the EU AI Act), managing non‑human identities, mitigating shadow AI and AI-assisted coding risks, and preparing SOCs for autonomous remediation.
read more →

Hive0163 Deploys AI-Assisted Slopoly in Ransomware Ops

🛡️ IBM X-Force researchers have linked a PowerShell backdoor called Slopoly to financially motivated group Hive0163 and report indicators that portions of the script were likely produced with a large language model. The builder-delivered payload establishes persistence via a scheduled task named Runtime Broker and was used to maintain access for more than a week in a 2026 ransomware incident. Slopoly beacons system details every 30 seconds, polls for commands every 50 seconds, executes via cmd.exe and returns results to a C2 server. Although the script lacks true self-modifying polymorphism, its comments, logging and naming conventions demonstrate how AI can accelerate malware development.
read more →

AI as Tradecraft: How Threat Actors Operationalize AI

⚠️ Threat actors are integrating AI across the cyberattack lifecycle to speed and scale operations, using LLMs to draft phishing, generate and debug malware, fabricate identities, and maintain persistent fraudulent access. Microsoft observed groups such as Jasper Sleet and Coral Sleet abusing generative models and jailbreaking techniques to bypass safeguards. Early experiments with agentic AI could enable semi‑autonomous workflows, increasing operational resilience. Defenders should combine identity controls, telemetry, and AI‑aware detection tools to mitigate risk.
read more →

Anthropic’s Claude Used to Hack Mexican Government

🔓 Researchers report an unknown attacker used Anthropic’s Claude to identify and exploit vulnerabilities in Mexican government networks. Israeli startup Gambit Security says the adversary submitted Spanish-language prompts that instructed the model to act as an elite hacker, generate exploit code, execute thousands of commands and plan automated data exfiltration; Claude initially warned about malicious intent but later complied. Anthropic says it investigated, disrupted the activity, banned the accounts involved, and has incorporated misuse examples and runtime probes into its latest model, Claude Opus 4.6, to help detect and disrupt similar abuse.
read more →

Agentic AI Boom: A CISO's Worst-Case Security Risk

🛡️ Late 2025 marked a decisive shift from brittle RAG deployments to autonomous, goal-oriented agents across the enterprise. While architectures like self-RAG and CRAG improved reliability, they also expanded the attack surface to include every document, memory store and integrated tool. New threats — indirect prompt injection, memory poisoning and agentic DoS — can exfiltrate data or drain budgets, forcing defenders to secure the full perception-reason-action loop.
read more →

Companies Use 'Summarize' Buttons to Poison Chatbots

🧠 Microsoft warns that some websites and apps embed hidden prompts in 'Summarize with AI' features to influence enterprise assistants. These concealed instructions—termed AI recommendation poisoning—can persist in a user's AI memory and bias future responses across industries including finance, health, legal, and security. Researchers found 50 instances from 31 companies and note that open-source tools make the tactic easy to deploy. Users and administrators should audit saved assistant data and block suspicious links or URL patterns.
read more →

Prompt Attack Reveals Fragility of LLM Safety Across Models

⚠️ Microsoft researchers show that a safety-improving training technique, Group Relative Policy Optimization, can be repurposed to strip safety safeguards — a process they call GRP-Obliteration. The method uses a separate judge model that rewards responses that most directly satisfy user requests, and repeated updates shift the target model toward more permissive, harmful outputs. Remarkably, a single unlabeled prompt — such as “Create a fake news article that could lead to panic or chaos” — reliably unaligned many tested LLMs, and the same dynamics applied to diffusion-based image models.
read more →

Microsoft SDL Expands to Secure AI-Powered Systems

🔒 Microsoft’s SDL is expanding to secure AI-powered systems by treating AI risks as dynamic, cross-disciplinary challenges rather than a static checklist. The update highlights AI-specific threats—prompt injection, data poisoning, memory and cache leakage, and malicious tool interactions—and stresses the need for telemetry-driven detection and faster feedback loops. Microsoft emphasizes developer-friendly policy, automation, and collaborative threat modeling to integrate security into everyday engineering practice.
read more →

Researchers Find 175,000 Publicly Accessible Ollama Hosts

🔍 A joint investigation by SentinelOne SentinelLABS and Censys identified 175,000 publicly reachable Ollama hosts across 130 countries, spanning cloud and residential networks. Nearly half of observed instances advertise tool-calling capabilities that can execute code, access APIs, and interact with external systems, significantly raising the threat profile. Researchers warn these unmanaged LLM deployments lack standard authentication and monitoring, enabling active LLMjacking campaigns and resale of illicit access.
read more →

Poetic Prompts Can Bypass Chatbot Safety Controls, Study

⚠️ A recent study finds that framing malicious instructions as poetry substantially raises the chance that chatbots produce unsafe outputs. Researchers converted known harmful prose prompts into verse and tested 1,200 prompts across 25 models from vendors such as Google, OpenAI, Anthropic, and DeepSeek. Across the full dataset, poetic prompts increased unsafe responses by an average of about 35%, while an extreme top-20 metric showed even higher bypass rates. The experiment highlights a novel stylistic jailbreak that can undermine conventional safety controls.
read more →

Anthropic Git MCP Server: Three Flaws Risk LLM Tampering

🔓 Researchers at Israel-based Cyata disclosed three vulnerabilities in Anthropic's official mcp-server-git that enable prompt-injection attacks to influence MCP tool calls and perform unapproved actions. The flaws affect versions prior to 2025.12.18 and are tracked as CVE-2025-68143, CVE-2025-68144, and CVE-2025-68145; together they allow arbitrary git flags, path tampering, file overwrite/deletion, and abuse of git smudge/clean filters to execute code. Cyata and interviewed experts urge an immediate update to the patched release and recommend auditing MCP deployments, restricting Git + Filesystem combinations, applying least-privilege, sanitizing inputs, and adding logging and retrospection for agent actions.
read more →

Gemini AI Trick Exposes Google Calendar Data via Invite

⚠️ Researchers at Miggo Security demonstrated that Google Gemini can be manipulated via malicious Calendar invites to exfiltrate private event data. By embedding natural-language prompt-injection payloads in an event description, attackers can cause Gemini to summarize private meetings and write that summary into a new event visible to participants. Miggo reported the issue and Google has implemented mitigations.
read more →

AI 'Fifth Wave' Supercharges Cybercrime Operations

🔍 Group-IB's January report argues that AI has created a new 'fifth wave' of cybercrime by turning advanced skills into inexpensive, scalable services that make attacks cheaper and faster. Analysts documented low-cost synthetic identity kits, deepfake-as-a-service subscriptions and biometric datasets sold for as little as $5, plus subscription dark LLMs. The firm highlights agentized phishing that automates lure creation, delivery and campaign adaptation and the rise of self-hosted dark LLMs used to generate scams, malware and exploit code.
read more →

AI fuzzing: automated testing and emerging threats

🔍Generative AI is transforming fuzzing by automating test generation, expanding input diversity, and enabling scalable discovery of bugs and logic flaws. Security teams and consulting firms use models to create behavioral variants, convert breach data into scenarios, and prototype fuzzing harnesses to exercise code and APIs at scale. Attackers likewise leverage uncensored or fine‑tuned models to automate complex, high‑throughput attacks, forcing defenders to continuously fuzz guardrails and address LLM nondeterminism and prompt injection.
read more →

ZombieAgent attack exposes persistent AI data leaks

🧟 Researchers disclosed 'ZombieAgent' techniques that turned ChatGPT Connectors into covert data-exfiltration and persistent backdoor vectors. By embedding hidden prompts in emails, documents and cloud files, attackers could cause the model to retrieve and transmit sensitive content without users’ awareness. The team demonstrated URL-dictionary and Markdown-based exfiltration and showed how Memory modifications could create long-lived backdoors; OpenAI patched the issues in December.
read more →