< ciso
brief />
Tag Banner

All news with #llm security tag

250 articles · page 4 of 13

Agentic AI Security: Assessing Risks and Defenses Now

🛡️ Organizations are adopting agentic AI—autonomous, task-driven systems powered by LLMs—to streamline processes and boost throughput. These agents can plan, act, and iterate, but their non-deterministic behavior creates gaps in traceability, auditability, and access control. Apply strong role-based access, threat modeling, and oversight (human or independent evaluators) to limit exposure and ensure safe deployment.
read more →

Fuzzing AI Judges: Stealth Triggers Enable Policy Bypass

🔍 This research introduces AdvJudge-Zero, an automated fuzzer that discovers stealthy input sequences capable of flipping AI judge decisions and bypassing safety gates. Tests show low-perplexity, benign-looking tokens—such as markdown markers, role labels, and context-shift phrases—can reliably convert block outcomes into allows. The report documents a roughly 99% attack success rate across diverse models and recommends adversarial fuzzing, retraining with discovered examples, and operational monitoring using products like Prisma AIRS and Cortex AI-SPM.
read more →

WEF Global Cybersecurity Outlook 2026: CISO Takeaways

🤖 The World Economic Forum’s Global Cybersecurity Outlook 2026 warns that AI is accelerating the cyber arms race: 94% of leaders expect it to be the top change driver and 87% say AI vulnerabilities are the fastest‑growing risk. The report notes organizations are improving AI tool security evaluation (from 37% to 64%), yet CEOs and CISOs display different risk priorities. It also highlights widening resilience gaps across organization sizes and calls for harmonized regulation and stronger public‑private collaboration.
read more →

AI as Tradecraft: How Threat Actors Operationalize AI

⚠️ Threat actors are integrating AI across the cyberattack lifecycle to speed and scale operations, using LLMs to draft phishing, generate and debug malware, fabricate identities, and maintain persistent fraudulent access. Microsoft observed groups such as Jasper Sleet and Coral Sleet abusing generative models and jailbreaking techniques to bypass safeguards. Early experiments with agentic AI could enable semi‑autonomous workflows, increasing operational resilience. Defenders should combine identity controls, telemetry, and AI‑aware detection tools to mitigate risk.
read more →

Malicious AI Assistant Extensions Harvest LLM Data

🔒 Microsoft Defender investigated malicious Chromium browser extensions that impersonated legitimate AI assistant tools to collect LLM chat histories and browsing telemetry. Distributed via the Chrome Web Store and compatible with both Google Chrome and Microsoft Edge, the extensions captured full URLs and chat snippets from platforms such as ChatGPT and DeepSeek, reaching roughly 900,000 installs and activity in over 20,000 enterprise tenants. Microsoft provides detections, hunting queries, and mitigation guidance to contain exposure and remediate affected devices.
read more →

BMW and Google Cloud Build Automated SLM Optimization

🚗 BMW Group and Google Cloud present a proof-of-concept pipeline to compress, fine-tune, evaluate, and deploy domain-specific small language models (SLMs) for in-vehicle voice commands. They position SLMs as a practical compromise between full cloud-based LLMs and constrained onboard hardware, reducing latency and network dependence. Using Vertex AI Pipelines, the automated workflow explores quantization, pruning, distillation, LoRA fine-tuning, and RL-based alignment, and validates models on Android/AOSP head-unit environments. The team publishes the pipeline code to encourage reuse and reproducible experimentation.
read more →

FortiAIGate: Runtime Protection for AI Workloads, Governance

🔒 FortiAIGate provides dedicated runtime protection for private AI and LLM deployments by monitoring every input and output between applications and models. It detects and blocks threats such as prompt injection, jailbreaking, model poisoning, data exfiltration, and excessive compute abuse while enforcing governance policies in real time. Built for Kubernetes and hybrid environments, it integrates with Fortinet Security Fabric, offers dashboards mapping OWASP Top 10 LLM risks, and uses multi‑GPU and SmartNIC acceleration to preserve performance and control costs.
read more →

Fooling AI Agents: Web-Based Indirect Prompt Injection

⚠️ Unit 42 researchers describe web-based indirect prompt injection (IDPI), where adversaries embed hidden or obfuscated instructions in webpages that are later consumed by LLMs and agentic systems. The report catalogs 22 payload engineering techniques, presents a taxonomy of attacker intents from low to critical, and details multiple in-the-wild detections, including the first observed AI ad-review bypass. It emphasizes detection, intent analysis and web-scale defenses to protect automated pipelines.
read more →

Cloudy LLM Explanations Expand across Cloudflare One

☁️ Cloudflare’s new Cloudy layer uses LLMs to translate complex security telemetry into concise, human-readable guidance inside Cloudflare One. It generates plain-language explanations for Email Security detections and structured Risk + Guidance summaries for CASB findings to help teams act faster. Phishnet reporting will surface real-time Cloudy summaries via Workers AI to reduce SOC noise and guide end users. Microsoft beta starts soon, with wider rollouts and Google Workspace support planned.
read more →

LLMs Close the Invisible Phishing Detection Gap at Scale

🔍 Cloudflare integrated Large Language Models (LLMs) into its email security pipeline to surface previously invisible phishing behaviors and move from reactive to proactive defense. LLMs tag messages with granular categories such as Sales Outreach and PrizeNotification, providing high-fidelity, near-real-time signals for analysts. From those tags Cloudflare curated targeted corpora, extracted sentiment and intent features, and trained specialized classifiers that emit risk scores. Those scores are combined with reputation and link signals to enforce blocking or quarantine, reducing user-reported misses and accelerating updates.
read more →

LLM-Assisted Deanonymization: Practical Risks Revealed

🔎 A new study demonstrates that large language models can reliably deanonymize users from a handful of anonymous posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, LLM agents infer location, occupation, and interests and then search the web to find likely identities. The researchers report high precision results that scale to tens of thousands of candidates, showing that automated deanonymization is now practical and widely feasible.
read more →

Making LLMs a Defensive Advantage Without Added Risk

🔐 Large language models (LLMs) are reshaping security operations as productivity tools, embedded components and attacker targets. The article argues organizations should treat LLMs as high-impact systems: define outcomes, model threats and assume models can be wrong or manipulated. Early deployments should focus on narrow, advisory workflows (for example, alert triage, investigation copilots and detection engineering) and always treat model output as untrusted. Practical controls include retrieval-augmented generation, scoped credentials and human-gated actions to limit the model's blast radius.
read more →

Adapting Threat Modeling for AI Applications at Scale

🛡️ The Microsoft Security Blog explains why threat modeling must be retooled for AI systems, noting that probabilistic behavior and complex input spaces require reasoning about ranges of likely outcomes rather than single execution paths. It identifies three core drivers — nondeterminism, instruction‑following bias, and system expansion through tools and memory — which widen attack surfaces and surface human‑centered risks like erosion of trust. The post advises starting from assets, mapping untrusted inputs, setting clear 'never do' boundaries, and embedding architectural mitigations, observability, and response plans to limit blast radius and sustain trust.
read more →

CrowdStrike: AI Drives Faster Network Breakouts in 2025

⚠️ CrowdStrike's latest Global Threat Report finds that in 2025 attackers required an average of just 29 minutes to gain full network access, a roughly 65% acceleration from the prior year. The fastest measured breakout dropped to 27 seconds, and some intrusions began exfiltrating data within four minutes of initial access. Researchers link the shift to a steep rise in AI-assisted operations — attackers using AI grew 89% — citing examples such as the LLM-based malware Lamehug, AI-generated credential-extraction scripts, and AI-crafted identities used for insider-style campaigns. Adam Meyers warns defenders must be faster than attackers as AI compresses the window between intent and execution.
read more →

LLMs Produce Highly Predictable, Reused Passwords at Scale

🔒 Bruce Schneier highlights an Irregular.com analysis showing that large language models produce highly patterned, nonrandom passwords. In 50 attempts, Claude generated only 30 unique strings; many began with an uppercase G followed by 7, certain characters and symbols dominated, and the model avoided repeating characters and the asterisk. One password appeared 18 times (36% of trials), demonstrating severe predictability. Schneier warns this is a practical problem for autonomous agents that create accounts and for broader authentication practices.
read more →

Claude Code Flaws Enable Remote Execution and Key Theft

⚠️ Check Point Research disclosed multiple critical vulnerabilities in Anthropic's Claude Code that can enable remote code execution and exfiltration of API credentials when users open untrusted repositories. The issues involve project hooks, the Model Context Protocol, and environment variables that may trigger arbitrary shell commands and redirect authenticated API traffic. Anthropic released patches; administrators should update promptly, avoid opening untrusted projects, and rotate any keys that may have been exposed.
read more →

Anthropic’s Claude Code Security Sparks Industry Debate

🛡️ Anthropic launched a limited research preview of Claude Code Security, triggering sharp market moves as stocks of major cybersecurity vendors dropped. The tool claims to reason about code like a human, trace data flows, find complex vulnerabilities, and suggest targeted patches that appear in a review dashboard with confidence ratings. Anthropic says every finding undergoes a multi-stage verification and requires human approval, but experts warn about outsourcing critical security judgments to an evolving model and highlight risks from hallucinations, asymmetric attacker advantage, and single points of trust.
read more →

Exposed LLM Endpoints Increase Attack Surface and Risk

🔐 Modern LLM deployments expand rapidly, and each new endpoint increases the attack surface, often with implicit trust and excessive permissions. Internal APIs, long-lived tokens and misconfigurations frequently expose endpoints that act as pivot points to databases, tools and cloud services. Organizations should apply least-privilege, just-in-time access and automated secrets rotation to limit damage. Solutions like Keeper help implement endpoint privilege management.
read more →

Compromised npm Package Silently Installs OpenClaw Agent

⚠️ Researchers discovered that a compromised npm publish token allowed an attacker to push a modified release of the widely used Cline CLI that added a malicious postinstall script to fetch and run the AI agent OpenClaw. Aside from that new script, package contents and the CLI binary matched the legitimate prior release, making the change easy to miss. The malicious publish was live on the registry for about eight hours on February 17 before it was deprecated and corrected; developers who installed during that window are advised to update Cline and remove OpenClaw if it was not intentionally installed.
read more →

PromptSpy: First Android Malware Using Generative AI

🛡️ ESET researcher Lukas Stefanko has identified PromptSpy, the first known Android malware to call a generative AI model at runtime, leveraging Google's Gemini to adapt persistence on different devices. The malware submits an XML dump of the current UI plus a chat prompt to Gemini, receives JSON-formatted instructions, and uses the Accessibility Service to pin the app in Recent Apps in a loop until confirmed. Its primary payload is a VNC-based spyware module that can capture PINs, record unlock patterns and screen activity, take screenshots, and report foreground apps. To block removal it overlays invisible UI elements over uninstall or permission controls; victims must reboot into Safe Mode to remove it.
read more →