< ciso
brief />
Tag Banner

All news with #llm security tag

249 articles · page 13 of 13

Cloudflare's Edge-Optimized LLM Inference Engine at Scale

⚡ Infire is Cloudflare’s new, Rust-based LLM inference engine built to run large models efficiently across a globally distributed, low-latency network. It replaces Python-based vLLM in scenarios where sandboxing and dynamic co-hosting caused high CPU overhead and reduced GPU utilization, using JIT-compiled CUDA kernels, paged KV caching, and fine-grained CUDA graphs to cut startup and runtime cost. Early benchmarks show up to 7% lower latency on H100 NVL hardware, substantially higher GPU utilization, and far lower CPU load while powering models such as Llama 3.1 8B in Workers AI.
read more →

LLMs Remain Vulnerable to Malicious Prompt Injection Attacks

🛡️ A recent proof-of-concept by Bargury demonstrates a practical and stealthy prompt injection that leverages a poisoned document stored in a victim's Google Drive. The attacker hides a 300-word instruction in near-invisible white, size-one text that tells an LLM to search Drive for API keys and exfiltrate them via a crafted Markdown URL. Schneier warns this technique shows how agentic AI systems exposed to untrusted inputs remain fundamentally insecure, and that current defenses are inadequate against such adversarial inputs.
read more →

Block Unsafe LLM Prompts with Firewall for AI at the Edge

🛡️ Cloudflare has integrated unsafe content moderation into Firewall for AI, using Llama Guard 3 to detect and block harmful prompts in real time at the network edge. The model-agnostic filter identifies categories including hate, violence, sexual content, criminal planning, and self-harm, and lets teams block or log flagged prompts without changing application code. Detection runs on Workers AI across Cloudflare's GPU fleet with a 2-second analysis cutoff, and logs record categories but not raw prompt text. The feature is available in beta to existing customers.
read more →

Logit-Gap Steering Reveals Limits of LLM Alignment

⚠️ Unit 42 researchers Tony Li and Hongliang Liu introduce Logit-Gap Steering, a new framework that exposes how alignment training produces a measurable refusal-affirmation logit gap rather than eliminating harmful outputs. Their paper demonstrates efficient short-path suffix jailbreaks that achieved high success rates on open-source models including Qwen, LLaMA, Gemma and the recently released gpt-oss-20b. The findings argue that internal alignment alone is insufficient and recommend a defense-in-depth approach with external safeguards and content filters.
read more →

Amazon Neptune integrates with Cognee for GenAI memory

🧠 Amazon Neptune now integrates with Cognee to provide graph-native memory for agentic generative AI applications. The integration enables developers to use Amazon Neptune Analytics as the persistent graph and vector store behind Cognee’s memory layer, supporting large-scale memory graphs, long-term memory, and multi-hop reasoning. Hybrid retrieval across graph, vector, and keyword modalities helps agents deliver more personalized, cost-efficient, and context-aware experiences; documentation and a sample notebook are available to accelerate adoption.
read more →

The Brain Behind Next-Generation Cyber Attacks and AI Risks

🧠 Researchers at Carnegie Mellon University demonstrated that leading large language models (LLMs), by themselves, struggle to execute complex, multi-host cyber-attacks end-to-end, frequently wandering off-task or returning incorrect parameters. Their proposed solution, Incalmo, is a structured abstraction layer that constrains planning to a precise set of actions and validated parameters, substantially improving completion and coordination. The work highlights both enhanced offensive potential when LLMs are scaffolded and urgent defensive challenges for security teams.
read more →

AI-Assisted Coding: Productivity Gains and Persistent Risks

🛠️ Martin Lee recounts a weekend experiment using an AI agent to assist with a personal software project. The model provided valuable architectural guidance, flawless boilerplate, and resolved a tricky threading issue, delivering a clear productivity lift. However, generated code failed to match real library APIs, used incorrect parameters and fictional functions, and lacked sufficient input validation. After manual debugging Lee produced a working but not security-hardened prototype, highlighting remaining risks.
read more →

Microsoft announces Phishing Triage Agent public preview

🛡️The Phishing Triage Agent is now in Public Preview and automates triage of user-reported suspicious emails within Microsoft Defender. Using large language models, it evaluates message semantics, inspects URLs and attachments, and detects intent to classify submissions—typically within 15 minutes—automatically resolving the bulk of false positives. Analysts receive natural‑language explanations and a visual decision map for each verdict, can provide plain‑language feedback to refine behavior, and retain control via role‑based access and least‑privilege configuration.
read more →

Defending Against Indirect Prompt Injection in LLMs

🔒 Microsoft outlines a layered defense-in-depth strategy to protect systems using LLMs from indirect prompt injection attacks. The approach pairs preventative controls such as hardened system prompts and Spotlighting (delimiting, datamarking, encoding) to isolate untrusted inputs with detection via Microsoft Prompt Shields, surfaced through Azure AI Content Safety and integrated with Defender for Cloud. Impact mitigation uses deterministic controls — fine-grained permissions, Microsoft Purview sensitivity labels, DLP policies, explicit user consent workflows, and blocking known exfiltration techniques — while ongoing research (TaskTracker, LLMail-Inject, FIDES) advances new design patterns and assurances.
read more →