All news with #llm security tag

249 articles

Trustpilot’s real-time data enrichment with Gemma

🧩 Trustpilot built a high-volume streaming pipeline using fine-tuned Gemma models to process millions of user reviews in near real-time under tight latency and cost constraints. The team replaced variable per-token pricing with fixed infrastructure costs, fine-tuned lightweight models for tasks like NER, sentiment, and topic classification, and separated classifier and LLM endpoints. Performance tuning, vLLM optimizations, and load testing enabled scalable inference despite challenges with private networking, deployment observability, and GPU availability.

OpenAI GPT-5.5, GPT-5.4 and Codex now on Bedrock

🚀 Amazon Bedrock now supports OpenAI GPT-5.5, GPT-5.4, and Codex for production use, offering the same AWS security, governance, and operational controls. GPT-5.5 delivers advanced capabilities for agentic coding, data analysis, and multi-step autonomous tasks on a next-generation inference engine. Codex is available via a dedicated App, CLI, and IDE integrations for Visual Studio Code, JetBrains, and Xcode, and can be configured to run through Bedrock with pricing aligned to OpenAI first-party rates.

Greyvibe: Russian-linked group using AI in attacks

🛡️ Researchers from WithSecure uncovered a Russian-aligned group dubbed Greyvibe that extensively leverages large language models across its campaigns targeting private, government, and military organizations in Ukraine. The group uses spear phishing, fake websites, malicious archives, and ClickFix-style CAPTCHAs to deliver custom malware such as PhantomRelay, LegionRelay, and Android spyware FallSpy. Observed tooling and infrastructure indicate systematic use of generative AI for lure creation, code development, and backend setup, blurring lines between state-aligned activity and cybercrime ecosystem actors.

Major LLMs Vulnerable to Multi-Turn Bypass

🔒 Cisco researchers warn that safety guardrails in several leading large language models (LLMs) can be bypassed through multi-turn conversations. They tested frontier models including ChatGPT, Claude, Gemini, Nova and Grok, finding many were susceptible to manipulation that yields disallowed outputs. Techniques such as roleplay, ambiguity, reframing, and persona adoption were effective, and model configuration affected resilience.

Anthropic's Mythos model edging toward public release

🛡️ Anthropic appears to be preparing a public rollout of its restricted Mythos model, which the company warned poses major security risks by automating high-quality cyberattacks. Announced in April as an advanced frontier model, Mythos showed dramatic improvements in code reasoning and autonomy compared to Opus 4.7. References briefly appeared in Claude Code and Claude Security, suggesting a controlled preview, while Anthropic builds guardrails and works with partners through its Glasswing initiative.

Protect GenAI Chatbots with Check Point WAF

🛡️ Check Point explains why GenAI chatbots create new security risks by acting as a front door to internal systems and data. The post highlights real incidents—prompt injection, data exposure, and misleading responses—that demonstrate legal, financial, and reputational impacts. It describes how Check Point WAF extends unified application and API security into the conversational layer to detect and block malicious prompts, prevent data leaks, and control unsafe outputs.

Google AI Edge Portal Adds On‑Device LLM Benchmarking

🚀 Google AI Edge Portal now enables developers to benchmark and debug on-device LLMs across a physical lab of over 120 representative Android devices. It profiles initialization time, prefill and decode speeds, and peak memory usage across CPU, GPU, and NPU backends to surface real user-impacting metrics. The integrated Model Explorer visualizes model graphs, tensor shapes, and traces to speed root-cause analysis and collaboration.

Image-only Prompt Injection Threatens Multimodal AI

🔍 Researchers from Xidian University describe a new image-based prompt injection called CrossMPI that uses near-imperceptible pixel perturbations to alter how large vision-language models interpret both visual and textual inputs. The technique targets intermediate multimodal fusion layers rather than final outputs, misleading LVLMs without modifying text prompts. Tests show strong black-box transferability and high success rates across several open-source models, while common defenses reduce but do not fully eliminate the threat.

UK Regulators Warn Financial Firms on Frontier AI Risks

⚠️ On May 15 the UK government, the Financial Conduct Authority and the Bank of England issued a joint warning about cybersecurity threats from frontier AI. They noted that models can outperform skilled practitioners while working at greater speed and scale and at lower cost, amplifying risks to firms, customers and financial stability. The statement urges firms to strengthen governance, vulnerability management, third-party controls, and protection and response capabilities, and points to NCSC resources and prior resilience guidance.

Cloudflare Findings on Frontier Cybersecurity LLMs

🔍 Cloudflare tested security-focused LLMs on its infrastructure and reports detailed findings from using Anthropic’s Mythos Preview as part of Project Glasswing. The model stood out for exploit chain construction and automated proof generation, producing runnable PoCs and iterating on failures. Its emergent guardrails proved inconsistent across runs and prompts, so Cloudflare built a tailored harness and additional safeguards to scale safely. The team also observed higher-quality, actionable findings compared with earlier frontier models, but noted increased noise from memory-unsafe languages and model bias.

AI Finds 18-Year-Old Remote Code Execution Flaw in Nginx

🔍 Researchers using an LLM-powered platform discovered a critical 18-year-old heap buffer overflow in Nginx that can enable remote code execution under certain conditions. Tracked as CVE-2026-42945, it resides in ngx_http_rewrite_module and affects versions 0.6.27 through 1.30.0. Patches were released in 1.31.0 and 1.30.1 and in Nginx Plus releases; several F5 products remain pending updates. Exploitation can cause server crashes and, without ASLR, may allow arbitrary code execution.
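The version range in this summary can be turned into a quick triage check. The following is a minimal sketch, assuming only the bounds quoted above (0.6.27 through 1.30.0 affected, 1.30.1 and 1.31.0 patched). The function names are hypothetical, Nginx Plus release numbering is not handled, and a distribution that backports the fix without changing the version string would be misreported.

```python
from typing import Tuple

# Bounds copied from the summary above; this sketch does not query nginx itself.
FIRST_AFFECTED = (0, 6, 27)
LAST_AFFECTED = (1, 30, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.30.0' or 'nginx version: nginx/1.30.0 (Ubuntu)' into (1, 30, 0)."""
    # Keep what follows the last '/', then drop any trailing suffix such as '(Ubuntu)'.
    number = text.strip().split("/")[-1].split()[0]
    return tuple(int(part) for part in number.split("."))


def is_affected(version: str) -> bool:
    """Return True when the open source nginx version falls in the reported range."""
    parsed = parse_version(version)
    return FIRST_AFFECTED <= parsed <= LAST_AFFECTED


if __name__ == "__main__":
    for candidate in ("0.6.26", "1.24.0", "1.30.0", "1.30.1", "1.31.0"):
        status = "affected" if is_affected(candidate) else "outside reported range"
        print(candidate, status)
```

A string such as the output of `nginx -v` can be passed in directly. Treat the result as a first filter only and confirm against the vendor advisory.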

AI Hallucinations Introduce Critical Security Risks

⚠️ AI hallucinations—confident but incorrect outputs—are increasingly driving risky decisions in critical infrastructure and cybersecurity operations, exploiting human trust in authoritative-sounding responses. A 2025 AA-Omniscience benchmark of 40 models found most systems were more likely to offer a confident wrong answer on difficult questions, underscoring that AI outputs must be treated as potential vulnerabilities until vetted. Effective controls include enforced human review before sensitive actions, treating training data as a security asset, strict least-privilege for AI systems, and prompt-engineering training to reduce ambiguous inputs.

Assessing the Risks of Anthropic’s Mythos AI Capabilities

🔍 Anthropic’s announcement that Claude Mythos Preview will not be released publicly underscores both genuine capability and strategic constraint. Independent testing and reproductions suggest similar performance from OpenAI’s GPT-5.5 and smaller community models, while Mythos’ cost and corporate incentives shape access. These generative systems dramatically improve automated vulnerability discovery, empowering both attackers and defenders. Mozilla’s use found 271 flaws, but many devices remain unpatchable, so organizations must adapt quickly.

GPT-5.5 Matches Mythos in Security Vulnerability Tests

🔍 The UK’s AI Security Institute evaluated GPT-5.5’s ability to identify software security vulnerabilities and concluded it performs comparably to Claude Mythos, based on a series of red-team style tests and benchmark prompts. The assessment highlights that GPT-5.5 is generally available from OpenAI, making high-quality automated vulnerability detection more accessible to organizations and researchers. The Institute also analyzed a smaller, cheaper model which, when given additional prompting scaffolding and careful supervision, delivered similar detection performance. Overall, the study suggests parity among leading LLMs for initial vulnerability discovery, with differences largely hinging on prompt engineering and deployment context.

AI Security Must Shift From Posture to Behavior Now

🔐 The article warns that AI security is repeating the endpoint-era mistake of focusing primarily on posture controls—model cards, SBOMs, guardrails and access policies—while overlooking how systems actually behave. It argues that behavioral detection is essential, monitoring sequences of actions, data access patterns, tool invocations and output drift. The AI surface is expanding rapidly with open-source LLMs, third-party APIs, RAG pipelines and autonomous agents, creating "shadow AI" and dynamic risks. The recommendation is to keep posture as table stakes but prioritize logging, behavioral baselines and SOC integration to turn findings into actionable incidents.

Critical Ollama GGUF Vulnerability Exposes Heap Data

⚠️ Security researchers disclosed a critical out-of-bounds read in Ollama that can leak process memory and is tracked as CVE-2026-7482 (CVSS 9.1), dubbed "Bleeding Llama". The flaw arises in the GGUF model loader's WriteTo() flow due to use of the unsafe package, allowing a crafted model upload to read past heap bounds. Successful exploitation can reveal environment variables, API keys, prompts, and user conversation data, which can then be exfiltrated via the /api/push endpoint. Users are urged to apply fixes, restrict network exposure, and place an authentication proxy in front of Ollama instances.

Pen Tests Reveal AI Flaws More Severe Than Legacy Bugs

🔒 Penetration testing shows AI and LLM deployments contain a disproportionate share of severe vulnerabilities. Cobalt’s State of Pentesting Report finds 32% of LLM findings rated high risk versus 13% for legacy enterprise tests, and only 38% of those high-risk LLM issues are remediated. Experts point to emerging attack surfaces — notably prompt injection, now OWASP’s top LLM risk — broader blast radii from model integrations, and fragmented ownership for fixes. Recommended countermeasures include threat modeling, red teaming, least-privilege access, strict output validation, and human approval gates for high-consequence actions.
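Two of the countermeasures listed above, least-privilege access and human approval gates, can be illustrated with a small dispatcher. This is a minimal sketch under stated assumptions, not something taken from the Cobalt report: the tool names, the `HIGH_CONSEQUENCE` set, and the `approve` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

# Hypothetical list of actions that must never run without a human decision.
HIGH_CONSEQUENCE: Set[str] = {"delete_records", "send_payment"}


@dataclass
class ToolCall:
    name: str       # tool the model asked for
    argument: str   # single string argument, kept simple for the sketch


def run_tool_call(
    call: ToolCall,
    tools: Dict[str, Callable[[str], str]],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run a model-requested tool call behind an allowlist and an approval gate."""
    # Least privilege: only tools registered for this agent can run at all.
    if call.name not in tools:
        return f"denied: {call.name} is not an allowed tool"
    # Approval gate: high-consequence actions need an explicit human yes.
    if call.name in HIGH_CONSEQUENCE and not approve(call):
        return f"blocked: {call.name} needs human approval"
    return tools[call.name](call.argument)


if __name__ == "__main__":
    registry = {
        "lookup_order": lambda arg: f"order {arg}: shipped",
        "send_payment": lambda arg: f"paid {arg}",
    }
    deny_all = lambda call: False  # stands in for a reviewer who declines
    print(run_tool_call(ToolCall("lookup_order", "42"), registry, deny_all))
    print(run_tool_call(ToolCall("send_payment", "100"), registry, deny_all))
    print(run_tool_call(ToolCall("drop_database", "prod"), registry, deny_all))
```

In a real deployment the `approve` callback would route to a ticketing or chat workflow, and output validation would be applied to the tool result before it returns to the model.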

Amazon Bedrock AgentCore Memory Adds Metadata for LTM

🧠 Amazon Bedrock AgentCore Memory now supports metadata on long-term memory (LTM) records, enabling agents to tag, filter, and retrieve memories using structured attributes alongside semantic search. You can define up to ten indexed keys per memory resource with STRING, NUMBER, and STRING_LIST types and apply operator filters to refine retrieval results. Metadata can be attached at ingestion or inferred automatically by the LLM using extraction instructions defined on the memory resource. This capability is available today in all AWS Regions where AgentCore Memory is supported.
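The constraints in this summary (at most ten indexed keys, typed as STRING, NUMBER, or STRING_LIST, used to filter retrieved memories) can be modeled locally. The following is a minimal sketch that does not call AWS and does not reproduce the AgentCore Memory API: every function name and the record layout are assumptions made for illustration.

```python
from typing import Dict, List, Union

# Limits quoted in the summary above.
ALLOWED_TYPES = {"STRING", "NUMBER", "STRING_LIST"}
MAX_INDEXED_KEYS = 10

MetadataValue = Union[str, int, float, List[str]]
Record = Dict[str, MetadataValue]


def validate_schema(schema: Dict[str, str]) -> None:
    """Raise ValueError when a key-to-type schema breaks the stated limits."""
    if len(schema) > MAX_INDEXED_KEYS:
        raise ValueError(f"{len(schema)} keys exceeds the limit of {MAX_INDEXED_KEYS}")
    for key, type_name in schema.items():
        if type_name not in ALLOWED_TYPES:
            raise ValueError(f"key {key!r} has unsupported type {type_name!r}")


def matches_type(value: object, type_name: str) -> bool:
    """Check a metadata value against one of the three declared types."""
    if type_name == "STRING":
        return isinstance(value, str)
    if type_name == "NUMBER":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "STRING_LIST":
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    return False


def filter_records(records: List[Record], key: str, wanted: MetadataValue) -> List[Record]:
    """Keep records whose value equals `wanted`, or contains it for list values."""
    kept = []
    for record in records:
        value = record.get(key)
        if isinstance(value, list):
            if wanted in value:
                kept.append(record)
        elif value == wanted:
            kept.append(record)
    return kept


if __name__ == "__main__":
    validate_schema({"team": "STRING", "priority": "NUMBER", "tags": "STRING_LIST"})
    memories = [
        {"team": "payments", "priority": 1, "tags": ["pii", "eu"]},
        {"team": "search", "priority": 3, "tags": ["eu"]},
    ]
    print(filter_records(memories, "tags", "pii"))
```

In the service itself this filtering would sit alongside semantic search; the sketch shows only the structured-attribute half.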

Poisoned Truth: The Quiet Threat to Enterprise AI Security

⚠️ Enterprise AI deployments face a quiet but serious integrity risk when models learn or retrieve false information: data poisoning and widespread data pollution can make LLMs produce plausible but incorrect outputs. This threat spans training datasets, RAG and retrieval layers, agent memory, and internal knowledge bases — and often originates from stale, conflicting, or poorly governed sources rather than deliberate attacks. Security leaders are urged to map all context sources, treat AI inputs as a supply chain, tighten data hygiene, and assign clear governance to identify and remediate corrupted truth.

Defending Against Attacks from Frontier AI Models: Readiness

🔒 A new generation of frontier AI models is changing how cyberattacks are developed, enabling speed, scale, and accessibility previously unseen. Early testing of advanced models, including Anthropic’s Claude Mythos, shows they can identify code vulnerabilities, map attack paths, and generate working exploits with minimal effort. Organizations must treat these as fully AI-powered attacks and prioritize proactive readiness, detection, and mitigation strategies.