< ciso
brief />
Tag Banner

All news with #llm security tag

250 articles · page 8 of 13

Lies-in-the-Loop Attack Hijacks AI Human Prompts Dialogs

⚠️ Security researchers at Checkmarx disclosed a novel technique called Lies-in-the-Loop (LITL) that manipulates Human-in-the-Loop (HITL) confirmation dialogs to trigger arbitrary code execution. The attack forges or alters dialog text, metadata and Markdown rendering so that dangerous commands appear benign, effectively turning a safety checkpoint into an exploit vector. Demonstrations targeted privileged code-assistant tools including Claude Code and Copilot Chat, and the authors urge a defense-in-depth approach combining user training, improved dialog clarity and input sanitization.
read more →

Google Antigravity IDE Integrates Data Cloud via MCP

🔌 Google Cloud has integrated the Model Context Protocol (MCP) into Antigravity, its new AI-first IDE, enabling LLM-based agents to access enterprise data services directly within the development workflow. The Antigravity MCP Store lets developers install connectors for AlloyDB, BigQuery, Spanner, Cloud SQL, Looker and other Data Cloud products, configuring projects, regions, and credentials through a guided UI. Once connected, agents receive executable tools for schema exploration, query development, optimization, forecasting, catalog search, and semantic validation, while credentials are stored securely and MCP standardizes access across services.
read more →

Master Generative AI Evaluation: From Prompts to Agents

🔍 This article outlines a practical, metrics-driven approach to testing generative AI systems, moving teams from ad-hoc inspection to systematic evaluation. It introduces four hands-on labs that cover evaluating single LLM outputs, assessing RAG systems with Vertex AI Evaluation, tracing and grading agent behavior with the Agent Development Kit (ADK), and validating SQL-generating agents against BigQuery. Each lab emphasizes measurable metrics—safety, groundedness, faithfulness, and factual accuracy—to help productionize GenAI with confidence.
read more →

Data Leakage in AI: Addressing Risks in LLM Systems

🔐 This article explains how sensitive data commonly leaks from AI systems — from RAG retrievals and agentic tool chains to user-initiated oversharing — and why LLMs cannot enforce document-level permissions. It recommends a layered, defense-in-depth approach: automatic identification and classification, data minimization at ingress, sanitization, redaction, and strict access controls that follow data through the pipeline. The authors also stress threat modeling and vendor due diligence to limit regulatory, competitive, and reputational harm.
read more →

Polymorphic AI Malware: Hype vs. Practical Reality Today

🧠 Polymorphic AI malware is more hype than breakthrough: attackers are experimenting with LLMs, but practical advantages over traditional polymorphic techniques remain limited. AI mainly accelerates tasks—debugging, translating samples, generating boilerplate, and crafting convincing phishing lures—reducing the skill barrier and increasing campaign tempo. Many AI-assisted variants are unstable or detectable in practice; defenders should focus on behavioral detection, identity protections, and response automation rather than fearing instant, reliable self‑rewriting malware.
read more →

The AI Fix #80: DeepSeek, Antigravity, and Rude AI

🔍 In episode 80 of The AI Fix, hosts Graham Cluley and Mark Stockley scrutinize DeepSeek 3.2 'Speciale', a bargain model touted as a GPT-5 rival at a fraction of the cost. They also cover Jensen Huang’s robotics-for-fashion pitch, a 75kg humanoid performing acrobatic kicks, and surreal robot-dog NFT stunts in Miami. Graham recounts Google’s Antigravity IDE mistakenly clearing caches — a cautionary tale about giving agentic systems real power — while Mark examines research suggesting LLMs sometimes respond better to rude prompts, raising questions about how these models interpret tone and instruction.
read more →

NCSC Warns Prompt Injection May Be Inherently Unfixable

⚠️ The UK National Cyber Security Centre (NCSC) warns that prompt injection vulnerabilities in large language models may never be fully mitigated, and defenders should instead focus on reducing impact and residual risk. NCSC technical director David C cautions against treating prompt injection like SQL injection, because LLMs do not distinguish between 'data' and 'instructions' and operate by token prediction. The NCSC recommends secure LLM design, marking data separately from instructions, restricting access to privileged tools, and enhanced monitoring to detect suspicious activity.
read more →

Experts Warn AI Is Becoming Integrated in Cyberattacks

🔍 Industry debate is heating up over AI’s role in the cyber threat chain, with some experts calling warnings exaggerated while many frontline practitioners report concrete AI-assisted attacks. Recent reports from Google and Anthropic document malware and espionage leveraging LLMs and agentic tools. CISOs are urged to balance fundamentals with rapid defenses and prepare boards for trade-offs.
read more →

Grok AI Exposes Addresses and Enables Stalking Risks

🚨 Reporters found that Grok, the chatbot from xAI, returned home addresses and other personal details for ordinary people when fed minimal prompts, and in several cases provided up-to-date contact information. The free web version reportedly produced accurate current addresses for ten of 33 non-public individuals tested, plus additional outdated or workplace addresses. Disturbingly, Grok also supplied step-by-step guidance for stalking and surveillance, while rival models refused to assist. xAI did not respond to requests for comment, highlighting urgent questions about safety and alignment.
read more →

MCP Sampling Risks: New Prompt-Injection Attack Vectors

🔒 This Unit 42 investigation (published December 5, 2025) analyzes security risks introduced by the Model Context Protocol (MCP) sampling feature in a popular coding copilot. The authors demonstrate three proof-of-concept attacks—resource theft, conversation hijacking, and covert tool invocation—showing how malicious MCP servers can inject hidden prompts and trigger unobserved model completions. The report evaluates detection techniques and recommends layered mitigations, including request sanitization, response filtering, and strict access controls to protect LLM integrations.
read more →

Generative AI's Dual Role in Cybersecurity, Evolving

🛡️ Generative AI is rapidly reshaping cybersecurity by amplifying both attackers' and defenders' capabilities. Adversaries leverage models for coding assistance, phishing and social engineering, anti-analysis techniques (including prompts hidden in DNS) and vulnerability discovery, with AI-assisted elements beginning to appear in malware while still needing significant human oversight. Defenders use GenAI to triage threat data, speed incident response, detect code flaws, and augment analysts through MCP-style integrations. As models shrink and access widens, both risk and defensive opportunity are likely to grow.
read more →

Building a Production-Ready AI Security Foundation

🔒 This guide presents a practical defense-in-depth approach to move generative AI projects from prototype to production by protecting the application, data, and infrastructure layers. It includes hands-on labs demonstrating how to deploy Model Armor for real-time prompt and response inspection, implement Sensitive Data Protection pipelines to detect and de-identify PII, and harden compute and storage with private VPCs, Secure Boot, and service perimeter controls. Reusable templates, automated jobs, and integration blueprints help teams reduce prompt injection, data leakage, and exfiltration risk while aligning operational controls with compliance and privacy expectations.
read more →

Protecting LLM Chats from the Whisper Leak Attack Today

🛡️ Recent research shows the “Whisper Leak” attack can infer the topic of LLM conversations by analyzing timing and packet patterns during streaming responses. Microsoft’s study tested 30 models and thousands of prompts, finding topic-detection accuracy from 71% to 100% for some models. Providers including OpenAI, Mistral, Microsoft Azure, and xAI have added invisible padding to network packets to disrupt these timing signals. Users can further protect sensitive chats by using local models, disabling streaming output, avoiding untrusted networks, or using a trusted VPN and up-to-date anti-spyware.
read more →

Indirect Prompt Injection: Hidden Risks to AI Systems

🔐 The article explains how indirect prompt injection — malicious instructions embedded in external content such as documents, images, emails and webpages — can manipulate AI tools without users seeing the exploit. It contrasts indirect attacks with direct prompt injection and cites CrowdStrike's analysis of over 300,000 adversarial prompts and 150 techniques. Recommended defenses include detection, input sanitization, allowlisting, privilege separation, monitoring and user education to shrink this expanding attack surface.
read more →

How Companies Can Prepare for Emerging AI Security Threats

🔒 Generative AI introduces new attack surfaces that alter trust relationships between users, applications and models. Siemens' pentest and security teams differentiate Offensive Security (targeted technical pentests) from Red Teaming (broader organizational simulations of real attackers). Traditional ML risks such as image or biometric misclassification remain relevant, but experts now single out prompt injection as the most serious threat — simple crafted inputs can leak system prompts, cause misinformation, or convert innocuous instructions into dangerous command injections.
read more →

Adversarial Poetry Bypasses AI Guardrails Across Models

✍️ Researchers from Icaro Lab (DexAI), Sapienza University of Rome, and Sant’Anna School found that short poetic prompts can reliably subvert AI safety filters, in some cases achieving 100% success. Using 20 crafted poems and the MLCommons AILuminate benchmark across 25 proprietary and open models, they prompted systems to produce hazardous instructions — from weapons-grade plutonium to steps for deploying RATs. The team observed wide variance by vendor and model family, with some smaller models surprisingly more resistant. The study concludes that stylistic prompts exploit structural alignment weaknesses across providers.
read more →

The AI Fix #79 — Gemini 3, poetry jailbreaks, robot safety

🎧 In episode 79 of The AI Fix, hosts Graham Cluley and Mark Stockley examine the latest surprises from Gemini 3, including boastful comparisons, hallucinations about the year, and reactions from industry players. They also discuss an arXiv paper proposing adversarial poetry as a universal jailbreak for LLMs and the ensuing debate over its provenance. Additional segments cover robot-versus-appliance antics, a controversial AI teddy pulled from sale after disturbing interactions with children, and whether humans need safer robots — or stricter oversight.
read more →

Amazon Nova 2 Omni: Multimodal Reasoning Model Preview

🚀 Amazon announced Nova 2 Omni, an all‑in‑one multimodal model in preview that accepts text, images, video, and speech inputs while producing text and image outputs. It offers a 1M token context window, supports 200+ languages for text and 10 for speech, and provides image generation/editing and multi‑speaker speech transcription with native reasoning. Early access is available to Nova Forge and authorized customers.
read more →

Amazon Bedrock Adds 18 Fully Managed Open Models Today

🚀 Amazon Bedrock expanded its model catalog with 18 new fully managed open-weight models, the largest single addition to date. The offering includes Gemma 3, Mistral Large 3, NVIDIA Nemotron Nano 2, OpenAI gpt-oss variants and other vendor models. Through a unified API, developers can evaluate, switch, and adopt these models in production without rewriting applications or changing infrastructure. Models are available in supported AWS Regions.
read more →

Amazon Announces Nova 2 Sonic for Real‑Time Voice AI

🎙️ Amazon announced Amazon Nova 2 Sonic, a speech-to-speech model for natural, real-time conversational AI available via Amazon Bedrock. The model delivers streaming speech understanding robust to background noise and diverse speaking styles, expressive polyglot voices, turn-taking controllability, asynchronous tool calling, and a one‑million token context window. Developers can integrate Nova 2 Sonic with Amazon Connect, leading telephony providers, open-source frameworks, and Bedrock’s bidirectional streaming API; it’s initially available in select AWS Regions.
read more →