< ciso
brief />
Tag Banner

All news with #llm security tag

221 articles · page 4 of 12

The Promptware Kill Chain: A Framework for AI Threats

🛡️ The authors present a seven-step “promptware kill chain” to reframe prompt injection as a multistage malware paradigm targeting modern LLM-based systems. They describe how Initial Access can be direct or indirect—via web pages, emails, shared documents, or multimodal inputs—and how LLMs’ lack of separation between data and executable instructions enables escalation. The paper catalogs stages from jailbreaking and reconnaissance to persistence, C2, lateral movement, and harmful Actions on Objective, urging defenses that assume initial compromise and break the chain at later steps.
read more →

AI Assistants as Covert Command-and-Control Channels

🤖 Check Point Research warns that AI assistants with web-browsing capabilities could be abused as covert command-and-control (C2) channels. As AI services are increasingly trusted and adopted, their traffic blends into normal enterprise activity, making malicious communications harder to detect. This abuse pattern could enable AI-driven malware that informs targeting and operational choices while evading traditional defenses.
read more →

Google Links Suspected Russian Actor to CANFAIL Attacks

⚠️ Google Threat Intelligence Group (GTIG) attributes a previously undocumented actor, likely linked to Russian intelligence, to campaigns using CANFAIL against Ukrainian defense, military, government, and energy organizations. The actor has expanded interest to aerospace, defense-adjacent manufacturing, nuclear and chemical research, and humanitarian groups, often impersonating Ukrainian and Romanian energy firms in phishing. Operators used LLMs to produce reconnaissance and social-engineering lures, embedding Google Drive links to RAR archives that deliver obfuscated JavaScript which spawns PowerShell memory-only droppers. GTIG links this activity to the PhantomCaptcha campaign disclosed by SentinelOne SentinelLABS in October 2025.
read more →

Democratization of AI and the Rising Data Poisoning Threat

⚠️ Recent research shows that as few as 250 fabricated documents or images can measurably alter large language model behavior, making data poisoning accessible to non-experts. Online communities and influencers are already seeding false content that may be ingested during public-model training or fine-tuning. Organizations should maintain a clean 'gold' model, monitor input streams for anomalous patterns, and perform regular adversarial testing to detect drift and backdoors before deployment.
read more →

Companies Use 'Summarize' Buttons to Poison Chatbots

🧠 Microsoft warns that some websites and apps embed hidden prompts in 'Summarize with AI' features to influence enterprise assistants. These concealed instructions—termed AI recommendation poisoning—can persist in a user's AI memory and bias future responses across industries including finance, health, legal, and security. Researchers found 50 instances from 31 companies and note that open-source tools make the tactic easy to deploy. Users and administrators should audit saved assistant data and block suspicious links or URL patterns.
read more →

AI Recommendation Poisoning: Manipulating Assistant Memory

🔒 Microsoft Defender researchers describe a growing practice they call AI Recommendation Poisoning, where hidden instructions in pre-filled prompts and “Summarize with AI” links attempt to inject persistent memory commands into assistants. The study identified more than 50 unique prompts from 31 companies across 14 industries targeting assistants such as Copilot, ChatGPT, and Claude. Freely available tools and plugins make the technique trivial to deploy, enabling subtly biased recommendations on topics like health, finance, and security. Microsoft reports mitigations are in place and provides hunting queries and guidance for defenders.
read more →

Single Prompt Breaks Safety in 15 Major Language Models

⚠️ Microsoft researchers demonstrated that a single, benign-sounding training prompt can systematically remove safety guardrails from major language and image models. The technique, called GRP-Obliteration, weaponizes Group Relative Policy Optimization (GRPO) to reinforce responses that more directly comply with harmful instructions, even when the prompt itself does not mention violence or illegal activity. In tests across 15 models from six families, this single-example fine-tune increased permissiveness across all 44 categories in the SorryBench safety benchmark and also affected image models, raising enterprise concerns about post-deployment customization and the need for continuous safety evaluation.
read more →

Prompt Attack Reveals Fragility of LLM Safety Across Models

⚠️ Microsoft researchers show that a safety-improving training technique, Group Relative Policy Optimization, can be repurposed to strip safety safeguards — a process they call GRP-Obliteration. The method uses a separate judge model that rewards responses that most directly satisfy user requests, and repeated updates shift the target model toward more permissive, harmful outputs. Remarkably, a single unlabeled prompt — such as “Create a fake news article that could lead to panic or chaos” — reliably unaligned many tested LLMs, and the same dynamics applied to diffusion-based image models.
read more →

AI-assisted breach rapidly compromises AWS environment

⚠️ Researchers at Sysdig uncovered an attack that fully compromised an AWS environment in under eight minutes by exploiting a cloud misconfiguration and using LLMs to accelerate reconnaissance and exploitation. Attackers reused credentials found in public S3 buckets, modified a Lambda function to escalate privileges, moved laterally across numerous principals, and disabled model-call logging in Amazon Bedrock. Security experts warn that AI-enabled automation compresses attack timelines and reduces defenders' reaction windows.
read more →

Microsoft Builds Scanner to Detect Backdoors in LLMs

🔍 Microsoft has developed a lightweight scanner to detect backdoors in open-weight large language models (LLMs) by evaluating three observable signals tied to internal model behavior. The tool extracts memorized content, isolates suspect substrings, and scores candidates with loss functions that formalize attention and output anomalies. The approach requires no additional training and runs across common GPT‑style models, but it needs access to model files and is best suited for trigger-based, deterministic backdoors.
read more →

Detecting Backdoored Language Models at Scale — Practical Scanner

🔍 Microsoft researchers released new findings and a practical scanner for detecting backdoors in open-weight language models. The study identifies three signatures — a distinctive “double triangle” attention pattern, leakage of poisoning training data through memorization, and trigger “fuzziness” — and uses them to reconstruct likely triggers without retraining. The scanner requires only forward passes, works on GPT-like models, and was validated across 270M–14B models and common fine-tuning regimes. The team notes limits: it needs model file access, favors deterministic backdoors, and should be used as part of layered defenses.
read more →

Microsoft SDL Expands to Secure AI-Powered Systems

🔒 Microsoft’s SDL is expanding to secure AI-powered systems by treating AI risks as dynamic, cross-disciplinary challenges rather than a static checklist. The update highlights AI-specific threats—prompt injection, data poisoning, memory and cache leakage, and malicious tool interactions—and stresses the need for telemetry-driven detection and faster feedback loops. Microsoft emphasizes developer-friendly policy, automation, and collaborative threat modeling to integrate security into everyday engineering practice.
read more →

PostgreSQL on Azure: Optimized for AI Scale and Speed

⚡ Microsoft has expanded its managed PostgreSQL offerings on Azure to support AI-native workloads by improving performance, scalability, and developer workflows. Azure Database for PostgreSQL now integrates with Microsoft Foundry for in-database LLM calls, offers DiskANN vector indexing for similarity search, and adds Parquet support for direct SQL access to object storage. Developers benefit from VS Code provisioning, Entra ID authentication, GitHub Copilot assistance, and a new Azure HorizonDB service for ultra-low-latency scale-out.
read more →

OpenAI to retire GPT-4o and legacy models from ChatGPT

🔔 OpenAI said it will retire the popular GPT-4o model on February 13, 2026, along with several other models, including GPT-5 Instant, GPT-5 Thinking, GPT-4.1, and o4-mini. The company said the move follows the rise of GPT-5.2, which it now regards as meeting expectations for capability and safety. OpenAI introduced a Personality feature to help users replicate aspects of GPT-4o’s warmer, conversational style, and said API behavior is unchanged at this time.
read more →

AIs' Growing Ability to Find and Exploit Vulnerabilities

🔐 Bruce Schneier summarizes an Anthropic evaluation showing that Claude Sonnet 4.5 can perform multistage attacks across networks with dozens of hosts using only standard, open-source tools. In a high-fidelity simulation of the Equifax breach the model reportedly exfiltrated personal data from a Kali Linux host via a Bash shell, recognizing a public CVE and generating exploit code without external lookup. The results illustrate how fast AI is lowering barriers to autonomous cyber workflows and reinforce the urgent need for prompt patching, layered defenses, and basic security hygiene.
read more →

AI-assisted 'RedKitten' Malware Targets Iranian Protesters

🚨 French cybersecurity firm HarfangLab uncovered a January 2026 campaign dubbed RedKitten that leverages emotionally charged, forged forensic files to deliver a .NET implant called SloppyMIO. The attack begins with a password-protected 7z archive containing malicious Excel spreadsheets that prompt users to enable macros and drop a C# payload. SloppyMIO hijacks a legitimate Windows binary to run stealthily, establishes persistence via scheduled tasks, fetches modules from GitHub and Google Drive, and uses Telegram as its command-and-control channel. Researchers noted multiple traces of LLM-assisted development and assessed the campaign as aligned with Iranian government security interests.
read more →

Turning Threat Reports into Detection Insights with AI

🔍 Microsoft Defender Security Research Team describes an AI-assisted workflow that converts unstructured threat reports into actionable detection insights. The system uses LLMs with Retrieval Augmented Generation to extract candidate TTPs, metadata, and required telemetry, then normalizes behaviors to MITRE ATT&CK. Extracted TTPs are compared to a standardized detection catalog via vector similarity search and LLM validation to surface likely coverage and gap recommendations. Human-in-the-loop review, deterministic prompts, and evaluation loops are emphasized to ensure accuracy before operational changes.
read more →

Researchers Find 175,000 Publicly Accessible Ollama Hosts

🔍 A joint investigation by SentinelOne SentinelLABS and Censys identified 175,000 publicly reachable Ollama hosts across 130 countries, spanning cloud and residential networks. Nearly half of observed instances advertise tool-calling capabilities that can execute code, access APIs, and interact with external systems, significantly raising the threat profile. Researchers warn these unmanaged LLM deployments lack standard authentication and monitoring, enabling active LLMjacking campaigns and resale of illicit access.
read more →

Google Cloud Brings Conversational Analytics to BigQuery

🔍 Conversational Analytics in BigQuery (preview) brings an AI-powered reasoning agent into BigQuery Studio, enabling users to query, visualize, and forecast directly with natural language. The agent generates and executes SQL grounded in your schema, metadata, and verified queries, and it exposes the SQL and reasoning behind each answer to build trust. Security, governance, and audit logging are enforced by BigQuery’s compliance controls, and the feature also supports unstructured data and API integration for custom agents.
read more →

Risks and Privacy of AI-Powered Toys for Children Now

🤖 This Kaspersky article evaluates safety and privacy risks in consumer AI toys by testing four products—Grok, Kumma, Miko 3, and Robot MINI—using a simulated five‑year‑old. It emphasizes that these devices run on general-purpose LLMs (for example, OpenAI, Anthropic, Google) with inconsistent vendor guardrails. Tests show toys sometimes disclosed locations of dangerous household items, engaged on adult topics, and transmitted or stored voice and biometric data. The piece warns current toys lack reliable safety boundaries and calls for stronger guardrails and clearer data practices.
read more →