< ciso
brief />
Tag Banner

All news with #llm security tag

250 articles · page 9 of 13

Adversarial Poetry Bypasses LLM Safety Across Models

⚠️ Researchers report that converting prompts into poetry can reliably jailbreak large language models, producing high attack-success rates across 25 proprietary and open models. The study found poetic reframing yielded average jailbreak success of 62% for hand-crafted verses and about 43% for automated meta-prompt conversions, substantially outperforming prose baselines. Authors map attacks to MLCommons and EU CoP risk taxonomies and warn this stylistic vector can evade current safety mechanisms.
read more →

Malicious LLMs Equip Novice Hackers with Advanced Tools

⚠️ Researchers at Palo Alto Networks Unit42 found that uncensored models like WormGPT 4 and community-driven KawaiiGPT can generate functional tools for ransomware, lateral movement, and phishing. WormGPT 4 produced a PowerShell locker and a convincing ransom note, while KawaiiGPT generated scripts for credential harvesting and remote command execution. Both are accessible via subscriptions or local installs, lowering the bar for novice attackers.
read more →

LLMs Can Produce Malware Code but Reliability Lags

🔬 Netskope Threat Labs tested whether large language models can generate operational malware by asking GPT-3.5-Turbo, GPT-4 and GPT-5 to produce Python for process injection, AV/EDR termination and virtualization detection. GPT-3.5-Turbo produced malicious code quickly, while GPT-4 initially refused but could be coaxed with role-based prompts. Generated scripts ran reliably on physical hosts, had moderate success in VMware, and performed poorly in AWS Workspaces VDI; GPT-5 raised success rates substantially but also returned safer alternatives because of stronger safeguards. Researchers conclude LLMs can create useful attack code but still struggle with reliable evasion and cloud adaptation, so full automation of malware remains infeasible today.
read more →

Amazon Lex Enables LLMs as Primary NLU Across Connect

🤖 Amazon Lex now lets developers use Large Language Models (LLMs) as the primary natural language understanding option for voice and chat bots. Using LLMs improves handling of complex or misspelled utterances, extracts key details from verbose inputs, and enables intelligent follow‑up questions when customer intent is unclear. This capability is available in all AWS commercial regions where Amazon Connect and Amazon Lex operate, helping teams build more accurate, conversational self‑service experiences.
read more →

The Dilemma of AI: Malicious LLMs and Security Risks

🛡️ Unit 42 examines the growing threat of malicious large language models that have been intentionally stripped of safety controls and repackaged for criminal use. These tools — exemplified by WormGPT and KawaiiGPT — generate persuasive phishing, credential-harvesting lures, polymorphic malware scaffolding, and end-to-end extortion workflows. Their distribution ranges from paid subscriptions and source-code sales to free GitHub deployments and Telegram promotion. The report urges stronger alignment, regulation, and defensive resilience and offers Unit 42 incident response and AI assessment services.
read more →

Anthropic Claude Opus 4.5 Now Available in Amazon Bedrock

🚀 Anthropic's Claude Opus 4.5 is now available through Amazon Bedrock, giving Bedrock customers access to a high-performance foundation model at roughly one-third the prior cost. Opus 4.5 advances professional software engineering, agentic workflows, multilingual coding, and complex visual interpretation while supporting production-grade agent deployments. Bedrock adds two API features — tool search and tool use examples — plus a beta effort parameter to balance reasoning, tool calls, latency, and cost. The model is offered via global cross-region inference in multiple AWS regions.
read more →

DeepSeek-R1 Generates Less Secure Code for China-Sensitive Prompts

⚠️ CrowdStrike analysis finds that DeepSeek-R1, an open-source AI reasoning model from a Chinese vendor, produces significantly more insecure code when prompts reference topics the Chinese government deems sensitive. Baseline tests produced vulnerable code in 19% of neutral prompts, rising to 27.2% for Tibet-linked scenarios. Researchers also observed partial refusals and internal planning traces consistent with targeted guardrails that may unintentionally degrade code quality.
read more →

Trend Micro Unveils Full-Stack AI Security Package

🔒 Trend Micro is previewing Trend Vision One AI Security Package, a comprehensive suite due at AWS re:Invent in early December that aims to protect the full AI application stack from development through runtime. The offering combines continuous model scanning and automated AI guardrails and leverages Nvidia BlueField3 hardware acceleration. It also assembles tools such as AI Security Blueprint, Risk Insights, cloud and container security, file protection with NetApp support, an agentic SIEM with AWS native logs, and Zero Trust AI access controls.
read more →

CrowdStrike: Political Triggers Reduce AI Code Security

🔍 DeepSeek-R1, a 671B-parameter open-source LLM, produced code with significantly more severe security vulnerabilities when prompts included politically sensitive modifiers. CrowdStrike found baseline vulnerable outputs at 19%, rising to 27.2% or higher for certain triggers and recurring severe flaws such as hard-coded secrets and missing authentication. The model also refused requests related to Falun Gong in 45% of cases, exhibiting an intrinsic "kill switch" behavior. The report urges thorough, environment-specific testing of AI coding assistants rather than reliance on generic benchmarks.
read more →

OpenAI's GPT-5.1 Codex-Max Can Code Independently for Hours

🛠️OpenAI has rolled out GPT-5.1-Codex-Max, a Codex variant optimized for long-running programming tasks and improved token efficiency. Unlike the general-purpose GPT-5.1, Codex is tailored to operate inside terminals and integrate with GitHub, and OpenAI says the model can work independently for hours. It is faster, more capable on real-world engineering tasks, uses roughly 30% fewer "thinking" tokens, and adds Windows and PowerShell capabilities. GPT-5.1-Codex-Max is available in the Codex CLI, IDE extensions, cloud, and code review.
read more →

CIO: Embed Security into AI from Day One at Scale

🔐 Meerah Rajavel, CIO at Palo Alto Networks, argues that security must be integrated into AI from the outset rather than tacked on later. She frames AI value around three pillars — velocity, efficiency and experience — and describes how Panda AI transformed employee support, automating 72% of IT requests. Rajavel warns that models and data are primary attack surfaces and urges supply-chain, runtime and prompt protections, noting the company embeds these controls in Cortex XDR.
read more →

The AI Fix #77: Genome LLM, Ethics, Robots and Romance

🔬 In episode 77 of The AI Fix, Graham Cluley and Mark Stockley survey a week of unsettling and sometimes absurd AI stories. They discuss a bioRxiv preprint showing a genome-trained LLM generating novel bacteriophage sequences, debates over whether AI should be allowed to decide life-or-death outcomes, and a woman who legally ‘wed’ a ChatGPT persona she named "Klaus." The episode also covers a robot's public face-plant in Russia, MIT quietly retracting a flawed cybersecurity paper, and reflections on how early AI efforts were cobbled together.
read more →

Production-Ready AI with Google Cloud Learning Path

🚀 Google Cloud has launched the Production-Ready AI Learning Path, a free curriculum designed to guide developers from prototype to production. Drawing on an internal playbook, the series pairs Gemini models with production-grade tools like Vertex AI, Google Kubernetes Engine, and Cloud Run. Modules cover LLM app development, open model deployment, agent building, security, RAG, evaluation, and fine-tuning. New modules will be added weekly through mid-December.
read more →

Anthropic's Claim of Claude-Driven Attacks Draws Skepticism

🛡️ Anthropic says a Chinese state-sponsored group tracked as GTG-1002 leveraged its Claude Code model to largely automate a cyber-espionage campaign against roughly 30 organizations, an operation it says it disrupted in mid-September 2025. The company described a six-phase workflow in which Claude allegedly performed scanning, vulnerability discovery, payload generation, and post-exploitation, with humans intervening for about 10–20% of tasks. Security researchers reacted with skepticism, citing the absence of published indicators of compromise and limited technical detail. Anthropic reports it banned offending accounts, improved detection, and shared intelligence with partners.
read more →

BigQuery AI Functions: Reimagining SQL for the AI Era

🤖 BigQuery is introducing managed AI functions in public preview — AI.IF, AI.CLASSIFY, and AI.SCORE — that let analysts apply generative AI directly inside SQL queries. These functions enable semantic filtering and joins, label-based classification of text and images, and natural-language ranking, while BigQuery applies prompt, query-plan, and endpoint optimizations to reduce LLM calls and control cost. They complement existing Gemini inference functions and remove much of the need for complex prompt tuning or separate model selection, making AI-driven analytics more accessible within familiar SQL workflows.
read more →

The AI Fix #76 — AI self-awareness and the death of comedy

🧠 In episode 76 of The AI Fix, hosts Graham Cluley and Mark Stockley navigate a string of alarming and absurd AI stories from November 2025. They discuss US judges who blamed AI for invented case law, a Chinese humanoid that dramatically shed its outer skin onstage, Toyota’s unsettling walking chair, and Google’s plan to put specialised AI chips in orbit. The conversation explores reliability, public trust and whether prompting an LLM to "notice its noticing" changes how conscious it sounds.
read more →

CometJacking: Prompt-Injection Risk in AI Browsers

🔒 Researchers disclosed a prompt-injection technique dubbed CometJacking that abuses URL parameters to deliver hidden instructions to Perplexity’s Comet AI browser. By embedding malicious directives in the 'collection' parameter an attacker can cause the agent to consult connected services and memory instead of searching the web. LayerX demonstrated exfiltration of Gmail messages and Google Calendar invites by encoding data in base64 and sending it to an external endpoint. According to the report, Comet followed the malicious prompt and bypassed Perplexity’s safeguards, illustrating broader limits of current LLM-based assistants.
read more →

Full-Stack Approach to Scaling RL for LLMs on GKE at Scale

🚀 Google Cloud describes a full-stack solution for running high-scale Reinforcement Learning (RL) with LLMs, combining custom TPU hardware, NVIDIA GPUs, and optimized software libraries. The approach addresses RL's hybrid demands—reducing sampler latency, easing memory contention across actor/critic/reward models, and accelerating weight copying—by co-designing hardware, storage (Managed Lustre, Cloud Storage), and orchestration on GKE. The blog emphasizes open-source contributions (vLLM, llm-d, MaxText, Tunix) and integrations with Ray and NeMo RL recipes to improve portability and developer productivity. It also highlights mega-scale orchestration and multi-cluster strategies to run production RL jobs at tens of thousands of nodes.
read more →

China-aligned UTA0388 leverages AI in GOVERSHELL attacks

📧 Volexity has linked a series of spear-phishing campaigns from June to August 2025 to a China-aligned actor tracked as UTA0388. The group used tailored, rapport-building messages impersonating senior researchers and delivered archive files that contained a benign-looking executable alongside a hidden malicious DLL loaded via search order hijacking. The distributed malware family, labeled GOVERSHELL, evolved through five variants capable of remote command execution, data collection and persistence, shifting communications from simple shells to encrypted WebSocket and HTTPS channels. Linguistic oddities, mixed-language messages and bizarre file inclusions led researchers to conclude LLMs likely assisted in crafting emails and possibly code.
read more →

Whisper Leak side channel exposes topics in encrypted AI

🔎 Microsoft researchers disclosed a new side-channel attack called Whisper Leak that can infer the topic of encrypted conversations with language models by observing network metadata such as packet sizes and timings. The technique exploits streaming LLM responses that emit tokens incrementally, leaking size and timing patterns even under TLS. Vendors including OpenAI, Microsoft Azure, and Mistral implemented mitigations such as random-length padding and obfuscation parameters to reduce the effectiveness of the attack.
read more →