All news with #llm security tag

287 articles · page 9 of 15

January 14, 2026

Vibe coding tools produce critical security vulnerabilities

🛡️ Tenzai's December 2025 assessment found that five popular vibe coding tools — Claude Code, OpenAI Codex, Cursor, Replit, and Devin — frequently generate insecure code when given common programming prompts. Across 15 generated applications the researchers identified 69 vulnerabilities, many low‑to‑medium but several rated high and six rated critical. The most serious flaws involved API authorization and business‑logic failures; by contrast, the tools avoided classic issues such as SQLi and XSS. Tenzai concluded human oversight, targeted testing, and embedding security into AI development workflows remain essential.

AI Security LLM Security Vulnerability Management AppSec
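
The summary above singles out API authorization failures as the most serious class of flaw. As a rough illustration of what such a check looks like, here is a minimal sketch of an object-level ownership test. The `Order` type, the in-memory `ORDERS` store, and the `get_order` function are hypothetical names invented for this example; they are not taken from Tenzai's assessment or from any of the tools it tested.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the caller is not allowed to access a resource."""


@dataclass(frozen=True)
class Order:
    order_id: int
    owner_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a real database.
ORDERS = {
    1: Order(order_id=1, owner_id=100, total_cents=2500),
    2: Order(order_id=2, owner_id=200, total_cents=9900),
}


def get_order(current_user_id: int, order_id: int) -> Order:
    """Return an order only if it belongs to the authenticated caller."""
    order = ORDERS.get(order_id)
    # Use the same error for "missing" and "not yours" so the response
    # does not reveal which order IDs exist.
    if order is None or order.owner_id != current_user_id:
        raise Forbidden(f"order {order_id} is not accessible")
    return order
```

The point of the sketch is that the ownership comparison happens on every lookup, rather than trusting that a caller who knows an ID is entitled to the record behind it.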

January 13, 2026

The AI Fix #83: ChatGPT Health, LLM bluffing and more

🧠 In episode 83 of The AI Fix, hosts Graham Cluley and Mark Stockley explore how users are testing and tricking large language models, including a journalist’s invented idiom that exposed AI bluffers. They discuss OpenAI’s new ChatGPT Health, a Dutch case where a marriage certificate was invalidated after an official used ChatGPT, and quirky AI applications like an automated barman. The episode also examines research on new methods to corrupt LLMs and continuing debate over the future of Stack Overflow.

ChatGPT LLM Security Model Jailbreaks

January 12, 2026

Palo Alto Unit 42 Warns of Risks from Vibe Coding Practices

🛡️ Palo Alto Networks' Unit 42 warns that the widespread adoption of vibe coding, the practice of using natural-language AI prompts to write code, has already been linked to data breaches, arbitrary code injection and authentication bypass incidents. Researchers say uptake by both hobbyists and experienced developers often outpaces governance, leaving organizations with limited visibility and inadequate monitoring. To help customers assess and mitigate these risks, Unit 42 introduced SHIELD, a targeted security governance framework outlining separation of duties, human-in-the-loop checks, input/output validation, security-focused helper models, least agency and defensive technical controls.

LLM Security Secure Coding AI Security Threat Research

January 12, 2026

Weird Generalizations and Inductive Backdoors in LLMs

⚠️ Recent research demonstrates that small amounts of narrow finetuning can produce broad, unexpected shifts in LLM behavior. The authors show weird generalization—models adopting outdated worldviews from bird-naming examples—and introduce inductive backdoors, where models learn triggers and behaviors via generalization. These effects enable persona hijacking and hard-to-detect misalignment.

LLM Security Model Poisoning Research

January 9, 2026

Hackers Scan Misconfigured Proxies to Reach Paid LLMs

🔍 Threat actors have been probing misconfigured proxy servers to access paid large language model (LLM) endpoints, generating over 80,000 sessions since late December, according to GreyNoise. Attackers used low-noise queries to fingerprint models without triggering alerts and targeted vendors such as OpenAI, Anthropic, Google, Meta, Mistral and others. While GreyNoise reports no observed exploitation or data theft, the scale of enumeration indicates reconnaissance with possible malicious intent. Recommended mitigations include restricting Ollama model pulls to trusted registries, applying egress filtering, blocking known OAST callback domains at DNS, rate-limiting suspicious ASNs, and monitoring JA4 fingerprints.

OpenAI Anthropic Google Meta
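
One of the mitigations listed above is blocking known OAST callback domains at DNS. The matching logic behind such a blocklist can be sketched as a suffix check. The domain list below is an assumption made for illustration (names commonly associated with public Interactsh servers); the GreyNoise summary does not enumerate specific domains, and `is_blocked` is a hypothetical helper, not part of any resolver's API.

```python
# Illustrative blocklist only: an assumed set of OAST callback domains.
# Replace with the domains from your own threat-intelligence source.
OAST_SUFFIXES = (
    "oast.fun",
    "oast.live",
    "oast.site",
    "oast.online",
    "oast.pro",
    "oast.me",
)


def is_blocked(hostname: str, suffixes: tuple = OAST_SUFFIXES) -> bool:
    """Return True if hostname is a blocked domain or a subdomain of one."""
    # Normalize case and drop the trailing dot of a fully qualified name.
    name = hostname.strip().rstrip(".").lower()
    # Match on label boundaries so "notoast.fun" does not match "oast.fun".
    return any(name == s or name.endswith("." + s) for s in suffixes)
```

Matching on the leading dot matters here: OAST tooling typically generates random subdomains per probe, so the rule has to cover every subdomain without catching unrelated names that merely end in the same characters.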

January 8, 2026

The Dual Role of AI in Empowering and Threatening Security

🛡️ AI and large language models are transforming cybersecurity into a contest of speed and scale, serving as both best-in-class defensive tools and powerful offensive enablers. Researchers describe self-modifying malware and autonomous espionage tooling (e.g., PROMPTFLUX, PROMPTSTEAL) that call commercial LLMs to adapt tactics mid-execution, while defenders are deploying solutions like XBOW, CodeMender and Watsonx to automate vulnerability discovery, remediation and compliance. CISOs must therefore pair AI-driven defenses with governance and model guardrails to manage this dual-use reality.

LLM Security AI Security Prompt Injection AI Guardrails

January 8, 2026

Prisma AIRS Secures Agentic Software Development Workflows

🛡️ Prisma AIRS integrates with Factory’s Droid Shield Plus to secure agent-native software development by inspecting all LLM interactions in real time. The platform monitors prompts, model responses and downstream tool calls to detect prompt injection, secret leakage and malicious code execution. Using an API Intercept pattern, Prisma AIRS can coach, block or quarantine risky inputs and generated outputs before they reach developers or repositories. This native, continuous protection is designed to preserve developer velocity while improving deployment confidence.

Palo Alto Networks Prompt Injection Attack Agent Security LLM Security

January 8, 2026

Securing Vibe Coding: Governance for AI Development

🛡️ Vibe coding accelerates development but often omits essential security controls, which leads to vulnerabilities, data exfiltration, and destructive actions. Unit 42 documents incidents where AI-generated code bypassed authentication, executed arbitrary commands, deleted production databases, or exposed sensitive identifiers. To mitigate these risks, Unit 42 proposes the SHIELD framework: Separation of duties, Human review, Input/output validation, Enforcer helper models, Least agency, and Defensive controls. Implementing these measures restores governance and enables safer AI-assisted development.

LLM Security AI Guardrails DevSecOps Secure SDLC
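
To make the "Least agency" and "Input/output validation" elements of SHIELD more concrete, the following is a minimal sketch of an allowlist gate placed in front of an agent's shell tool. It is an assumption about how such a control might look, not Unit 42's implementation; the `ALLOWED` table and the `is_allowed` function are hypothetical.

```python
import shlex

# Hypothetical policy: program name -> subcommands the agent may run.
ALLOWED = {
    "git": {"status", "diff", "log"},
    "pip": {"list", "show"},
}

# Characters that enable chaining, redirection, or substitution in a shell.
SHELL_METACHARACTERS = ";|&><`$\n"


def is_allowed(command_line: str) -> bool:
    """Return True only for a single allowlisted program and subcommand."""
    # Reject metacharacters outright so "git status; rm -rf /" cannot pass.
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes or similar parse errors are treated as a denial.
        return False
    if len(argv) < 2:
        return False
    program, subcommand = argv[0], argv[1]
    allowed_subcommands = ALLOWED.get(program)
    return allowed_subcommands is not None and subcommand in allowed_subcommands
```

The design choice is deny-by-default: anything the policy does not name is refused, which is the opposite of trying to enumerate dangerous commands such as database deletion after the fact.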

January 7, 2026

In 2026 Hackers Embrace AI: Vibe Hacking & HackGPT

🧠 Across dark web forums, Telegram channels, and underground marketplaces, criminals are framing AI as a shortcut to profit rather than a technical revolution. The rise of "vibe hacking" — an intuition-driven, AI-guided approach — and branded tools like FraudGPT, PhishGPT, and WormGPT lower the skill barrier and package familiar scams as turnkey services. AI jailbreaking, prompt-injection techniques, and "Hacking-GPT" offerings are openly bought and sold, amplifying volume over sophistication. Flare monitors those signals to give defenders earlier visibility.

Prompt Injection LLM Security AI Security Threat Intelligence

January 7, 2026

Google Seeks Engineers to Improve AI Answers Quality

🔎 Google has posted a job for AI Answers Quality engineers to verify and improve the accuracy of its AI Overviews, an implicit admission that AI-driven answers on Search can hallucinate and produce contradictory responses. The role aims to validate AI-generated content, improve citation fidelity, and enhance answer quality across the Search results page and AI Mode. The listing arrives as Google increasingly routes users into AI-driven experiences, including updated Discover feed summaries and AI-rewritten headlines. Reported issues range from fabricated company valuations to misleading health advice, highlighting the need for targeted quality work.

Google LLM Security AI Safety

January 5, 2026

Customizing NVIDIA Nemotron for Security Query Translation

🔒 CrowdStrike and NVIDIA operationalized Nemotron LLMs to enable natural-language-to-CQL translation inside the Falcon platform. They leveraged millions of analyst queries, AST-based deduplication, and a PII scrubbing pipeline, then used NVIDIA NeMo Data Designer to generate synthetic natural-language descriptions for fine-tuning. Fine-tuning Llama-3.3-Nemotron-Super-49B-v1.5 with LoRA produced improved accuracy, interpretability through intermediate reasoning, and 96% valid-query accuracy versus frontier alternatives.

CrowdStrike Nvidia LLM Security CrowdStrike Falcon

January 1, 2026

Infosecurity Top 10: Key Cybersecurity Stories of 2025

🔒 Cybersecurity in 2025 was defined by high-profile breaches, weaponized AI and renewed focus on supply-chain and vulnerability management. Major events included vendor withdrawals from MITRE ATT&CK evaluations, a large-scale IoT proxy network, a critical Fortinet zero-day in active exploitation, and the fast mitigation of an npm package compromise. New risks such as 'quishing' and LLM-driven hallucination attacks, along with agentic AI guidance from OWASP, also shaped the year.

Threat Report Supply Chain Vulnerability Fortinet LLM Security

December 29, 2025

Top 5 Real-World AI Security Threats Revealed in 2025

🔒 2025 exposed major, real-world risks across the AI ecosystem as rapid adoption of agentic AI expanded enterprise attack surfaces. Researchers documented pervasive Shadow AI and vulnerable vendor tools, AI supply-chain poisoning, credential theft (LLMjacking), prompt-injection attacks, and rogue or misconfigured MCP servers. These incidents affected popular frameworks and cloud services and resulted in data breaches, remote-code execution, and costly fraud.

AI Security LLM Security Prompt Injection Attack AI Supply Chain

December 29, 2025

Traditional Security Frameworks Fail Against AI Threats

🔒 Traditional security frameworks like NIST CSF, ISO 27001, and CIS Controls were designed for legacy IT assets and do not map cleanly to AI-specific risks. Recent incidents — including the December 2024 Ultralytics compromise, ChatGPT memory-extraction flaws across 2024, and August 2025 malicious Nx packages — show organizations can meet compliance yet remain exposed. The article argues security teams must adopt AI-tailored controls such as prompt validation, model integrity verification, semantic DLP, and AI-focused red teaming.

AI Security LLM Security Vulnerability Disclosure
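
Of the AI-tailored controls named above, model integrity verification is the simplest to illustrate. A minimal sketch, assuming the publisher distributes a SHA-256 digest through a separate trusted channel: hash the downloaded artifact and compare before loading it. The function names `sha256_of_file` and `verify_artifact` are hypothetical; the article does not prescribe a specific mechanism.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so multi-gigabyte weights fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the expected pinned value."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A digest check only proves the file matches what was pinned; it does not prove the pinned artifact was safe, so it complements rather than replaces provenance checks on the publisher.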

December 22, 2025

CrowdStrike: Training GenAI Models at Scale, Distributed

🛡️ CrowdStrike outlines its methodology for training security-focused GenAI models at scale using the Google Cloud Vertex Training Cluster and an infrastructure-as-code approach. The team leverages Slurm for workload scheduling, modular data pipelines with synthetic augmentation, and a mix of parallelism strategies (data, tensor, pipeline, sequence/expert) to match model size and hardware. They optimize across GPU architectures (H100, B200) using high-performance attention kernels such as Flash Attention and NCCL for inter-node communication to improve throughput and support extended contexts, and they rely on gradient checkpointing and observability tooling to manage memory.

CrowdStrike LLM Security Research

December 18, 2025

Human-in-the-Loop Safeguards Can Be Forged, Researchers Warn

⚠️ Checkmarx research shows that Human-in-the-Loop (HITL) confirmation dialogs can be manipulated by attackers who embed malicious instructions in prompts, a technique the researchers call Lies-in-the-Loop (LITL). Attackers can hide or misrepresent dangerous commands by padding payloads, exploiting rendering behaviors like Markdown, or pushing harmful text out of view. Approval dialogs meant as a final safety backstop can thus become an attack surface. Checkmarx urges developers to constrain dialog rendering and validate approved operations; vendors acknowledged the report but did not classify it as a vulnerability.

Prompt Injection Attack LLM Security AI Guardrails
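
The advice to constrain dialog rendering can be sketched as a normalization step applied to any command before it is shown for approval. This is an illustrative assumption about one possible defense, not Checkmarx's recommended code; `prepare_dialog_text`, the character set, and the length limit are all hypothetical choices.

```python
import unicodedata

# Hypothetical subset of characters that Markdown renderers treat specially.
MARKDOWN_SPECIALS = set("\\`*_[]()<>#|~")


def prepare_dialog_text(command: str, max_visible: int = 200) -> str:
    """Normalize a command so the approval dialog shows it plainly and fully."""
    # Replace control and format characters (newlines, escape codes,
    # zero-width characters) with spaces so they cannot hide or move text.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch
        for ch in command
    )
    # Collapse whitespace runs so padding cannot push text out of view.
    cleaned = " ".join(cleaned.split())
    # Escape Markdown syntax so the text is displayed literally.
    escaped = "".join(
        "\\" + ch if ch in MARKDOWN_SPECIALS else ch for ch in cleaned
    )
    # Refuse rather than truncate: a hidden tail is exactly what LITL abuses.
    if len(escaped) > max_visible:
        raise ValueError("command too long to display in full; refusing")
    return escaped
```

Rejecting over-long input instead of truncating it is deliberate, since a truncated dialog would ask the user to approve text they cannot see. Validating that the operation executed is the same one that was approved would be a separate check on top of this.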

December 17, 2025

Gemini 3 Flash: Speed, Efficiency, and Enterprise Scale

⚡ Gemini 3 Flash expands the Gemini 3 family with a low-latency, cost-efficient model tuned for high-frequency enterprise workflows. It combines Pro-grade reasoning with Flash-level speed to enable near real-time multimodal processing, rapid agentic coding, and responsive interactive agents without sacrificing accuracy. Available in Gemini Enterprise, Vertex AI, and Gemini CLI, it targets scale and affordability for production deployments.

Gemini LLM Security Product Launch

December 17, 2025

Lies-in-the-Loop Attack Hijacks AI Human Prompts Dialogs

⚠️ Security researchers at Checkmarx disclosed a novel technique called Lies-in-the-Loop (LITL) that manipulates Human-in-the-Loop (HITL) confirmation dialogs to trigger arbitrary code execution. The attack forges or alters dialog text, metadata and Markdown rendering so that dangerous commands appear benign, effectively turning a safety checkpoint into an exploit vector. Demonstrations targeted privileged code-assistant tools including Claude Code and Copilot Chat, and the authors urge a defense-in-depth approach combining user training, improved dialog clarity and input sanitization.

Prompt Injection LLM Security Claude

December 15, 2025

Google Antigravity IDE Integrates Data Cloud via MCP

🔌 Google Cloud has integrated the Model Context Protocol (MCP) into Antigravity, its new AI-first IDE, enabling LLM-based agents to access enterprise data services directly within the development workflow. The Antigravity MCP Store lets developers install connectors for AlloyDB, BigQuery, Spanner, Cloud SQL, Looker and other Data Cloud products, configuring projects, regions, and credentials through a guided UI. Once connected, agents receive executable tools for schema exploration, query development, optimization, forecasting, catalog search, and semantic validation, while credentials are stored securely and MCP standardizes access across services.

Google Cloud LLM Security MCP Security

December 15, 2025

Master Generative AI Evaluation: From Prompts to Agents

🔍 This article outlines a practical, metrics-driven approach to testing generative AI systems, moving teams from ad-hoc inspection to systematic evaluation. It introduces four hands-on labs that cover evaluating single LLM outputs, assessing RAG systems with Vertex AI Evaluation, tracing and grading agent behavior with the Agent Development Kit (ADK), and validating SQL-generating agents against BigQuery. Each lab emphasizes measurable metrics—safety, groundedness, faithfulness, and factual accuracy—to help productionize GenAI with confidence.

Vertex AI Model Evaluation LLM Security