All news with the #hugging face tag
Thu, November 13, 2025
Google Cloud expands Hugging Face support for AI developers
🤝 Google Cloud and Hugging Face are deepening their partnership to speed developer workflows and strengthen enterprise model deployments. A new gateway will cache Hugging Face models and datasets on Google Cloud so downloads take minutes, not hours, across Vertex AI and Google Kubernetes Engine. The collaboration adds native TPU support for open models and integrates Google Cloud’s threat intelligence and Mandiant scanning for models served through Vertex AI.
Tue, November 11, 2025
AI startups expose API keys on GitHub, risking models
🔐 New research by cloud security firm Wiz found verified secret leaks in 65% of the Forbes AI 50, with API keys and access tokens exposed on GitHub. Some credentials were tied to vendors such as Hugging Face, Weights & Biases, and LangChain, potentially granting access to private models, training data, and internal details. Nearly half of Wiz’s disclosure attempts failed or received no response. The findings highlight urgent gaps in secret management and DevSecOps practices.
Mon, November 10, 2025
65% of Top Private AI Firms Exposed Secrets on GitHub
🔒 A Wiz analysis of 50 private companies from the Forbes AI 50 found that 65% had exposed verified secrets such as API keys, tokens, and credentials across GitHub and related repositories. Researchers employed a Depth, Perimeter, and Coverage approach to examine commit histories, deleted forks, gists, and contributors' personal repos, revealing secrets that standard scanners often miss. The affected firms are collectively valued at over $400bn.
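The core of the methodology can be reproduced in miniature. Below is a minimal sketch, assuming a local clone and a few illustrative regexes (the `hf_` prefix matches Hugging Face's current token format; the other patterns are simplified stand-ins for a real rule set), that greps a repository's full commit history rather than just the working tree:

```python
import re
import subprocess

# Illustrative patterns only; production scanners use large, validated rule sets.
PATTERNS = {
    "huggingface_token": re.compile(r"\bhf_[A-Za-z0-9]{30,}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*[=:]\s*['\"][A-Za-z0-9_\-]{20,}['\"]"),
}

def scan_repo_history(repo_path: str) -> list[tuple[str, str]]:
    """Scan every patch in the full commit history, not just the current tree:
    leaked secrets often survive only in old commits or deleted files."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--unified=0"],
        capture_output=True, text=True, errors="replace",
    ).stdout
    hits = []
    for line in log.splitlines():
        if not line.startswith("+"):      # only inspect lines a commit added
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, line.strip()[:120]))
    return hits

if __name__ == "__main__":
    for name, snippet in scan_repo_history("."):
        print(f"[{name}] {snippet}")
```

Wiz's Depth, Perimeter, and Coverage approach extends the same idea to deleted forks, gists, and contributors' personal repositories, which is where standard scanners typically stop.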
Thu, November 6, 2025
AI-Powered Malware Emerges: Google Details New Threats
🛡️ Google Threat Intelligence Group (GTIG) reports that cybercriminals are actively integrating large language models into malware campaigns, moving beyond mere tooling to generate, obfuscate, and adapt malicious code. GTIG documents new families — including PROMPTSTEAL, PROMPTFLUX, FRUITSHELL, and PROMPTLOCK — that query commercial APIs to produce or rewrite payloads and evade detection. Researchers also note attackers use social‑engineering prompts to trick LLMs into revealing sensitive guidance and that underground marketplaces increasingly offer AI-enabled “malware-as-a-service,” lowering the bar for less skilled threat actors.
Thu, November 6, 2025
Google Warns: AI-Enabled Malware Actively Deployed
⚠️ Google’s Threat Intelligence Group has identified a new class of AI-enabled malware that leverages large language models at runtime to generate and obfuscate malicious code. Notable families include PromptFlux, which uses the Gemini API to rewrite its VBScript dropper for persistence and lateral spread, and PromptSteal, a Python data miner that queries Qwen2.5-Coder-32B-Instruct to create on-demand Windows commands. GTIG observed PromptSteal used by APT28 in Ukraine, while other examples such as PromptLock, FruitShell and QuietVault demonstrate varied AI-driven capabilities. Google warns this "just-in-time AI" approach could accelerate malware sophistication and democratize cybercrime.
Thu, November 6, 2025
Google: LLMs Employed Operationally in Malware Attacks
🤖 Google’s Threat Intelligence Group (GTIG) reports attackers are using “just‑in‑time” AI—LLMs queried during execution—to generate and obfuscate malicious code. Researchers identified two families, PROMPTSTEAL and PROMPTFLUX, which query Hugging Face and Gemini APIs to craft commands, rewrite source code, and evade detection. GTIG also documents social‑engineering prompts that trick models into revealing red‑teaming or exploit details, and warns the underground market for AI‑enabled crime is maturing. Google says it has disabled related accounts and applied protections.
Wed, November 5, 2025
GTIG: Threat Actors Shift to AI-Enabled Runtime Malware
🔍 Google Threat Intelligence Group (GTIG) reports an operational shift from adversaries using AI for productivity to embedding generative models inside malware to generate or alter code at runtime. GTIG details “just-in-time” LLM calls in families like PROMPTFLUX and PROMPTSTEAL, which query external models such as Gemini to obfuscate, regenerate, or produce one‑time functions during execution. Google says it disabled abusive assets, strengthened classifiers and model protections, and recommends monitoring LLM API usage, protecting credentials, and treating runtime model calls as potential live command channels.
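That last recommendation translates directly into detection logic. Here is a minimal sketch, assuming DNS query logs exported as CSV with `src_host` and `query` columns (the log schema and the endpoint watchlist are illustrative assumptions), that flags unexpected hosts resolving hosted-LLM API endpoints:

```python
import csv
from collections import defaultdict

# Public API endpoints of hosted LLMs; the watchlist here is illustrative.
LLM_API_DOMAINS = {
    "generativelanguage.googleapis.com",  # Gemini API (queried by PROMPTFLUX)
    "api-inference.huggingface.co",       # Hugging Face inference (PROMPTSTEAL)
    "api.openai.com",
    "api.anthropic.com",
}

def flag_llm_callers(dns_log_path: str, allowlist: set[str]) -> dict[str, set[str]]:
    """Group DNS lookups of LLM endpoints by source host, skipping hosts
    that are sanctioned to run AI tooling."""
    lookups = defaultdict(set)
    with open(dns_log_path, newline="") as f:
        for row in csv.DictReader(f):     # assumed columns: src_host, query
            domain = row["query"].rstrip(".").lower()
            if domain in LLM_API_DOMAINS and row["src_host"] not in allowlist:
                lookups[row["src_host"]].add(domain)
    return lookups

if __name__ == "__main__":
    for host, domains in flag_llm_callers("dns.csv", allowlist={"ml-dev-01"}).items():
        print(f"{host} -> {', '.join(sorted(domains))}")
```

In production this would feed a SIEM rule rather than a standalone script, but the principle is the one GTIG describes: treat an unexplained runtime call to a model API like any other suspect command channel.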
Wed, November 5, 2025
Cloud CISO: Threat Actors' Growing Use of AI Tools
⚠️ Google's Threat Intelligence team reports a shift from experimentation to operational use of AI by threat actors, including AI-enabled malware and prompt-based command generation. GTIG highlighted PROMPTSTEAL, linked to APT28 (FROZENLAKE), which queries a Hugging Face LLM to generate scripts for reconnaissance, document collection, and exfiltration, while adopting greater obfuscation and altered C2 methods. Google disabled related assets, strengthened model classifiers and safeguards with DeepMind, and urges defenders to update threat models, monitor anomalous scripting and C2, and incorporate threat intelligence into model- and classifier-level protections.
Thu, October 23, 2025
Hugging Face and VirusTotal: Integrating Security Insights
🔒 VirusTotal and Hugging Face have announced a collaboration to surface security insights directly within the Hugging Face platform. When browsing model files, datasets, or related artifacts, users will now see multi‑scanner results including VirusTotal detections and links to public reports so potential risks can be reviewed before downloading. VirusTotal is also enhancing its analysis portfolio with AI-driven tools such as Code Insight and format‑aware scanners (picklescan, safepickle, ModelScan) to highlight unsafe deserialization flows and other risky patterns. The integration aims to increase visibility across the AI supply chain and help researchers, developers, and defenders build more secure models and workflows.
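The deserialization risk these scanners target can be demonstrated statically. Below is a minimal sketch using only the standard library's `pickletools` (the blocklist is a tiny illustrative subset; picklescan and ModelScan maintain curated rule sets) that lists which globals a pickle file would import, without ever loading it:

```python
import pickletools

# Imports that let a pickle execute code on load; tiny illustrative subset.
DANGEROUS = {("os", "system"), ("builtins", "eval"), ("builtins", "exec"),
             ("subprocess", "Popen")}

def audit_pickle(path: str) -> list[tuple[str, str]]:
    """Statically walk the opcode stream; nothing is deserialized or executed."""
    findings, recent_strings = [], []
    with open(path, "rb") as f:
        for opcode, arg, _pos in pickletools.genops(f):
            if isinstance(arg, str):
                recent_strings.append(arg)
            if opcode.name == "GLOBAL":           # protocol <= 3: arg is "module name"
                module, _, name = arg.partition(" ")
            elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
                module, name = recent_strings[-2:]  # protocol 4+: approximation from
            else:                                   # the preceding pushed strings
                continue
            tag = "CRITICAL" if (module, name) in DANGEROUS else "review"
            findings.append((tag, f"{module}.{name}"))
    return findings

if __name__ == "__main__":
    for tag, ref in audit_pickle("model.bin"):
        print(f"[{tag}] pickle imports {ref}")
```

A supposed weights file that imports `os.system` is an immediate red flag, which is exactly the class of finding this integration surfaces before download.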
Tue, October 14, 2025
Microsoft launches ExCyTIn-Bench to benchmark AI security
🛡️ Microsoft released ExCyTIn-Bench, an open-source benchmarking tool to evaluate how well AI systems perform realistic cybersecurity investigations. It simulates a multistage Azure SOC using 57 Microsoft Sentinel log tables and measures multistep reasoning, tool usage, and evidence synthesis. The benchmark offers fine-grained, actionable metrics for CISOs, product owners, and researchers.
Fri, October 3, 2025
Dataproc ML library: Connect Spark to Gemini and Vertex
🔗 Google has released an open-source Python library, Dataproc ML, to streamline running ML and generative-AI inference from Apache Spark on Dataproc. The library uses a SparkML-style builder pattern so users can configure a model handler (for example, GenAiModelHandler) and call .transform() to apply Gemini or other Vertex AI models directly to DataFrames. It also supports loading PyTorch and TensorFlow model artifacts from GCS for large-scale batch inference and includes performance optimizations such as vectorized data transfer, connection reuse, and automatic retry/backoff.
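The announcement names `GenAiModelHandler` and the `.transform()` call but does not include full code, so the following is a hypothetical sketch: the `dataproc_ml` import path, builder method names, and column names are assumptions; only the SparkML-style handler-then-transform pattern comes from the post.

```python
from pyspark.sql import SparkSession
# Hypothetical import path; check the Dataproc ML docs for the real layout.
from dataproc_ml.inference import GenAiModelHandler

spark = SparkSession.builder.appName("gemini-batch-inference").getOrCreate()
reviews = spark.read.parquet("gs://my-bucket/reviews/")   # placeholder input

# Builder-style configuration, then .transform() on a DataFrame, matching the
# SparkML-like pattern the announcement describes. The method names below
# (model, prompt_col, output_col) are assumptions for illustration.
handler = (
    GenAiModelHandler()
    .model("gemini-2.0-flash")
    .prompt_col("review_text")
    .output_col("summary")
)
summaries = handler.transform(reviews)
summaries.write.parquet("gs://my-bucket/review_summaries/")
```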
Thu, September 25, 2025
Enabling AI Sovereignty Through Choice and Openness Globally
🌐 Cloudflare argues that AI sovereignty should mean choice: the ability for nations to control data, select models, and deploy applications without vendor lock-in. Through its distributed edge network and serverless Workers AI, Cloudflare promotes accessible, low-cost deployment and inference close to users. The company hosts regional open-source models—India’s IndicTrans2, Japan’s PLaMo-Embedding-1B, and Singapore’s SEA-LION v4-27B—and offers an AI Gateway to connect diverse models. Open standards, interoperability, and pay-as-you-go economics are presented as central to resilient national AI strategies.
Tue, September 16, 2025
Gemini and Open-Source Text Embeddings Now in BigQuery ML
🚀 Google expanded BigQuery ML to generate embeddings from Gemini and over 13,000 open-source text-embedding models via Hugging Face, all callable with simple SQL. The post summarizes model tiers to help teams trade off quality, cost, and scalability, and introduces Gemini's Tokens Per Minute (TPM) quota for throughput control. It shows a practical workflow to deploy OSS models to Vertex AI endpoints, run ML.GENERATE_EMBEDDING for batch jobs, and undeploy to minimize idle costs, plus a Colab tutorial and cost/scale guidance.
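The batch step is plain SQL and can be driven from the BigQuery Python client. A minimal sketch follows (project, dataset, and column names are placeholders; it assumes a remote embedding model has already been created over a Vertex AI endpoint, per the post's workflow):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project id

# Assumes a remote embedding model was already created over a Vertex AI
# endpoint (Gemini or a deployed Hugging Face OSS model) via CREATE MODEL.
sql = """
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `my-project.my_dataset.embedding_model`,
  (SELECT review_text AS content FROM `my-project.my_dataset.reviews`),
  STRUCT(TRUE AS flatten_json_output)
)
"""
for row in client.query(sql).result():   # runs as a batch job inside BigQuery
    print(row["content"][:40], len(row["ml_generate_embedding_result"]))
```

Undeploying the Vertex AI endpoint once the batch job finishes is what keeps idle costs near zero in the workflow the post outlines.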
Wed, September 3, 2025
Model Namespace Reuse: Supply-Chain RCE in Cloud AI
🔒 Unit 42 describes a widespread flaw called Model Namespace Reuse that lets attackers reclaim abandoned Hugging Face Author/ModelName namespaces and distribute malicious model code. The technique can lead to remote code execution and was demonstrated against major platforms including Google Vertex AI and Azure AI Foundry, as well as thousands of open-source projects. Recommended mitigations include version pinning, cloning models to trusted storage, and scanning repositories for reusable references.
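Of those mitigations, version pinning is the cheapest to adopt. Here is a minimal sketch using the `revision` parameter that `transformers` and `huggingface_hub` already expose (the repository name and commit SHA are placeholders):

```python
from transformers import AutoModel, AutoTokenizer

REPO = "some-author/some-model"    # placeholder namespace
COMMIT = "9b8c7d6e5f4a3b2c1d0e9f8a7b6c5d4e3f2a1b0c"  # placeholder commit SHA

# Pinning to a full commit SHA (not a branch or tag) makes the fetch
# immutable: if the author account is deleted and the namespace is
# re-registered by an attacker, this revision either resolves to the
# audited snapshot or fails loudly instead of pulling new code.
tokenizer = AutoTokenizer.from_pretrained(REPO, revision=COMMIT)
model = AutoModel.from_pretrained(REPO, revision=COMMIT)
```

Cloning that pinned snapshot into trusted internal storage goes one step further and removes the external namespace dependency entirely.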
Wed, August 27, 2025
Cloudflare's Edge-Optimized LLM Inference Engine at Scale
⚡ Infire is Cloudflare’s new, Rust-based LLM inference engine built to run large models efficiently across a globally distributed, low-latency network. It replaces Python-based vLLM in scenarios where sandboxing and dynamic co-hosting caused high CPU overhead and reduced GPU utilization, using JIT-compiled CUDA kernels, paged KV caching, and fine-grained CUDA graphs to cut startup and runtime cost. Early benchmarks show up to 7% lower latency on H100 NVL hardware, substantially higher GPU utilization, and far lower CPU load while powering models such as Llama 3.1 8B in Workers AI.
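Paged KV caching, one of the techniques credited here, is simple to illustrate. Below is a toy sketch (block size and bookkeeping invented for illustration; Infire itself implements this in Rust against GPU memory): rather than reserving a contiguous KV buffer per sequence, the cache hands out fixed-size blocks on demand and tracks them in a per-sequence block table.

```python
class PagedKVCache:
    """Toy model of a paged KV cache: a shared pool of fixed-size blocks
    plus a per-sequence block table, instead of one contiguous buffer."""

    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))    # pool of physical blocks
        self.tables: dict[int, list[int]] = {}        # seq_id -> block table
        self.lengths: dict[int, int] = {}             # seq_id -> tokens written

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Map a sequence's next token to (physical block, slot), allocating
        a new block only when the previous one is full."""
        n = self.lengths.get(seq_id, 0)
        table = self.tables.setdefault(seq_id, [])
        if n % self.block_size == 0:                  # current block is full
            table.append(self.free_blocks.pop())      # grab a block on demand
        self.lengths[seq_id] = n + 1
        return table[-1], n % self.block_size

    def release(self, seq_id: int) -> None:
        """Finished sequences return all their blocks to the shared pool."""
        self.free_blocks.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8, block_size=4)
for _ in range(6):
    print(cache.append_token(seq_id=0))   # blocks appear only as tokens do
cache.release(0)
```

Because memory is committed per block rather than per worst-case sequence, many models and requests can share a GPU without fragmenting its memory, which is what makes dynamic co-hosting viable.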
Mon, August 25, 2025
vLLM Performance Tuning for xPU Inference Configs Guide
⚙️ This guide from Google Cloud authors Eric Hanley and Brittany Rockwell explains how to tune vLLM deployments for xPU inference, covering accelerator selection, memory sizing, configuration, and benchmarking. It shows how to gather workload parameters, estimate HBM/VRAM needs (example: gemma-3-27b-it ≈57 GB), and run vLLM’s auto_tune to find optimal gpu_memory_utilization and throughput. The post compares GPU and TPU options and includes practical troubleshooting tips, cost analyses, and resources to reproduce benchmarks and HBM calculations.
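The ≈57 GB example is mostly arithmetic over parameter count and dtype. A minimal sketch of the weight-memory part follows (the 5-10% overhead factor is a common rule of thumb, not the guide's exact method, and KV cache comes on top of this, growing with batch size and context length):

```python
def weight_memory_gb(num_params_billions: float, bytes_per_param: int = 2) -> float:
    """Model weights in decimal GB: parameters x bytes per parameter
    (2 bytes for bf16/fp16, 1 for int8, 0.5 for int4)."""
    return num_params_billions * bytes_per_param

# gemma-3-27b-it in bfloat16: ~27B params x 2 bytes = ~54 GB of weights.
weights = weight_memory_gb(27)
print(f"weights ~{weights:.0f} GB; with ~5-10% overhead: "
      f"{weights * 1.05:.0f}-{weights * 1.10:.0f} GB")  # brackets the ~57 GB figure
```

Whatever HBM remains after weights and overhead is what `gpu_memory_utilization` carves up for KV cache, which is why the guide pairs this estimate with vLLM's auto_tune pass.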