All news with #nvidia tag

104 articles · page 5 of 6

November 14, 2025

Copy-Paste RCE Flaw Impacts Major AI Inference Servers

🔒 Cybersecurity researchers disclosed a chain of remote code execution (RCE) vulnerabilities affecting AI inference frameworks from Meta, NVIDIA, and Microsoft, as well as open-source projects such as vLLM and SGLang. The flaws stem from reused code that called ZeroMQ’s recv_pyobj() method, which passes received data directly into Python’s pickle.loads(), enabling unauthenticated RCE over exposed sockets. Vendors have released patches replacing unsafe pickle usage with JSON-based serialization and adding authentication and transport protections. Operators are urged to upgrade to patched releases, harden ZMQ channels, restrict network exposure, and avoid deserializing untrusted data.
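The pattern described above can be illustrated without any network code. The sketch below is an assumed reconstruction for illustration only, not the affected vendors' code: `Exploit` and `decode_message` are hypothetical names, and the harmless `int("1337")` call stands in for whatever command an attacker would choose. It shows why calling `pickle.loads()` on untrusted bytes is dangerous and how JSON parsing with a basic shape check rejects the same payload.

```python
import json
import pickle


class Exploit:
    """Hypothetical attacker object.

    pickle records the callable returned by __reduce__ when serializing
    and invokes it with the given arguments when deserializing.
    """

    def __reduce__(self):
        # Harmless stand-in; a real attacker could name os.system here.
        return (int, ("1337",))


def decode_message(raw: bytes) -> dict:
    """Hypothetical safe decoder: parse JSON and accept only a dict payload."""
    message = json.loads(raw.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("unexpected message shape")
    return message


malicious = pickle.dumps(Exploit())

# Unsafe: loading the bytes runs the callable chosen by the sender.
unsafe_result = pickle.loads(malicious)

# Safe: the same bytes are rejected because they are not valid JSON text.
# UnicodeDecodeError and json.JSONDecodeError are both ValueError subclasses.
try:
    decode_message(malicious)
    rejected = False
except ValueError:
    rejected = True

# A well-formed JSON request is accepted as plain data.
safe_result = decode_message(b'{"op": "generate", "prompt": "hello"}')
```

JSON parsing only ever produces plain data (dicts, lists, strings, numbers), so the sender cannot select code for the receiver to run. It does not replace authentication or network restrictions on the socket.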

Meta Nvidia Microsoft Remote Code Execution

November 13, 2025

AWS Expands EC2 G6f NVIDIA L4 GPU Instances to More Regions

🚀 Amazon Web Services has expanded availability of EC2 G6f instances powered by NVIDIA L4 GPUs to Europe (Spain) and Asia Pacific (Seoul), improving access for graphics and visualization workloads. G6f instances support GPU partitions as small as one-eighth of a GPU with 3 GB of GPU memory, enabling finer-grained right-sizing and cost savings compared to single‑GPU options. Instances are offered in multiple sizes paired with third‑generation AMD EPYC processors, and are purchasable as On‑Demand, Spot, or via Savings Plans; customers should use NVIDIA GRID driver 18.4 or later to launch these instances.

AWS Nvidia Product Update

November 12, 2025

Microsoft unveils Fairwater AI datacenter in Atlanta

🚀 Microsoft announced the new Fairwater Azure AI datacenter in Atlanta, Georgia, expanding its planet-scale AI superfactory. The purpose-built facility integrates massive NVIDIA Blackwell GPU clusters on a single flat network and uses rack-level direct liquid cooling plus a two-story layout to maximize compute density and reduce latency. It also connects via a dedicated AI WAN to enable cross-site fungibility and dynamic workload allocation.

Microsoft Azure Nvidia

October 28, 2025

Check Point's AI Cloud Protect with NVIDIA BlueField

🔒 Check Point has made AI Cloud Protect powered by NVIDIA BlueField available for enterprise deployment, offering DPU-accelerated security for cloud AI workloads. The solution aims to inspect and protect GenAI traffic and prompts to reduce data exposure risks while integrating with existing cloud environments. It targets prompt manipulation and infrastructure attacks at scale and is positioned for organizations building AI factories.

Check Point Nvidia AI Security Cloud Security

October 28, 2025

Google Cloud launches managed DRANET for GKE with A4X Max

🚀 Google Cloud is previewing managed DRANET on GKE, enabling Kubernetes to treat high-performance RDMA network interfaces as schedulable resources. The integration aligns NICs and GPUs by NUMA topology to reduce latency and increase throughput, while abstracting away operational complexity. It launches with the new A4X Max instances to deliver topology-aware networking for large multi-GPU AI workloads. Developers can request specific network interfaces in pod specs and rely on GKE to co-schedule NICs and accelerators, improving utilization and simplifying operations.

Google Cloud Google Kubernetes Engine Nvidia Network Security

October 28, 2025

A4X Max, GKE Networking, and Vertex AI Training Now Shipping

🚀 Google Cloud is expanding its NVIDIA collaboration with the new A4X Max instances powered by NVIDIA GB300 NVL72, delivering 72 GPUs with high‑bandwidth NVLink and shared memory for demanding multimodal reasoning. GKE now supports DRANET for topology‑aware RDMA scheduling and integrates NVIDIA NeMo Guardrails into GKE Inference Gateway, while Vertex AI Model Garden will host NVIDIA Nemotron models. Vertex AI Training adds NeMo and NeMo‑RL recipes and a managed Slurm environment to accelerate large‑scale training and deployment.

Google Cloud Google Kubernetes Engine Vertex AI Nvidia

October 28, 2025

Microsoft and NVIDIA Deepen AI Infrastructure Partnership

🚀 Microsoft and NVIDIA announced expanded AI infrastructure on Azure, bringing NVIDIA RTX PRO 6000 Blackwell Server Edition to Azure Local, new Nemotron and Cosmos models via Azure AI Foundry, and broader support for Run:ai and GB300 NVL72 supercomputing clusters. These updates enable on-premises and edge AI with cloud-like management, improved GPU utilization, and infrastructure tailored for frontier reasoning, multimodal workloads, and real-time inferencing. Microsoft also highlighted NVIDIA Dynamo optimizations for ND GB200-v6 VMs to boost inference throughput at scale.

Microsoft Nvidia Azure Azure AI Foundry

October 28, 2025

Securing the AI Factory: Palo Alto Networks and NVIDIA

🔒 Palo Alto Networks outlines a platform-centric approach to protect the enterprise AI Factory, announcing integration of Prisma AIRS with NVIDIA BlueField DPUs. The collaboration embeds distributed zero-trust security directly into infrastructure, delivering agentless, penalty-free runtime protection and real-time workload threat detection. Validated on NVIDIA RTX PRO Server and optimized for BlueField‑3, with BlueField‑4 forthcoming, the solution ties into Strata Cloud Manager and Cortex for end-to-end visibility and control, aiming to secure AI operations at scale without compromising performance.

Palo Alto Networks Nvidia Zero Trust AI Security

October 20, 2025

Google Cloud G4 VMs: NVIDIA RTX PRO 6000 Blackwell GA

🚀 The G4 VM is now generally available on Google Cloud, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and offering up to 768 GB of GDDR7 memory per instance class. It targets latency-sensitive and regulated workloads for generative AI, real-time rendering, simulation, and virtual workstations. Features include FP4 precision support, Multi-Instance GPU (MIG) partitioning, an enhanced PCIe P2P interconnect for faster multi‑GPU All-Reduce, and an NVIDIA Omniverse VMI on Marketplace for industrial digital twins.

Google Cloud Nvidia Product Launch

October 20, 2025

G4 VMs: High-performance P2P Fabric for Multi‑GPU Workloads

🚀 Google Cloud's newly GA G4 VMs combine NVIDIA RTX PRO 6000 Blackwell GPUs with a custom, software-defined PCIe fabric to enable high-performance peer-to-peer (P2P) GPU communication. The platform accelerates collective operations like All-Gather and All-Reduce without code changes, delivering up to 2.2x faster collectives. For tensor-parallel inference, customers can see up to 168% higher throughput and up to 41% lower inter-token latency. G4 integrates with GKE Inference Gateway for horizontal scaling and production deployments.
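The percentages above are easier to compare as multipliers on a baseline. The following is a small arithmetic sketch; the baseline figures (1,000 tokens per second and 50 ms between tokens) are hypothetical values chosen only for illustration and do not come from Google Cloud.

```python
def higher_by(percent: float) -> float:
    """Convert 'X% higher' into a multiplier on the baseline."""
    return 1.0 + percent / 100.0


def lower_by(percent: float) -> float:
    """Convert 'X% lower' into a multiplier on the baseline."""
    return 1.0 - percent / 100.0


# Figures quoted in the summary, both stated as "up to".
throughput_multiplier = higher_by(168)  # about 2.68x the baseline throughput
latency_multiplier = lower_by(41)       # about 0.59x the baseline latency

# Hypothetical baselines, for illustration only.
baseline_tokens_per_second = 1000.0
baseline_inter_token_ms = 50.0

best_case_tokens_per_second = baseline_tokens_per_second * throughput_multiplier
best_case_inter_token_ms = baseline_inter_token_ms * latency_multiplier

print(round(best_case_tokens_per_second, 1), round(best_case_inter_token_ms, 1))
```

Note that "168% higher" means roughly 2.68 times the baseline, not 1.68 times.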

Google Cloud Nvidia Product Update

October 15, 2025

Google Cloud and NVIDIA Power AI Innovation Week in D.C.

🤝 At the end of October in Washington, D.C., Google Cloud and NVIDIA will lead a week of events highlighting advances in AI, high-performance computing, and secure mission deployments. NVIDIA GTC DC (Oct. 27–29) features keynotes, demos, and hands-on sessions showcasing next-generation models and infrastructure. The Google Public Sector Summit (Oct. 29) convenes government leaders to explore practical uses of technologies like Gemini for Government and discuss secure, scalable AI adoption for mission impact.

Google Cloud Nvidia AI Governance

October 14, 2025

Microsoft Advances Open Standards for Frontier AI Scale

🔧 Microsoft details OCP contributions to accelerate open-source infrastructure for frontier-scale AI, focusing on power, cooling, networking, security, and sustainability. It highlights innovations such as solid-state transformers, a power-stabilization paper with OpenAI and NVIDIA, and a next-generation HXU for liquid cooling. Networking efforts include ESUN and scale-up Ethernet workstreams, while security contributions introduce Caliptra 2.1, Adams Bridge 2.0, and L.O.C.K. The post also advances fleet lifecycle management, carbon accounting, and waste-heat reuse for globally deployable AI datacenters.

Microsoft OpenAI Nvidia AI Governance

October 9, 2025

Microsoft Azure Debuts Large-Scale NVIDIA GB300 Cluster

🚀 Microsoft Azure announced the first production-scale cluster using more than 4,600 NVIDIA Blackwell Ultra GPUs in GB300 NVL72 systems, co-engineered with NVIDIA to support OpenAI and other frontier AI workloads. The new ND GB300 v6 VMs are optimized for reasoning models, agentic systems, and multimodal generative AI, delivered on rack-scale systems with 72 GPUs and 36 NVIDIA Grace CPUs per rack. Microsoft says this infrastructure will shorten training from months to weeks and will scale to hundreds of thousands of Blackwell Ultra GPUs globally.

Microsoft Azure Nvidia OpenAI

October 1, 2025

Cisco Talos Discloses Multiple Nvidia and Adobe Flaws

⚠ Cisco Talos disclosed five vulnerabilities in NVIDIA's CUDA Toolkit components and one use-after-free flaw in Adobe Acrobat Reader. The NVIDIA issues affect tools like cuobjdump (12.8.55) and nvdisasm (12.8.90), where specially crafted fatbin or ELF files can trigger out-of-bounds writes, heap overflows, and potential arbitrary code execution. The Adobe bug (2025.001.20531) involves malicious JavaScript in PDFs that can reuse freed objects, leading to memory corruption and possible remote code execution if a user opens a crafted document.

Cisco Nvidia Adobe Vulnerability Disclosure

October 1, 2025

WireTap Attack Extracts Intel SGX ECDSA Key via DDR4

🔬 Researchers from Georgia Institute of Technology and Purdue University describe WireTap, a physical memory-bus interposer attack that passively inspects DDR4 traffic to recover secrets from Intel SGX enclaves. By exploiting deterministic memory encryption, the team built an oracle enabling a full key-recovery of an SGX ECDSA attestation key from the Quoting Enclave. The prototype uses inexpensive, off-the-shelf equipment (roughly $1,000) and can be introduced via supply-chain compromise or local physical access. Intel says the scenario requires physical access and falls outside its memory-encryption threat model.
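Why deterministic encryption enables an oracle can be shown with a toy model. The sketch below is not the SGX memory-encryption algorithm and not the researchers' method: HMAC-SHA256 is used only as a stand-in for a deterministic, address-dependent cipher, and `toy_encrypt`, `build_oracle`, and the addresses are hypothetical. It assumes the observer can see ciphertexts on the bus and can cause known values to be written to the same address.

```python
import hashlib
import hmac
import os

KEY = os.urandom(32)  # hidden from the observer in this toy model


def toy_encrypt(address: int, plaintext: bytes) -> bytes:
    """Deterministic stand-in: same key, address, and plaintext give the same output."""
    message = address.to_bytes(8, "little") + plaintext
    return hmac.new(KEY, message, hashlib.sha256).digest()[:16]


def build_oracle(address: int, candidates: list[bytes]) -> dict[bytes, bytes]:
    """Map each observed ciphertext back to the known plaintext that produced it."""
    return {toy_encrypt(address, value): value for value in candidates}


ADDRESS = 0x1000
# Known values the observer can cause to be written at ADDRESS.
candidates = [bytes([b]) * 16 for b in range(256)]
oracle = build_oracle(ADDRESS, candidates)

# Later, a secret value is written to the same address and seen on the bus.
secret = bytes([0x2A]) * 16
observed = toy_encrypt(ADDRESS, secret)
recovered = oracle.get(observed)  # matches the secret without knowing KEY

# At a different address the ciphertext differs, so the dictionary is per-address.
other = toy_encrypt(ADDRESS + 64, secret)
```

The observer never learns the key. Equality of ciphertexts alone is enough to recognize values it has seen before, which is the property a randomized or freshness-protected scheme would remove.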

Nvidia Supply Chain Vulnerability Research

September 18, 2025

Inside Fairwater: Microsoft's New Frontier AI Datacenter

🚀 Microsoft unveiled Fairwater, a purpose-built AI datacenter in Wisconsin, along with sister sites in Norway and the UK, designed to operate as a single, global-scale supercomputer. The facility deploys interconnected racks of NVIDIA GB200 servers (72 GPUs per rack), and Microsoft claims it will deliver 10× the performance of the world’s fastest supercomputer. It combines closed-loop liquid cooling, exabyte-scale storage and an AI WAN to enable distributed training and large-scale inference across Azure.

Microsoft Microsoft Azure Nvidia Infrastructure Security

September 17, 2025

CrowdStrike Secures AI Across the Enterprise with Partners

🔒 CrowdStrike describes how the Falcon platform delivers unified visibility and lifecycle defense across the full AI stack, from GPUs and training data to inference pipelines and SaaS agents. The post highlights integrations with NVIDIA, AWS, Intel, Dell, Meta, and Salesforce to extend protection into infrastructure, data, models, and applications. It also introduces agentic defense via Charlotte AI for autonomous triage and rapid response, and emphasizes governance controls to prevent data leaks and adversarial manipulation.

CrowdStrike Nvidia AWS Meta

September 12, 2025

Amazon SageMaker Adds EC2 P6-B200 Notebook Instances

🚀 Amazon Web Services announced general availability of EC2 P6-B200 instances for SageMaker notebooks. These instances include eight NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory and 5th Gen Intel Xeon processors, offering up to 2x the training performance versus P5en. They enable interactive development and fine-tuning of large foundation models in JupyterLab and CodeEditor, and are available in US East (Ohio) and US West (Oregon).

AWS Amazon SageMaker AI Nvidia

September 10, 2025

Disaggregated AI Inference with NVIDIA Dynamo on GKE

⚡ This post announces a reproducible recipe to deploy NVIDIA Dynamo for disaggregated LLM inference on Google Cloud’s AI Hypercomputer using Google Kubernetes Engine, vLLM, and A3 Ultra (H200) GPUs. The recipe separates prefill and decode phases across dedicated GPU pools to reduce contention and lower latency. It includes single-node and multi-node examples with step-by-step deployment instructions. The repository provides configuration guidance and outlines plans for broader GPU and engine support.
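The idea of splitting prefill and decode across dedicated pools can be sketched in a few lines. This is a conceptual illustration only and does not use or reflect the NVIDIA Dynamo or vLLM APIs: `Request`, `Worker`, `DisaggregatedRouter`, and the round-robin policy are all hypothetical, and the KV-cache handoff between pools is reduced to a comment.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Request:
    request_id: str
    prompt_tokens: int
    max_new_tokens: int


@dataclass
class Worker:
    name: str
    handled: list[str] = field(default_factory=list)


class DisaggregatedRouter:
    """Hypothetical router: each phase is served by its own worker pool."""

    def __init__(self, prefill_pool: list[Worker], decode_pool: list[Worker]):
        self._prefill = cycle(prefill_pool)
        self._decode = cycle(decode_pool)

    def route(self, request: Request) -> tuple[str, str]:
        # Phase 1: compute-heavy prompt processing on a prefill worker.
        prefill_worker = next(self._prefill)
        prefill_worker.handled.append(request.request_id)
        # A real system would transfer the KV cache to the decode pool here.
        # Phase 2: token-by-token generation on a decode worker.
        decode_worker = next(self._decode)
        decode_worker.handled.append(request.request_id)
        return prefill_worker.name, decode_worker.name


prefill_pool = [Worker("prefill-0"), Worker("prefill-1")]
decode_pool = [Worker("decode-0")]
router = DisaggregatedRouter(prefill_pool, decode_pool)

assignments = [router.route(Request(f"r{i}", 512, 64)) for i in range(3)]
print(assignments)
```

Because the two phases never share a worker in this sketch, a long prompt being prefilled cannot stall tokens that another request is already decoding, which is the contention the summary refers to.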

Nvidia Google Cloud Google Kubernetes Engine AI Security

September 8, 2025

Reviewing AI Data Center Policies to Mitigate Risks

🔒 Investment in AI data centers is accelerating globally, driving up energy demand and emissions and also expanding the cyberattack surface. AI facilities rely on GPUs, ASICs and FPGAs, which introduce side-channel, memory-level and GPU-resident malware risks that differ from traditional CPU-focused threats. Organizations should require operators to implement supply-chain vetting, physical shielding (for example, Faraday cages), continuous model auditing and stronger personnel controls to reduce the risk of model exfiltration, poisoning and foreign infiltration.

Nvidia AI Security Supply Chain Compromise